Your average user will think the coefficient for treatment tells you something like "the effect of the treatment on the result in this population when controlling for gender". I get a treatment effect of 1.17 in the first case but -0.38 in the second, just by switching whether male = 0 and female = 1 or vice versa.
Actually, treatment should stay at 0/1, now that I think about it. You want the intercept of the model to represent the result without treatment. The intercept should be the average of the two genders, though.
The intuitive explanation is that you really are writing down a model of the form
result = a + b*treatment + c*gender
Let's ignore the interaction term for the sake of simplicity. It doesn't change anything in what follows. The two operations I think about for interpreting this are:
1. What equation results when I fix a particular value of a factor? This is equivalent to taking a subset of the population I'm studying.
2. What equation results when I average over a factor? This is equivalent to removing an independent variable from my model.
If I remove the independent variable gender and look at the equation when I fix the two values of treatment, what do I want my model to look like? I think that
result = a
for no treatment and
result = a + b
for treatment is the easiest to interpret. Then a is the baseline without intervention, and b is the effect of the treatment. To get those, I make no treatment = 0 and treatment = 1. If I made no treatment = -0.5 and treatment = 0.5, I would get
result = a - 0.5*b
for no treatment and
result = a + 0.5*b
for treatment. Now a is no longer the baseline without treatment, so I've messed up my interpretation of the parameters. But what contrast do I need for gender to keep the result = a and result = a + b form after averaging? If I use male = 0 and female = 1, then when I average over the whole population (assuming that it's balanced male/female), I get
result = a + 0.5*c
for the baseline of no treatment, averaged across the population and
result = a + b + 0.5*c
for treatment. a and c are now mixed together in the baseline, which messes up what I can read off the parameters. If I want to keep the baseline a as an average over gender, then I need my levels for gender to sum to zero. But what happens if I use, as you suggest, -643 and 643 for male and female? Let's fix treatment to zero and look at the equation for the result for male and for female. For male:
result = a - 643*c
and for female:
result = a + 643*c
All we've ended up doing is scaling the parameter that we interpret as the difference between male and female response by a factor tied to 643. That would be a lot easier to use if the parameter measured the difference between male and female directly, so let's set it up so that the two levels sum to zero and differ by 1, that is, use -0.5 and 0.5.
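To check the algebra, here is a minimal Python sketch that fits result = a + b*treatment + c*gender to a balanced, noise-free 2x2 design under the three gender codings discussed. The cell means (10, 14, 13, 17) and the helper name fit_additive are made up for illustration; they are not from any real data set.

```python
import numpy as np

# Hypothetical, noise-free cell means for a balanced 2x2 design with no
# interaction: treatment adds 3, and females score 4 higher than males.
CELLS = [  # (treatment, is_female, result)
    (0, 0, 10.0), (0, 1, 14.0),
    (1, 0, 13.0), (1, 1, 17.0),
]

def fit_additive(male_code, female_code):
    """Least-squares fit of result = a + b*treatment + c*gender; returns (a, b, c)."""
    rows, y = [], []
    for treatment, is_female, result in CELLS:
        gender = female_code if is_female else male_code
        rows.append([1.0, treatment, gender])
        y.append(result)
    coefs = np.linalg.lstsq(np.array(rows), np.array(y), rcond=None)[0]
    return tuple(coefs)

for codes in [(0, 1), (-0.5, 0.5), (-643, 643)]:
    a, b, c = fit_additive(*codes)
    print(codes, "a =", round(a, 4), "b =", round(b, 4), "c =", round(c, 6))
```

With these numbers, the untreated average over both genders is 12. The 0/1 coding puts the male baseline (10) in a, while both sum-to-zero codings put 12 in a. Only the -0.5/0.5 coding also makes c equal to the female minus male difference of 4; with -643/643 the same difference is spread over a gap of 1286 between the levels.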
>"Let's ignore the interaction term for the sake of simplicity. It doesn't change anything in what follows."
It changes the estimates if you don't assume the interaction coefficient = 0, so how can you just ignore it? Actually, not assuming zero interaction changes everything: https://en.wikipedia.org/wiki/Principle_of_marginality
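A minimal Python sketch of the point, assuming made-up cell means in which the treatment raises the result by 3 for males but only by 1 for females. The numbers and the helper name treatment_coef are hypothetical. Once the treatment*gender term is in the model, the coefficient on treatment is the treatment effect at gender = 0, so it moves with the coding:

```python
import numpy as np

# Hypothetical, noise-free cell means WITH an interaction:
# treatment effect is 3 for males and 1 for females.
CELLS = [  # (treatment, is_female, result)
    (0, 0, 10.0), (0, 1, 14.0),
    (1, 0, 13.0), (1, 1, 15.0),
]

def treatment_coef(male_code, female_code):
    """Fit result = a + b*treatment + c*gender + d*treatment*gender; return b."""
    rows, y = [], []
    for treatment, is_female, result in CELLS:
        gender = female_code if is_female else male_code
        rows.append([1.0, treatment, gender, treatment * gender])
        y.append(result)
    coefs = np.linalg.lstsq(np.array(rows), np.array(y), rcond=None)[0]
    return coefs[1]

for codes in [(0, 1), (1, 0), (-0.5, 0.5)]:
    print(codes, "treatment coefficient =", round(treatment_coef(*codes), 4))
```

Here male = 0 gives the effect among males (3), female = 0 gives the effect among females (1), and -0.5/0.5 gives the average of the two (2). Same data, same fitted values, three different "treatment effects".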