Reading through the report from the ASA (which doesn't really "slam" the VAM statistic, but rightly points out the flaws inherent in any attempt to use statistics in areas with many confounding factors), it appears the VAM is usually derived roughly as follows:
1. Fit a regression model for a student's expected standardized test scores based on background variables (previous scores, socioeconomic status, etc.), with the teachers themselves included as variables.
2. Take the coefficient the model assigns to each teacher as that teacher's "Value Added" metric (a toy sketch of both steps follows below).
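If that two-step recipe is roughly right, it's short enough to sketch. Here's a toy Python version with statsmodels; the column names and data are invented for illustration, and real VAM implementations are considerably more elaborate:

```python
# Minimal sketch of the two-step idea, with made-up column names and synthetic
# data -- not the actual model any district uses.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "prior_score": rng.normal(70, 10, n),
    "low_income": rng.integers(0, 2, n),
    "teacher": rng.choice(["A", "B", "C", "D"], n),
})
# Synthetic "true" outcome: teacher D gets a small bump.
df["score"] = (0.8 * df["prior_score"] - 2 * df["low_income"]
               + np.where(df["teacher"] == "D", 3, 0)
               + rng.normal(0, 8, n))

# Step 1: regress current scores on background variables plus teacher indicators.
model = smf.ols("score ~ prior_score + low_income + C(teacher)", data=df).fit()

# Step 2: read each teacher's coefficient off as their "value added".
print(model.params.filter(like="teacher"))
print(model.bse.filter(like="teacher"))  # the standard errors the article never mentions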
The weaknesses in such an approach are also spelled out in the report: namely, missing background variables, lack of precision, and a lack of time to test for the effectiveness of the statistics themselves.
What's interesting is that the teacher in question was rated as "effective" the year before. The question becomes whether that rating was also based on her VAM score for that year, and what the standard error was on her regression coefficient. Unfortunately, the article doesn't mention either.
The problem with regression models is that, in skilled hands, it's easy to manipulate the results. And that is without even opening up the rat's nest that is causality.
For instance, want to raise R^2, a value foolishly used to characterize how well the model explains the data? Add more variables. R^2 never decreases as regressors are added. So, for example, add the first letter of the teachers' middle names as an explanatory variable; R^2 will probably tick up a bit.
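A quick toy illustration of that padding effect, using a pure-noise regressor as a stand-in for the middle-initial variable (nothing here comes from the report):

```python
# Adding a pure-noise regressor never lowers R^2 on the training data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
y = 2 * x + rng.normal(size=n)

base = sm.OLS(y, sm.add_constant(x)).fit()

junk = rng.normal(size=n)                      # unrelated to y by construction
X_big = sm.add_constant(np.column_stack([x, junk]))
padded = sm.OLS(y, X_big).fit()

print(base.rsquared, padded.rsquared)          # second number is >= the first
print(base.rsquared_adj, padded.rsquared_adj)  # adjusted R^2 penalizes the padding
```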
Is there heteroskedasticity? How much? What did they do to correct for it?
What observations are considered outliers and dropped, and who makes that determination?
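Both of those questions are at least checkable with standard diagnostics. A sketch, reusing the fitted `model` from the toy example above (Breusch-Pagan for heteroskedasticity, Cook's distance for influential observations):

```python
# Standard diagnostics for the two questions above, run on a fitted OLS result.
from statsmodels.stats.diagnostic import het_breuschpagan

# Heteroskedasticity: Breusch-Pagan test on the residuals.
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(
    model.resid, model.model.exog)
print("Breusch-Pagan p-value:", lm_pvalue)

# Outliers: Cook's distance flags influential observations, but someone still
# has to decide where the cutoff is and whether to drop them.
influence = model.get_influence()
cooks_d = influence.cooks_distance[0]
flagged = (cooks_d > 4 / len(cooks_d)).sum()  # common rule of thumb, not gospel
print("Observations flagged as influential:", flagged)
```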
Or, want to tank a teacher's score? Assuming teachers enter as something like indicator variables, there are lots of techniques to inflate the standard error on the coefficient, letting you say that 0 is within the CI of B_{teacher}.
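One such technique, sketched on the same toy data: pile on a regressor that is nearly collinear with a teacher's indicator (here an invented "room4" dummy). The point estimate barely moves, but its standard error can balloon until 0 sits inside the confidence interval.

```python
# Toy version of inflating a standard error via a near-collinear regressor.
# Continues the synthetic `df` from the first sketch; "room4" is invented.
import numpy as np
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)

# A covariate that almost duplicates the teacher-D indicator
# (say, "was in classroom 4", which teacher D happens to teach).
df["room4"] = (df["teacher"] == "D").astype(int)
flip = rng.choice(df.index, size=10, replace=False)
df.loc[flip, "room4"] = 1 - df.loc[flip, "room4"]  # avoid perfect collinearity

lean = smf.ols("score ~ prior_score + low_income + C(teacher)", data=df).fit()
fat = smf.ols("score ~ prior_score + low_income + C(teacher) + room4", data=df).fit()

print(lean.bse["C(teacher)[T.D]"], fat.bse["C(teacher)[T.D]"])  # SE inflates
print(fat.conf_int().loc["C(teacher)[T.D]"])                    # 0 may now fall inside
```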
If they are using GLMMs (generalized linear mixed models) -- as they probably ought to be -- there's even more room for a skilled statistician to pick outcomes, since more and more of the setup is a judgement call.
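For what it's worth, statsmodels only handles the linear mixed-model case (a full GLMM would need something like lme4 or glmmTMB), but even the linear version shows where the judgement calls creep in: random-effects structure, REML vs. ML, what goes in the fixed part. Continuing the toy data from above:

```python
# Linear mixed model sketch: teacher as a random effect instead of a fixed dummy.
import statsmodels.formula.api as smf

mm = smf.mixedlm("score ~ prior_score + low_income", data=df,
                 groups=df["teacher"]).fit()

# The teacher "effects" are now shrunken random intercepts -- how much shrinkage
# you get depends on the modeling choices above, each one a judgement call.
print(mm.random_effects)
```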
Finally, there's an open question of how well the exams were designed and whether they accurately measured student achievement pre and post; there's a whole field -- psychometrics -- devoted to testing alone.