JONATHAN CHAIT FEBRUARY 8, 2011
One of the hot debates in education policy centers on the use of quantitative measures to evaluate teachers -- measuring student achievement and tying teacher compensation, in some way, to that. Jim Manzi has some interesting, though flawed, thoughts about the problems with such measuring systems:
I’ve seen a number of such analytically-driven evaluation efforts up close. They usually fail. By far the most common result that I have seen is that operational managers muscle through use of this tool in the first year of evaluations, and then give up on it by year two in the face of open revolt by the evaluated employees. This revolt is based partially on veiled self-interest (no matter what they say in response to surveys, most people resist being held objectively accountable for results), but is also partially based on the inability of the system designers to meet the legitimate challenges raised by the employees.
Here is a typical (illustrative) conversation between a district manager delivering an annual review based on such an analytical tool, and the retail store manager receiving it:
District Manager: Your 2007 performance ranking is Level 3, which represents 85% payout of annual bonus opportunity.
Store Manager: But I was Level 2 (with 90% bonus payout) last year, and my sales are up more than the chain-wide average this year.
DM: [Reading from a laptop screen] We now establish bonus eligibility based on your sales gain versus the change in the potential of your store’s trade area over the same time period. This is intended to fairly reflect the actual value-added of your performance. We average this over the past three years. Your sales were up 5% this year, but Measured Potential for your store’s area was 10% higher this year, so your actual value-added averaged over 2005 – 2007 declined versus 2004 – 2006.
SM: My “area potential” increased 10%? – that’s news to me. Based on what?
DM: The new SOAP (Store Operating Area Potential) Model.
SM: What?
DM: [Reading from a laptop screen] “SOAP is based on a neural network model that has been carefully statistically validated.” Whatever that means.
[Continues reading] “It considers such factors as trade area demographic changes, competitor store openings, closures and remodels, changes in traffic patterns, changes in co-tenancy, and a variety of other important factors.”
SM: What factors are up that much in my area?
DM: [Skipping to the workbook page for this specific store, and reading from it] A combination of factors, including competitor openings and the training investment made in your store.
SM: But Joe Phillips had the same training program in his store, and he had no new competitor openings – and he told me that he got Level 2 this year, even though his sales were flat with last year. How can that be?
DM: Look, the geniuses at HQ say this thing is right. Let me check with them.
[2 weeks later, via cell phone]
DM: Well, I checked with the Finance, Planning & Analysis Group in Dallas, and they said that “the model is statistically valid at the 95% significance level” (whatever that means), “but any one data point cannot be validated.”
[10 second pause]
Let me try to take this up the chain to VP Ops, and see what we can do, OK?
SM: Whatever. I’ve got customers at the register to deal with. [Hangs up]
That's an interesting insight into the general problem with quantitative measures. Here are a few points in response:
1. You need some system for deciding how to compensate teachers. Merit pay may not be perfect, but tenure plus single-track longevity-based pay is really, really imperfect. Manzi doesn't say that better systems for measuring teachers are futile, but he's a little too fatalistic about their potential to improve upon a very badly designed status quo.
2. Manzi's description...
evaluating teacher performance by measuring the average change in standardized test scores for the students in a given teacher’s class from the beginning of the year to the end of the year, rather than simply measuring their scores. The rationale is that this is an effective way to adjust for different teachers being confronted with students of differing abilities and environments.
...implies that quantitative measures are being used as the entire system to evaluate teachers. In fact, no state uses such measures for any more than half of the evaluation. The other half involves subjective human evaluations.
3. In general, he's fitting this issue into his "progressives are too optimistic about the potential to rationalize policy" frame. I think that frame is useful -- indeed, of all the conservative perspectives on public policy, it's probably the one liberals should take most seriously. But when you combine the fact that the status quo system is demonstrably terrible, that nobody is trying to devise a formula to control the entire teacher evaluation process, and that nobody is promising the "silver bullet" he assures us doesn't exist, his argument has a bit of a straw man quality.
7 comments
Performance evaluation inevitably ends up with some subjectivity inserted into it - otherwise really competent people fail while really incompetent people succeed. An important aspect of managing is in handing the most difficult tasks to the best people. In a purely objective scenario your best teachers will only ever want to teach the very best students to protect their outcome. How then do you make sure that your lowest-performing students benefit from your very best teachers if you don't somehow account for the starting point? No sales manager worth their salt would expect that their best salespeople will want to be measured against others when each year they get the worst prospects because they are the most likely to bring in the new customer. The issue always will be how objective a subjective evaluation can be. That, also inevitably, is in the hands of competent and incompetent managers. Many have attempted to eliminate bias through various means, to no great avail. That is why manager evaluation always must take into account their success in evaluation and retention of others.
- Zachsteph
February 8, 2011 at 4:52pm
zach: "In a purely objective scenario your best teachers will only ever want to teach the very best students to protect their outcome." Nicely said. What teacher in their right mind would go to an inner-city school when the suburbs can offer better pay, better security, and a better work environment? "evaluating teacher performance by measuring the average change in standardized test scores for the students in a given teacher’s class from the beginning of the year to the end of the year, rather than simply measuring their scores. The rationale is that this is an effective way to adjust for different teachers being confronted with students of differing abilities and environments." This is not bad, but speaking as a teacher, every class is different: some classes catch on quickly, others are much slower. Essentially this will lead me to teach to the test and test-taking strategies. I always teach to the class, not to where the students should be strictly according to any developmental chart. If they don't get something I devote more time to it. If they get it quickly I move on. Sometimes I am able to advance further with one group than I am with another group. I do keep it within the range of the set curriculum, and those that fall beneath it must repeat, but there is no way I can replicate a standard progression for each class, much less year on year. Students are not widgets. For some classes I have great success; other classes do terribly: same year, same basic material, but different students. And my pay is going to be based on this year by year? Nah, I would simply quit, because I am not the type to teach the test so I can get a bonus. The best evaluation is observation and awareness, but this requires far more administration.
- blackton
February 8, 2011 at 6:12pm
You're basically right. It can be very hard to construct a good test of teacher quality that's easy, fast, and cheap, but some testing can be a lot better than nothing -- and it's worth spending the money anyway to go beyond simple, fast multiple-choice testing. It is possible to test for oh-so-valuable intuition, even in multiple choice, and especially if you use essays and allow a little bit of non-simple-minded formulaicness (what's called "objective" these days) in the grading. And some subjects are much more easily tested for results than others, like basic math, reading, and writing. A good system is possible, but it takes going beyond the simple-minded, what's called "objective," mindset.
- RHSerlin
February 8, 2011 at 6:15pm
Jim Manzi's arguments do have a straw man quality to them. Back to the keyboard, Jim.
- liberalref
February 8, 2011 at 9:16pm
I just completed my observation based on our state's new teacher evaluation form. (I think it is part of the "race to the top" grant.) It felt like "let's improve teaching by having teachers & principals fill out different forms". Needless to say we've seen several of these over the years. It's funny that there is always a controversy about how to evaluate teachers, but kids and parents always know who the good teachers are.
- s.trabka@frontier.com-old
February 8, 2011 at 10:12pm
The main problem is that most proposals only call for teachers to be evaluated, not administrators (let alone the school system as a whole). Ask anybody whose kids go to a school that has dramatically deteriorated - or dramatically improved - and they'll tell you that the rot (or improvement) had a lot more to do with the principal and administrators than with the aggregate teacher performance. Any fair teacher evaluation system has to be complemented by some mechanism to evaluate the quality of the principals and administrators, and to factor those administrator ratings into the baseline for teacher performance.
- NoHomers
February 9, 2011 at 12:02am
Adding on to NoHomers' comment: I understand the trepidation teachers have about giving up a degree of job protection. It opens the door for dismissal based on personal or budgetary considerations (the most senior teachers typically have the highest salaries). And effectiveness evaluations can have arbitrary factors built into them over which the teacher has no control. But I think a possible solution is to hold principals accountable. If a principal knows that getting rid of an effective teacher will show up in the school's performance, and the principal is subject to sanction or dismissal for poor performance, then the principal will have an incentive not to replace that teacher with a new, untried applicant. So the problem may not be one of how to measure effectiveness, but how to implement accountability at all levels: teachers, principals, superintendents.
- dsimon
February 9, 2011 at 11:45am