kitchen table math, the sequel: LA Times: Excellent and terrible teaching found in the data

## Monday, August 16, 2010

### LA Times: Excellent and terrible teaching found in the data

The LA Times is beginning another series of articles about LAUSD, this series based on access they've had to LAUSD's longitudinal test data. Prior articles focused on money, with the Times creating an easily readable database listing all LAUSD employees' salaries. This time, they are focused on teaching.

The first article's intro says "A Times analysis, using data largely ignored by LAUSD, looks at which educators help students learn, and which hold them back."

To accomplish this,
The Times obtained seven years of math and English test scores from the Los Angeles Unified School District and used the information to estimate the effectiveness of L.A. teachers — something the district could do but has not.

The Times used a statistical approach known as value-added analysis, which rates teachers based on their students' progress on standardized tests from year to year. Each student's performance is compared with his or her own in past years, which largely controls for outside influences often blamed for academic failure: poverty, prior learning and other factors.
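To make the idea concrete, here is a minimal sketch of the simplest possible "gain score" version of value-added. It is not the Times's actual method, which the article does not spell out. The records, teacher names, and the `gain_score_effects` helper are all hypothetical, and the sketch ignores every covariate a real model would include.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (student, teacher, prior-year score, current-year score).
# The numbers are invented for illustration only.
records = [
    ("s1", "Teacher A", 34, 58),
    ("s2", "Teacher A", 40, 66),
    ("s3", "Teacher A", 28, 59),
    ("s4", "Teacher B", 82, 70),
    ("s5", "Teacher B", 85, 76),
    ("s6", "Teacher B", 80, 68),
]

def gain_score_effects(rows):
    """Return each teacher's mean student gain minus the overall mean gain."""
    gains_by_teacher = defaultdict(list)
    for _student, teacher, prior, current in rows:
        # Each student is compared with his or her own prior-year score.
        gains_by_teacher[teacher].append(current - prior)
    all_gains = [g for gains in gains_by_teacher.values() for g in gains]
    overall = mean(all_gains)
    return {t: mean(gains) - overall for t, gains in gains_by_teacher.items()}

effects = gain_score_effects(records)
for teacher, effect in sorted(effects.items()):
    print(f"{teacher}: {effect:+.1f} points relative to the average gain")
```

Because each student serves as his or her own baseline, a teacher who starts with high scorers gets no credit for that starting point, only for the change over the year.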

The article profiles a couple of strong and weak teachers, and apparently, more articles are forthcoming that will do more profiling. It seems that after analyzing the data, the authors went to the classrooms of those in the top decile and bottom decile for student improvement to view the teachers in action.

Miguel Aguilar at Broadous Elementary School is one of the strongest. "On average, his students started the year in the 34th percentile in math compared with all other district fifth-graders. They finished in the 61st."

That's an impressive improvement. I wish I understood enough details of the underlying scoring to know how this relates in standard deviations. Is improving a student one standard deviation when they are one below the mean as difficult as improving a student one standard deviation when they are at the mean? Certainly it's not the same effort to move a student from 1 standard deviation away from the mean to 2. What assumptions can be made about equal difficulty in movement of scores measured in percentile?
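As a rough way to think about the conversion, here is a small sketch that maps percentiles to standard deviations. It assumes the scores are normally distributed, which the article does not say, and it treats a class's average percentile as if it were a single student's percentile, which is also a simplification. The helper names are mine.

```python
from statistics import NormalDist

# Assumption: scores are normally distributed, so a percentile maps to a z-score
# through the inverse normal CDF. Real test-score distributions may differ.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def percentile_to_z(percentile: float) -> float:
    """Convert a percentile rank (strictly between 0 and 100) to a z-score."""
    return STANDARD_NORMAL.inv_cdf(percentile / 100.0)

def gain_in_sd(start_percentile: float, end_percentile: float) -> float:
    """Size of a percentile move, expressed in standard deviations."""
    return percentile_to_z(end_percentile) - percentile_to_z(start_percentile)

aguilar_gain = gain_in_sd(34, 61)        # the move reported for Aguilar's class
same_points_higher = gain_in_sd(70, 97)  # hypothetical: same 27 points, higher start

print(f"34th -> 61st percentile: {aguilar_gain:.2f} SD")
print(f"70th -> 97th percentile: {same_points_higher:.2f} SD")
```

Under that normality assumption, the 34th-to-61st move is roughly 0.7 standard deviations, while the same 27 percentile points starting from the 70th would be roughly twice as large in standard deviations. So equal moves in percentile are not equal moves in the underlying score, whatever one concludes about difficulty.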

The article repeats what we all know as well: you must raise the bar.

"On visits to the classrooms of more than 50 elementary school teachers in Los Angeles, Times reporters found that the most effective instructors differed widely in style and personality. Perhaps not surprisingly, they shared a tendency to be strict, maintain high standards and encourage critical thinking.

But the surest sign of a teacher's effectiveness was the engagement of his or her students — something that often was obvious from the expressions on their faces."

The article goes on to argue that their analysis shows that excellence in teaching and weakness in teaching matter a great deal.
"Among the findings:

• Highly effective teachers routinely propel students from below grade level to advanced in a single year. There is a substantial gap at year's end between students whose teachers were in the top 10% in effectiveness and the bottom 10%. The fortunate students ranked 17 percentile points higher in English and 25 points higher in math."

The LAT is creating a database for release "in the coming months." I can't wait. I wonder if it will shed light on the value of mediocre teaching?

It's great to see this article. With luck, it will propel other journalists to perform similar studies. Perhaps some enterprising ed bloggers can FOI this information for their district, and perform the same analysis.

Anonymous said...

I think that we need a lot more information about the details of the "value-added" computation than the LA Times has told us so far.

Tex said...

The L.A. teachers union is predictably unhappy about too much transparency.

The head of the union said Sunday he was organizing a "massive boycott" of The Times after the newspaper began publishing a series of articles that uses student test scores to estimate the effectiveness of district teachers.
"You're leading people in a dangerous direction, making it seem like you can judge the quality of a teacher by … a test," said A.J. Duffy, president of United Teachers Los Angeles, which has more than 40,000 members.

Union says evaluation of teachers is 'dangerous.' Do you agree?

Allison said...

One element of the article I forgot that I wanted to highlight was that principals didn't know an effective teacher from a noneffective one, either.

"... Karen Caruso stands out for her dedication and professional accomplishments.

A teacher since 1984, she was one of the first in the district to be certified by the National Board for Professional Teaching Standards. In her spare time, she attends professional development workshops and teaches future teachers at UCLA...

Third Street Principal Suzie Oh described Caruso as one of her most effective teachers.

But seven years of student test scores suggest otherwise.

In the Times analysis, Caruso, who teaches third grade, ranked among the bottom 10% of elementary school teachers in boosting students' test scores. On average, her students started the year at a high level — above the 80th percentile — but by the end had sunk 11 percentile points in math and 5 points in English."

In schools here, I often hear principals or deans say that they aren't sure about a teacher--they think one is ineffective, but parents like them, so they don't know what to make of it.

kcab said...

One thing that I found interesting and hopeful in the article, and which is at odds with the union stance, is that the teachers who were profiled appeared to welcome information that would help them improve their teaching/outcomes.

Anonymous said...

Not defending Caruso...but one should also consider what materials she is REQUIRED to use in 3rd grade (if she is required to use a particular program/set of materials). How do her numbers compare with other 3rd grade teachers who receive similarly ranked students? It could be that she inherits students ahead of the pack, but is required to use the same mediocre material that is designed for struggling students. If the same trend is found in other like situations, then perhaps the "blame" should be placed on using poor materials, not the teacher.

To be fair, it's worth a look.

JE

Independent George said...

JE - that's a good point, but isn't that a major purpose behind the analysis? To identify the common elements among the high performers, and the low performers?

Allison said...

JE, did you read the article? The teachers were in grades 3-5. She was ranked among the bottom 10% of elementary school teachers in boosting students' test scores. That means she fared poorly compared with other 3rd grade teachers who receive similarly ranked students.

Even if the materials were bad, and even if all of the 3rd graders were bad, and all of them performed worse than every single 4th or 5th grade teacher, she still did worse than about 2/3rds of the other teachers.

By averaging over student gains/losses rather than just their overall achievement, it mitigates some of the issue of ranking of students. Yes, the details matter here, but again, the article didn't look at the middle 8 deciles, just the extremes, and as they noted, the results didn't correlate with SES, initial low ranking of students, or any of the other things that might explain it away.

Linda Seebach said...

Note that Caruso "was one of the first in the district to be certified by the National Board for Professional Teaching Standards." That's pretty much a guarantee that she is committed to all the wrong-headed notions about teaching that are so often, and rightly, deplored here.

Anonymous said...

Allison - I did not read the article, just what was posted here. I suppose before I posted I should have. Thank you for the clarification.

A friend of mine who teaches 2nd grade at a private school uses the same math materials that my daughter's school uses. She didn't think the material was as terrible as I did, but then she added her view that it's not so much the material, but the person who is using the material that matters most. I suppose that this study is getting at that point.

Been reading here for years. Thanks for all of the interesting things that get posted.

JE

Allison said...

Bad materials sink mediocre teachers. They sink teachers new to the field as well. They hamper good teachers. The very best teachers probably find a way to work around them, supplementing, revamping, doing what they can to bypass the issues, but again: who is such a teacher? Do we recognize him or her?

Bad materials are just a piece of the pie. Teachers need training both in the content they teach and the pedagogy of how to do that well. Assessment needs to assess the right things. This is a system where just moving along any one axis isn't going to be enough--we have to make progress on all fronts.

That said, *terrible* teachers need intervention or removal. I don't know what it takes to fix a terrible teacher, but these are individuals who unknowingly are doing great damage, and bad or good materials won't save them.

One interesting aspect in the article is that most people don't know how to evaluate what's good very well--or what works, so materials alone can't help.

This isn't just a problem in teaching. Interviewing prospective hires is something routinely done, even though there's no evidence that the interview process picks the best individual. There's good evidence that people make terrible mistakes when relying on their own personal judgment.

This kind of value added work probably can't tell us much about the teachers in the middle, and it doesn't fix the issues with curricula, which are enormous. But terrible teachers should be low hanging fruit, right?

Cheryl said...

I think it's pretty disgusting that the L.A. Times would name names and print pictures based on an evaluation system that is controversial at best. In fact, it's disgusting that they'd do that based on anything. These aren't dishonest DWP employees sneaking off to strip clubs during working hours. They're hard-working people, and even if they're not the top performers, they do not deserve to be publicly humiliated. People like to toss around "teachers union" in a way that dehumanizes the people who are actually working every day, teaching our kids.

Materials matter and students matter. I am a teacher, not LAUSD, but CA. Let me give you a few examples of why "value added" doesn't make the ridiculous idea of evaluating teachers based on standardized tests any better.

Some teachers are known to be good with a certain type of student. I have a reputation of being an excellent teacher for squirrely upper-elementary age boys and ADHD kids. Those kinds of kids can perform well in class, but may slip on high-stakes tests as they get older, because the tests require them to sit still and concentrate for 2 hours at a time, in silence, without moving or bathroom breaks. I advocate not medicating these kids except in extreme circumstances, which makes the long testing periods even worse. But my students aren't robots.

Or, maybe I'm a teacher who's good with awkward, low-performing students. And I'm good at getting them (finally) identified for resource (special ed) help, something that can't always happen prior to 4th grade because of the requirement that they be 2 years behind their peers before testing. So, I get those kids identified and helped, and they take the alternate test to the CST, designed for Resource kids. And even though I've raised their score from Far Below Basic to Proficient, their score now is an AUTOMATIC Far Below Basic for my school and me because they took a modified test.

Both examples of where being recognized by administration as an EFFECTIVE teacher for hard-to-teach kids can get me in trouble if my performance is judged by a single high-stakes test.

Maybe the AC doesn't work in my room, never has worked right. The kids in my room swelter in the heat, having a hard time concentrating, while the kids next door sit in a more comfortable room.

There are so many variations of this. I've had years of massive improvement, years my kids have slipped. Same teacher, same curriculum. They're not widgets, they're kids.

Allison said...

I'm a little confused. People are judged all of the time by single tests, by high stakes tests, by their work product. Why are teachers supposed to be exempt from this? Because they mean well?

Doctors have to pass their boards. Lawyers have to pass the bar. Neither gets to practice for 7 years with children relying on them every day until those scores are read.

But I'm even more confused by what's so humiliating: either the "judging by a single high stakes test" is wrong, in which case, what's so humiliating about talking about it, or the data showed some truth? How can you think this article was disgusting if you don't think the analysis they did counts?

Your attacks are against straw men. The article neither used only a "single high stakes" test nor advocated such. The writers went and visited the classrooms and reported what they saw, including how the top teachers differed from the low performing teachers. They included quotes from principals and parents in support of these teachers.

Your claim about why student selection matters is only accurate if student selection isn't random.

If that's true (and there's little reason to believe that, because all of the issues you bring up can be controlled for in the data, or be shown to have large correlations with the results), then okay, let's have THAT opened up to the parents so they can see just how nonrandom it is. Give them transparency into that choice, and then maybe there would be more context for parents to evaluate this analysis.

Bostonian said...

Allison, I agree with most of what you wrote, but I do have some sympathy for the poor-performing teachers who were mentioned by name. They didn't seem like bad people to me, and now their job performance is proclaimed to the whole world as being poor. If I do a bad job I expect to be fired but not to have my failure splashed on the front page of the Boston Globe.

Allison said...

I have some sympathy too.

But the outrage that is directed at the LA Times is so disingenuous. Where is the outrage at the district for having this data and not using it to provide intervention for the teachers who need it? Where is the outrage at the unions who habitually block any attempt to move terrible teachers from classrooms, and prevent them from ever being let go? Where is the outrage at the teachers for not telling their unions that they want this data to improve themselves, that they want the bad teachers culled out so their profession improves, and that they want help finding out what does work?

Where is the outrage at what the students are put through?

Newspaper articles have lots of collateral damage. That's terrible but a fact of life. The children are the real collateral damage here. There should be more outrage directed toward getting them what they need.

Cranberry said...

Willingham sees many problems with the value-added method, as practiced by the Los Angeles Times. As a practical matter, however, he also writes that the unions' stonewalling on teacher evaluations has brought us to this pass:

I have said before that if teachers didn’t take on the job of evaluating teachers themselves, someone else would do the job for them. The fact that the method they are using is inadequate is important, and should be pointed out, but it’s not enough.

No one knows better than teachers how to evaluate teachers. This is the time to do more than cry foul. This is the time for the teacher’s unions to make teacher evaluation their top priority. If they don’t, others will.

Now, it may be too late.

Barry Garelick said...

Rick Hess, in his blog, expresses the same concern as Bostonian:

Would it have really been such a compromise to have kept teacher names anonymous and to have reported scores by school, or community, or in terms of citywide distribution?

Anonymous said...

Allison wrote: I'm a little confused. People are judged all of the time by single tests, by high stakes tests, by their work product. Why are teachers supposed to be exempt from this?...Doctors have to pass their boards. Lawyers have to pass the bar.

The difference is that the teachers are being judged by a single test that *other* people take. Doctors passing their boards and lawyers passing the bar is more akin to teachers passing their certification testing. In the case of doctors, a similar test might be to see how their patients fare over the course of their treatment. Unfortunately, a doctor can't control whether the patient follows his/her advice. Similarly, a teacher can't force students to pay attention and do their homework.

SteveH said...

Modern educational thought precludes the ability to create a test that is authentic. Everyday Math is based on having teachers push through the material and trusting the spiral. This approach makes no distinction between whether a student is ready for the material or whether there is a problem with the teacher. If you keep spiraling through the material and the student still doesn't learn, then they assume that the problem is with the student ... by definition.

NCLB at least forces states to define some minimal level of learning for each grade. Breaking out results by teacher is quite a reasonable thing to do. I hope the naming names issue won't distract people from the task of, as Allison put it, picking off the "low hanging fruit". Some would like to believe that there is no reasonable way to quantify the effectiveness of teachers. How about letting parents decide ... with their feet.

Unfortunately, this sort of analysis might, like NCLB, plug up some of the low end cracks, but it will do nothing for the top end of learning. Now that my son is heading into high school, I can see how his school is shifting funds from the high end to the low end. The school knows that parents will always support the top end kids. AP classes may be in the catalog, but they are not offered as often. They replace them with the great opportunity (?) to take community college courses.

SteveH said...

Doctors and lawyers can be sued. There is a basis for making this judgment. Is teaching so vague a practice that nobody can make any sort of judgment at any level? If you are a bad doctor or lawyer, you can be disbarred, lose your license, or lose your customers. Parents don't have choice when it comes to schools. Teachers can't have it both ways; no quantification of effectiveness and no choice for parents.

Cranberry said...

Will all parents demand teachers whose students advance in peer rankings? I'm not convinced they will. I know of a superb teacher, whose students make enormous strides during the school year. His classes perform exceedingly well on state exams. In every year, more than 50% score "advanced." And yet, if they were given a choice, I know of many parents who would prefer their students were placed in other teachers' classes. The superb teacher is strict. He sets high standards, and he even (gasp!) gives students Cs if they have not mastered the material. I would choose him for my kids, but many parents would rather have As on the report card, with proficient on the exams, than have Bs on the report card, with advanced on the exams.

Allison said...

Cranberry,

Willingham's comments would have been better if he'd waited until he read the RAND paper.

*All* of his issues are present in the model. That is, they are accounted for elsewhere, and the data would have shown strong effects for those variables if it had been true.

Now, maybe Willingham really doesn't know enough statistics to understand what modeling can do. Maybe he just assumed the modeling was done poorly, or maybe he recited things that he thought were true, because he thinks he knows more than he does. But here are some excerpts from the report that specifically address Willingham's supposed issues.

"In this paper, we start with a general dynamic panel data model that includes student and teacher fixed effects in the following reduced form:
$$T_{it} = T_{it-1}\lambda + x_{it}\beta_1 + u_i\eta + q_j\rho + \alpha_i + \phi_j + \varepsilon_{it}$$
where $T_{it}$ is either the English Language Arts (ELA) or math test score for student $i$ in year $t$; $x_{it}$ are time-variant individual observable characteristics (classroom characteristics); $u_i$ are time-invariant individual observable characteristics (gender, race, parent’s education, special attitudes and needs); and $q_j$ are time-invariant observable characteristics of the $j$th teacher (gender, education, experience), and $\lambda$ indicates the persistence of prior-year learning. The model includes individual student and teacher fixed effects ($\alpha_i$ and $\phi_j$). Finally, $\varepsilon_{it}$ contains individual and teacher time variant unobserved characteristics."

"Both teachers and students enter and exit the panel so we have an unbalanced panel. Students also change teachers (generally from year to year). This is crucial, because fixed effects are identified only by the students who change teachers. It is assumed that $\varepsilon_{it}$ is strictly exogenous. That is, student's assignments to teachers are independent of $\varepsilon_{it}$. Note, according to this assumption, assignment of students to teachers may be a function of the observables and the time-invariant unobservables."

"Teacher heterogeneity ($\phi_j$) is probably correlated with observable student and teacher characteristics (e.g., non-random assignment of students to teachers). Therefore, random effect methods are inconsistent, and the fixed teacher effects are estimated in the model. The fixed teacher effects are defined as $\psi_j = \phi_j + q_j\rho$."

Catherine Johnson said...

Is improving a student one standard deviation when they are one below the mean as difficult as improving a student one standard deviation when they are at the mean?

A few years ago, a teacher at a highly successful charter school told me that raising high-scoring kids' achievement substantially is harder than doing so for lower-scoring kids.

She was talking specifically about kids in high-end suburban school districts.

I had always thought such kids were 'easier' to teach --- but she said they were harder!

She teaches in a college preparatory charter school with lots of middle class kids, fyi.

Catherine Johnson said...

One thing that I found interesting and hopeful in the article, and which is at odds with the union stance, is that the teachers who were profiled appeared to welcome information that would help them improve their teaching/outcomes.

I've only skimmed the article, but that jumped out at me, too.

These teachers need Richard DuFour & his "professional learning communities."

The entire point of a 'PLC' is to give teachers a way to measure their students' learning compared to the achievement of students in their colleagues' classes **and adjust instruction accordingly.**