kitchen table math, the sequel: School performance info - your predictions?

Thursday, November 29, 2007

School performance info - your predictions?

I wanted to share information on an interesting experiment going on in Sumner County, Tennessee, and solicit predictions.

One of my clients, an organization called the Education Consumers Foundation, has been doing a lot of work with value-added data in Tennessee. (For those unfamiliar with value-added, it basically allows you to isolate and measure schools' contribution to student learning, regardless of the socioeconomic and other characteristics of their students.) Tennessee has the most advanced value-added system in the country and has been collecting data since the early 90s, so there's a rich pool of good data available from the state.

While Tennessee offers great information on the performance of its elementary and middle schools, we've found that almost no one is aware of this data or understands what it means. So we've launched a pilot initiative to see what happens once parents and other community members are given this information.

Last Monday, we sent a package of information to every household in Sumner County - 55,300 pieces in all - containing an introductory letter and a brochure with school-by-school information on their value-added performance rankings, TCAP proficiency rates, and free/reduced lunch rates (a proxy for poverty). I've posted these materials in PDF format: you can download the letter here and the brochure here.

IMHO, this is powerful stuff, but we're just at the very beginning of the pilot. I'd like to think it's going to make a huge impact, but it's entirely possible that it will be ignored.

I'd love to hear thoughts on this initiative from other KTM folks, and your predictions for what will happen in Sumner County. Will the community at large ignore this? Is it likely to create a significant conversation in the community? Will people push their local schools to improve?

Essentially - is this the path to real school improvement, or is this mailing going to fall on deaf ears?

If anyone's interested, I'll share updates as we go along - as I mentioned, the mailing just dropped last week, so it will take a while for this to all play out.

(For clear and complete disclosure purposes, I'll repeat that ECF is a client, but I don't feel any conflict of interest in posting about this initiative - they're a nonprofit group and have nothing to gain from this. If there are any issues, I'll trust in Catherine to remove this post.)

13 comments:

SteveH said...

I have always thought that parents need to see information in its most simple, direct form. For example, our schools crow about their "High Performing" status, but parents never see exactly what that means. I've tried to trace the scores from an actual test to the final grade and assignment of rating and it can't be done. All along the way, bad results get massaged into nice words. Results are not given in absolute or international standards, but in relative terms. Ten percent better on horrible results is seen as a major achievement.

The goal is to get as many kids as possible over a very minimal cutoff. The rating is based on this goal, but parents think the rating has to do with the overall quality of the education. It says no such thing.


I think you have to eliminate all of the middle crap and show parents the actual test questions and raw results. NAEP is good in that I can look at each question and see how many got the question right or wrong, broken down by category. I don't want to see heavily massaged data that refers to relative (or value-added) improvement.

"... value-added performance rankings, TCAP proficiency rates, and free/reduced lunch rates (a proxy for poverty)."

All of this sounds like massaged data far away from the raw data. Your report gives results as "Math Achievement Gains (NCEs)". What formula gives you this number from the raw test scores? A parent looking at the results would have no basis for understanding what these numbers mean. All they can do is compare numbers. That's not very useful, and that's how some schools justify the use of Everyday Math.

As a parent, I want to know how many kids don't know what 6 times 7 is in fifth grade. I'll use my own value-added judgment, thank you.

As for what parents will do with this information? I don't know. Our state gives out all sorts of information each year, but most parents just look at the final rating: "High Performing". Parent-teacher groups will quibble or argue about fluctuations in final, massaged scores, but there are always mitigating reasons.

Finally, what CAN parents do, anyway? Form parent-teacher groups to say "Yep, we got an issue with math problem solving in 4th grade. Let's work a little harder." It will NEVER mean that schools will give up or share their control over curriculum.

PaulaV said...

Many suburban parents are in extreme denial. I wish I had a dollar for every time I heard someone say, "but our test scores are so good." Those of us who try to find out what results truly mean are branded "too competitive" or "looking for trouble".

People see what they want to see and believe what they want to believe. No amount of data - good, bad, or indifferent - will change this, because if it did we wouldn't be having this discussion year after year.

We just keep spinning our wheels. I truly believe it is a cultural mindset, and I don't know how one goes about changing this. But, alas, I don't feel we should stop trying.

Brett Pawlowski said...

Hi Steve,

Thanks for your comments. I agree that one of the problems with reporting on standardized test results is the temptation to focus on the "bubble kids," pushing the ones close to the bar just over it in order to show progress. Value-added scores are a remedy to this, since achievement gains are reported for all students in a school - you can't trick the system by concentrating on a handful, since the achievement of the rest will more than counter that limited progress.

FYI, the data is not massaged - we get it straight from the state and don't do any interpretation. The TCAP proficiency scores and the free/reduced lunch rates come straight from the state's report card, and the school performance charts are simply a ranking of the TVAAS data reported by the state.

In answer to your question on what a Normal Curve Equivalent is, it’s defined as follows:

A test score reported on a scale that ranges from 1 to 99 with an average of 50. NCEs are approximately equal to percentiles; for example, a score with an NCE of 70 is equal to or greater than roughly 70% of the scores in its reference group. Assuming a normally distributed population, plotting the distribution of scores produces the familiar bell curve.
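To make the scale a bit more concrete, here is a minimal sketch of the conventional percentile-to-NCE conversion (the 21.06 scaling constant is the standard factor that maps the 1st and 99th percentiles to NCEs of 1 and 99; the sample values are mine, not from the brochure):

```python
# Sketch of the conventional percentile-to-NCE conversion: NCE = 50 + 21.06 * z,
# where z is the normal quantile of the percentile rank. Example values only.
from statistics import NormalDist

def percentile_to_nce(percentile):
    """Convert a percentile rank (1-99) to a Normal Curve Equivalent."""
    z = NormalDist().inv_cdf(percentile / 100.0)
    return 50 + 21.06 * z

for p in (1, 25, 50, 75, 99):
    print(p, round(percentile_to_nce(p), 1))
# 1 -> 1.0, 50 -> 50.0, 99 -> 99.0; mid-range percentiles (25 -> ~35.8) differ from their NCEs
```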

The numbers referenced in the letter indicate that Tennessee districts are improving incrementally against their baseline (the average value-added gain in 1998). And we would have preferred to stay away from confusing terms, but this is how the state reports value-added gains. (Again, no massaging on our part.)

Note that these are not referenced in the brochure, which just shows a ranking of schools by value-added performance. It’s not an absolute number you can hang your hat on, but it’s valuable to know that your child’s school contributes more to student learning than 95% of other schools in the state.

Finally, you state: "As a parent, I want to know how many kids don't know what 6 times 7 is in fifth grade. I'll use my own value-added judgment, thank you."

You're right that parents need to see what their children have learned - that's what the standardized assessments show you. But value-added is a different thing – I think it would be extremely hard to get an accurate reading based on an observation of one’s own child.

Value-added, at least in Tennessee, works like this: you look at a student's performance on a 2nd-grade test and then on a 3rd-grade test. From these, you can plot a trajectory to estimate where he/she will perform on the 4th-grade test. (This all assumes a strong and integrated set of annual assessments, which Tennessee has.) If he/she does better than expected, the school is credited with the unexpected gain; if he/she does worse than expected, the shortfall counts against the school as well.

Notice that this approach eliminates all the usual suspects (socioeconomic status, etc.) that often drive performance on standardized tests: a student performing far below grade level will still have a trajectory, and you can see how the school impacts that.

So, in essence, the value-added data is telling you how the school is accelerating or limiting your child’s academic progress versus how he/she would do in a “typical” school – again, very hard to ascertain that on your own, looking at your own child individually.
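To illustrate the trajectory idea in the simplest terms, here is a rough sketch of my own (not the actual TVAAS methodology, which uses a far more sophisticated statistical model), with purely hypothetical scores:

```python
# Simplified illustration of the trajectory idea described above. This is NOT
# the actual TVAAS model; all scores are hypothetical.

def expected_score(grade2, grade3):
    """Project a 4th-grade score by extending the student's year-to-year trend."""
    yearly_gain = grade3 - grade2
    return grade3 + yearly_gain

def school_value_added(students):
    """Average unexpected gain (positive or negative) across all students."""
    gains = [actual4 - expected_score(g2, g3) for g2, g3, actual4 in students]
    return sum(gains) / len(gains)

# (2nd-grade score, 3rd-grade score, actual 4th-grade score) for three students
students = [(45, 50, 58), (60, 62, 61), (30, 36, 44)]
print(round(school_value_added(students), 2))  # positive -> gains beyond the projected trajectories
```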

Thanks again –

Brett

Doug Sundseth said...

As a parent, I'm interested in two basic pieces of information about a school:

Overall achievement level - This is the information that NCLB sort of requires the schools to report and it's what Steve wants. It's easy to find (after NCLB) but difficult to interpret because of moving "norms".

Value-added/time - This is what Brett's group is delivering. I want to know how a school is doing relative to other, similarly situated schools.

All of this is predicated on my ability to choose the school that my child will be going to, of course. Even in the absence of that ability, however, value-added assessments give parents a weapon in discussions with school boards:

"Why can't you do as well as the schools in the Oz district? Their kids are poorer than ours, but they're learning faster. Are they using a better curriculum or are you just incompetent?"

It should be particularly useful in school-board elections for non-incumbent candidates. And an activist school board should be able to address some of these problems; all it needs is the will.

Catherine Johnson said...

Can't wait to read.

Have you seen the article on voters & school accountability in the new Ed Next?

I haven't read it yet, but apparently the conclusion was that voters did not vote "for" accountability...

Actually, I think I'll find it and post.

Catherine Johnson said...

Please DO post updates - I would really appreciate it.

The question of what parents do and do not want to know about school quality is an interesting one.

Catherine Johnson said...

STANFORD -- A new study published in the winter 2008 issue of Education Next finds that voters may be failing to hold school board members accountable for student performance on tests mandated by No Child Left Behind. According to research conducted by the University of Chicago’s Christopher R. Berry and William G. Howell, a demonstrated ability to improve test scores may have little bearing on the reelection rates of incumbent school board members.

Voters Don't Hold School Boards Accountable for Student Learning, Study Finds

SteveH said...

"FYI, the data is not massaged ..."

Perhaps I shouldn't have said "massaged." Let's just say that (in our state, at least) the road from the raw score to the final number is not linear, and information is lost.

I see an NCE value of 3.6 in your letter, but there is only a vague explanation of what it means. It has meaning only in some sort of relative sense, and I have no idea how to calibrate that amount. What does it mean to have an NCE of 3.6 versus an NCE of 1.8? All I can see is whether the number goes up or down. I don't know if it's a big problem or a small problem. Rather than giving me the full graph of the data, you're giving me only the slope of the line, and I don't know what the steepness of the slope means.


"It’s not an absolute number you can hang your hat on, but it’s valuable to know that your child’s school contributes more to student learning than 95% of other schools in the state."

Not if your school is using Everyday Math along with most other schools. This technique completely ignores systemic problems. Are you saying that the only goal is to make sure you're going in the right direction?

I know all about measures of merit and objective functions. There is a big danger in boiling down complex systems into one number. Information can be lost. In this case, it's worse. All absolute information is lost. It might seem great if a company is increasing its profit by 30 percent a year, but not if the company only had a profit of $1000 in the base year. No one in their right mind would use percentage of change in profit as a measure of merit. You would use something like profit as a percentage of gross sales. There has to be some sort of absolute scale used in your measure of merit.


"...again, very hard to ascertain that on your own, looking at your own child individually."

I never said anything about looking only at one child. I'll give you an example. Let's say that a fifth grade test included 100 multiplication table problems. I should be able to go online and see the actual questions and the cumulative raw results. I'm not going to be happy to see a positive slope of the results (from year to year) if the raw score indicates that the average number of correct answers was only 50 percent. This absolute information is there in the data. Don't distill it down to some uncalibrated slope.
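To put a number on that point, here is a toy illustration (all figures hypothetical) of how a positive year-to-year trend can coexist with weak absolute results:

```python
# Toy illustration with hypothetical numbers: the year-to-year "slope" looks
# fine even though half of the 100 items are still being missed.
percent_correct = {2005: 44, 2006: 47, 2007: 50}  # raw percent correct on the same 100 items

years = sorted(percent_correct)
slope = (percent_correct[years[-1]] - percent_correct[years[0]]) / (years[-1] - years[0])
print(f"gain per year: {slope:+.1f} points")                    # +3.0, steady improvement
print(f"latest result: {percent_correct[years[-1]]}% correct")  # but still only half right
```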

You could break down a math test into different sections and report the results as percent correct, year by year. You can find some of this data in our state if you dig deep enough, but that information is lost by the time you get to their final measure of merit.

I went to one open house long ago where the school was talking about their test results and everyone was studying just the slope of the data. I was the only one looking at the actual questions in the test. They were astoundingly simple but nobody was looking at the raw data.

Catherine Johnson said...

Brett - if you're around - how does value added deal with tutoring and parent reteaching?

I understand how value added subtracts out SES, but I don't see how value added can subtract out parent remediation of weak teachers.

If Teacher X has 10 kids being tutored while Teacher Y has no kids being tutored, doesn't that affect the apparent quality of Teacher X as compared to Teacher Y?

Doug Sundseth said...

In social science, it's nearly impossible to conduct any study of significant size without numerous simplifying assumptions. Sometimes one of those will turn out to make the results of the study inaccurate, of course.

It seems (at first blush, anyway) reasonable to assume that tutoring in one population is likely to be relatively comparable to tutoring in another population with the same demographics. It may be reasonable to use SES as a proxy for demographics. While not always true, I'd be willing to make those assumptions ab initio.

I would place the burden of proof on those who would attribute statistically significant variations in outcome between large groups of students from similar communities to anything other than those students' schools. (I believe the legal term of art is "rebuttable presumption".)

Brett Pawlowski said...

Steve - I appreciate what you're saying, and I agree that value-added doesn't give you a window into specifically what children know. That's the function of the state assessments - I've heard people explain it by saying that state assessments are the mile markers, while value-added is the speedometer.

And you're right on the money in saying that progress is relative - I wish I knew a way around this. Being the best of a lousy lot would still be lousy.

Catherine - Value-added cannot factor in the impact of non-school influences that are introduced midstream. Therefore, improvements due to the introduction of tutoring, or heavier parental support, would show up as gains for the school. Similarly, other influences, like extended illnesses, would also impact the school's scores.

One thing that helps with this level of value-added analysis, however, is the fact that it includes the scores of every student in the school (including ESL and special education - anyone who takes the state assessment). If we're only talking about a few kids who get outside help, it won't move the school's score much.
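A quick back-of-the-envelope check of that dilution point (hypothetical numbers):

```python
# Hypothetical numbers: a handful of tutored students barely moves a school-wide average gain.
tutored, untutored = 10, 290            # students in a school of 300
extra_gain_from_tutoring = 5.0          # extra NCE points attributable to tutoring, say

shift_in_school_average = tutored * extra_gain_from_tutoring / (tutored + untutored)
print(round(shift_in_school_average, 2))  # 0.17 NCE points added to the school's average
```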

Catherine Johnson said...

Brett--

Thanks!

I've been trying to figure this one out forever.

I'm very keen on value-added....but I'm worried about how well it will work in "high-performing" districts like mine, where so much of the responsibility for achievement is off-loaded onto the parents and students.

I'll email you the Richard Elmore PowerPoint Concerned Parent found.

SteveH said...

"I've heard people explain it by saying that state assessments are the mile markers, while value-added is the speedometer."

I would call it the y-intercept and slope.


Your original question was:

"Essentially - is this the path to real school improvement, or is this mailing going to fall on deaf ears?"

I was trying to explain that parents need some way to understand the scaling of that graph. One person might scale the y-axis so the line looks flat, while another might make it look steep. There has to be a direct and simple correlation between test results and the graph.

In our state, the math test (designed and calibrated by teachers) uses some sort of technique for deriving attributes like problem solving and basic numeracy out of each problem. Parents can't look at sample raw results and figure it out for themselves. It's a complicated process and the details are not published.

There is no need for this. Tests can be designed to evaluate individual, required skills. Parents should be able to examine the actual questions and see the raw results in terms of percent correct. With the raw scores, parents can and will look at the questions and determine for themselves whether the tests are easy or hard. Most modern scoring results are far removed from the raw data and encourage (on purpose?) the examination of uncalibrated numbers.

It's amazing to see how really bad raw percent correct data get manipulated into final percent numbers that are in the 80's and 90's. Notice the TCAP table with almost all of the percents in the 90's. Even school 23 at the bottom of the red zone for average yearly achievement gain has TCAP scores of 93 and 94 percent. With numbers like that, how could the slope be much of a problem? Well, it's in the red zone. What, exactly, does that mean? Is it common to flip from the green zone to the red zone each year?

You could get a school that is operating at the top with little room for improvement. Its slope might be flat or slightly negative at times. If you look at the very high values of the TCAP results, then who's going to worry much about slope, other than the fact that a school might be in the red zone and red is bad? People will argue, but they won't really know what they're talking about. Indian Lake Elementary is a great example: its TCAP scores are 98 and 99 percent, but its slope is getting near the red zone.


States should send home actual tests giving the answers and raw percent correct results right after each question. They should also provide a table giving the results over many years side-by-side with other schools.
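For what it's worth, here is a sketch (with made-up numbers) of the kind of report being described: percent correct per skill, plus a simple multi-year, school-by-school table:

```python
# Sketch of the kind of raw reporting described above; all data is hypothetical.

skill_results = {  # skill -> (number of correct answers, number of attempts)
    "multiplication facts": (5200, 10000),
    "fractions":            (4100, 10000),
    "word problems":        (3600, 10000),
}

for skill, (correct, attempted) in skill_results.items():
    print(f"{skill:22s} {100 * correct / attempted:5.1f}% correct")

yearly = {  # school -> percent correct by year
    "School A": {2005: 48, 2006: 51, 2007: 52},
    "School B": {2005: 71, 2006: 73, 2007: 76},
}

years = sorted(next(iter(yearly.values())))
print("school     " + "  ".join(str(y) for y in years))
for school, scores in yearly.items():
    print(f"{school:10s} " + "  ".join(f"{scores[y]:3d}%" for y in years))
```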