kitchen table math, the sequel: what is an average, anyway?

Sunday, August 29, 2010

what is an average, anyway?

A friend of mine asked me to walk her through mean, median, and mode --- and it came to me, thinking about it, that I don't exactly know what an average is beyond the obvious.

Another issue: I'm so sick of my own child (& everyone else's) being lost in group means that I've come to feel some real antipathy towards the very concept of a group average. Meanwhile the concept of a personal average makes sense and seems obviously useful ----

Here are my questions.

What is useful about averages? 

What do averages tell us?

And why did the calculation of averages come to be so important culturally?

5 comments:

Anonymous said...

Mean and median are both "measures of central tendency". They are ways of summarizing a large set of data in a quick way that simplifies the data and removes noise.

If you know that the average rent for office space is $4 a square foot, and you need 1000 square feet per person, then you can quickly judge whether you can afford an office for your staff, without having to go through hundreds of real-estate listings. Of course, you also need to know something about the variance, and not just the mean, as there may be a few not-so-nice places that are much cheaper than average.

Catherine Johnson said...

summarizing a large set of data in a quick way that simplifies the data and removes noise

That's exactly what I was looking for.

I just came back from a walk with a friend who worked in financial realms for many years; that's exactly what she said. Means are a way of simplifying data.

Of course I 'know' that ---- but I couldn't put it into words.

This is analogous to what happens with SAT vocabulary. I 'know' all the words in the Princeton Review set - but I can't define them for someone else!

It's strange, and I don't understand it.

I'm actually having to memorize definitions ---- !

Anonymous said...

One thing I never realized until I started teaching is why we calculate the sum, then divide by the number of numbers to find the mean. We're actually finding how much each person would get if the items were distributed equally. So if three people had 3 apples, 5 apples, and 10 apples, combining them would give 18 apples. Dividing by 3 gives 6 apples per person if the apples were shared equally. This also shows why we use the median rather than the mean for items like income: a few outliers can pull the equal-share figure far away from what a typical person actually has.
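The "equal share" idea can be checked with a few lines of Python. This is only a sketch: the apple counts come from the comment above, while the income figures are made up purely to illustrate how one outlier pulls the mean away from the median.

```python
from statistics import mean, median

# Apples from the example above: three people with 3, 5, and 10 apples.
apples = [3, 5, 10]
total = sum(apples)                 # all the apples pooled together
equal_share = total / len(apples)   # what each person gets if shared equally

# Hypothetical incomes (invented for illustration): four typical earners
# and one very high earner.
incomes = [30_000, 35_000, 40_000, 45_000, 1_000_000]
mean_income = mean(incomes)         # the equal-share figure, pulled up by the outlier
median_income = median(incomes)     # the middle value, unaffected by the outlier

print(total, equal_share)           # 18 6.0
print(mean_income, median_income)   # 230000 40000
```

In the made-up income list, the mean is 230,000 even though four of the five people earn 45,000 or less, which is why the median is the more common summary for income.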

ChemProf said...

A lot of the fascination with averages in education came from the IQ research in the 50's. If you have enough students, then many of the parameters you might want to measure will come close to a normal, or Gaussian, distribution. For example, when I TA'd Gen Chem at Berkeley, with 1500 students, the final scores fit a Gaussian amazingly well (although there was a sharp cutoff at the high end, since you can't get over 100%). If your data fits a true Gaussian, then the mean, median, and mode are all the same. To the extent that they aren't the same, the difference is telling you something about how far you are from a Gaussian distribution.

However, with small numbers, these kinds of statistics aren't useful. In a class of 30 students, you aren't going to be close to a normal distribution, and even in a class of 1500, you probably will have some deviation. Also, averages only tell you about the mid-range of the population. They can be useful for figuring out where most students will be (since, in a roughly normal distribution, about 2/3 of students will be within one standard deviation of the average and about 95% will be within two), but population averages tell you nothing about a particular student.
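For anyone who wants to check the "2/3 within one standard deviation, 95% within two" figures, here is a minimal sketch using Python's standard-library `NormalDist`. The mean of 70 and standard deviation of 10 are hypothetical exam numbers chosen for illustration, and `share_within` is a helper name invented here. The shares come out the same for any normal distribution.

```python
from statistics import NormalDist

# Hypothetical exam scores: an idealized normal distribution with
# mean 70 and standard deviation 10 (numbers assumed for illustration).
scores = NormalDist(mu=70, sigma=10)

def share_within(dist, k):
    """Fraction of a normal distribution within k standard deviations of its mean."""
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    return dist.cdf(high) - dist.cdf(low)

within_one = share_within(scores, 1)   # roughly 0.68, i.e. about 2/3
within_two = share_within(scores, 2)   # roughly 0.95

# For a true Gaussian, the three "averages" coincide.
print(scores.mean, scores.median, scores.mode)
print(round(within_one, 3), round(within_two, 3))
```

Real class scores will only approximate these shares, and less so the smaller the class.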

Glen said...

Catherine, you're right to be skeptical of averages. Averages are meant to simplify and clarify a description of a group by throwing out the details of individual or sub-group variation and describing the central tendency of the whole group. If the whole group has anything other than ONE stable, central tendency, the average will be misleading--perhaps outrageously so.

There are situations with NO central tendency. You can always calculate an average of any sample, but it won't tell you anything about the mean of the population if the population mean doesn't exist. (You can average any sample of positive numbers, so what's the value of the "average" positive number?)

There are samples generated by unstable processes, where each individual value is generated by a different process. If you average a sample of those values, what is the average describing?

And there are plenty of situations with multiple central tendencies. Suppose IQ varies by ethnicity, each with its own central tendency, and you measure the average IQ of multiethnic ABC High School each year for ten years and find it going down. So someone is apparently getting dumber. Who? If you average each of the ethnic subgroups to find out, you could find that each one is getting a little smarter each year. Huh? (Each subgroup gets smarter, but the lowest-scoring subgroups make up a higher percentage of the sample each year: every part goes up while the "average" goes down!)
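A small numeric sketch shows how every subgroup can go up while the overall average goes down. The two subgroups, their mean scores, and their headcounts below are all invented for illustration. Nothing here is real data, and `overall_mean` is just a helper name made up for this example.

```python
def overall_mean(groups):
    """Headcount-weighted mean across subgroups given as {name: (mean, count)}."""
    total = sum(group_mean * count for group_mean, count in groups.values())
    students = sum(count for _, count in groups.values())
    return total / students

# Hypothetical school with two subgroups, A and B: (mean score, number of students).
year_1 = {"A": (110, 80), "B": (90, 20)}
year_10 = {"A": (112, 40), "B": (92, 60)}   # both subgroup means rose by 2 points

print(overall_mean(year_1))    # 106.0
print(overall_mean(year_10))   # 100.0
```

Both subgroups gained 2 points, yet the school-wide average fell from 106 to 100, because the lower-scoring subgroup grew from 20% of the students to 60%. The drop reflects the changing mix, not a decline in anyone's scores.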

This is the sort of trouble you get when you take the average of so many real-world phenomena, because "average" includes an unstated claim that there is a single, stable central tendency, and often there isn't. If there isn't, then I don't think that average means what you think it means.