Thursday, April 4, 2013

automated essay grading

In the Times today:

Essay-Grading Software Offers Professors a Break By JOHN MARKOFF
Published: April 4, 2013 | New York Times

I'm actually in favor of essay-grading software, in theory. I've been interested in automated essay scoring ever since reading Richard Hudson's paper "Measuring Maturity in Writing" (which I need to re-read, so nothing more on that at the moment):
Abstract
The chapter reviews the anglophone research literature on the 'formal' differences (identifiable in terms of grammatical or lexical patterns) between relatively mature and relatively immature writing (where maturity can be defined in terms of independent characteristics including the writer's age and examiners' gradings of quality). The measures involve aspects of vocabulary as well as both broad and detailed patterns of syntax. In vocabulary, maturity correlates not only with familiar measures of lexical diversity, sophistication and density, but also with 'nouniness' (not to be confused with 'nominality'), the proportion of word tokens that are nouns. In syntax, it correlates not only with broad measures such as T-unit length and subordination (versus coordination), but also with the use of more specific patterns such as apposition. At present these measures are empirically grounded but have no satisfactory theoretical explanation, but we can be sure that the eventual explanation will involve mental growth in at least two areas: working memory capacity and knowledge of language.
Maturity of writing, in this sense, can be measured by software, and I would be using automated scoring software myself if I could buy it on Amazon. EdX says it's giving its software away free to 'institutions' (does that leave out individuals?), so I'll have to see whether my department might throw its hat in the ring.
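Just to make "measured by software" concrete: here is a minimal sketch of how two of the measures Hudson describes, lexical diversity and 'nouniness', could be computed for a piece of text. It assumes Python with NLTK and its stock English tokenizer and part-of-speech tagger, which are my own illustrative choices, not anything EdX or any vendor has said it uses; it's a sketch of the idea, not anyone's actual scoring engine.

    # Sketch: two of Hudson's 'maturity' measures, computed with NLTK.
    # Assumes the 'punkt' tokenizer and the perceptron POS tagger models
    # have already been fetched via nltk.download(); these are illustrative
    # tool choices, not what any commercial scoring engine actually uses.
    import nltk

    def maturity_measures(text):
        # Keep only alphabetic word tokens (drop punctuation and numbers).
        tokens = [t for t in nltk.word_tokenize(text) if t.isalpha()]
        tagged = nltk.pos_tag(tokens)

        # Lexical diversity: distinct word types divided by total word tokens.
        diversity = len({t.lower() for t in tokens}) / len(tokens)

        # 'Nouniness': proportion of word tokens tagged as nouns (NN, NNS, NNP, NNPS).
        nouns = sum(1 for _, tag in tagged if tag.startswith('NN'))

        return {'lexical_diversity': diversity, 'nouniness': nouns / len(tokens)}

    print(maturity_measures("The committee's analysis of the proposal revealed several flaws."))

Measures like T-unit length or subordination versus coordination would need a syntactic parse rather than just a tagger, which is harder, but still well within what off-the-shelf NLP tools can do.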

That said, a lot of this is nonsense:
Anant Agarwal, an electrical engineer who is president of EdX, predicted that the instant-grading software would be a useful pedagogical tool, enabling students to take tests and write essays over and over and improve the quality of their answers. He said the technology would offer distinct advantages over the traditional classroom system, where students often wait days or weeks for grades.

[snip]

“It allows students to get immediate feedback on their work, so that learning turns into a game, with students naturally gravitating toward resubmitting the work until they get it right,” said Daphne Koller, a computer scientist and a founder of Coursera.

[snip]

“One of our focuses is to help kids learn how to think critically,” said Victor Vuchic, a program officer at the Hewlett Foundation. “It’s probably impossible to do that with multiple-choice tests. The challenge is that this requires human graders, and so they cost a lot more and they take a lot more time.”
None of these things is going to happen. Students aren't going to write essay responses "over and over again"; if they do write essay responses over and over again, it's not going to feel like a fun game; and nobody's going to learn to think critically from automated essay-scoring software.

Oy.

4 comments:

orangemath said...

To some extent, Intellimetric and CTE Writer (embedded within Edgenuity and Odysseyware) offer this "type" of software. I've used it as a grading tool, sometimes choosing a few essays to look at semi-randomly. (I teach math as a rule.) Yes, the software is easy to game, but knowing that the teacher may check helps.

If I used my time more wisely, the software could handle grammar, etc., and then I could review the essays for content and coherence.

This is a tough area. I really doubt that edX has it solved. The College Board failed in this effort. Getting to 100% effectiveness is hard.

PS: if you use Chrome, install the "Ginger" extension. You can see a basic version of this type of software in your own work.

Anonymous said...

"Students aren't going to write essay responses "over and over again;""

Yes, they will: they'll make random changes until they trick the software. The resulting essay will be far worse than what they started with, but they won't care (and may not even realize it).

cranberry said...

From humanreaders.org:

Let's face the realities of automatic essay scoring. Computers cannot “read.” They cannot measure the essentials of effective written communication: accuracy, reasoning, adequacy of evidence, good sense, ethical stance, convincing argument, meaningful organization, clarity, and veracity, among others. Independent and industry studies show that by its nature computerized essay rating is:
- trivial, rating essays only on surface features such as word size, topic vocabulary, and essay length
- reductive, handling extended prose written only at a grade-school level
- inaccurate, missing much error in student writing and finding much error where it does not exist
- undiagnostic, correlating hardly at all with subsequent writing performance
- unfair, discriminating against minority groups and second-language writers
- secretive, with testing companies blocking independent research into their products
(See Research Findings supporting these claims.)
In sum, current machine scoring of essays is not defensible, even when procedures pair human and computer raters. It should not be used in any decision affecting a person’s life or livelihood and should be discontinued for all large-scale assessment purposes.

SATVerbalTutor. said...

I couldn't resist posting about this as well: http://thecriticalreader.com/blog/item/343-essay-grading-software-gives-professors-a-break-from-what?.html