I've emailed palisadesk & now am posting.
Here is a reading issue I've seen before: a student who, reading out loud, replaces prepositions with other prepositions, sometimes reads verbs incorrectly (e.g. "walk" for "walking" or vice versa), and sometimes reads the wrong pronoun (e.g. "she" instead of "he" or vice versa).
It's as if the student is predicting the next word he or she expects to read, reading that word out loud, and not noticing that the prediction was wrong.
What is going on in cases like these?
And what does one do about it?
45 comments:
This type of prediction is fundamental to human cognition and is especially noticeable in language. It is why, when someone else abruptly pauses or gets stuck on an awkward pronunciation, you feel an urge to finish what they were about to say. It's because it is what YOU were about to say in your mind. Electromyograms of the vocal tract show that you mirror the speaker and suggest that in your native language you may follow them so closely that if they suddenly stop, you almost run over them as you keep going.
This lookahead feature of cognition is probably an evolved mechanism for dealing with real-time phenomena. If you had to rely entirely on blank-slate processing of incoming percepts, you would be too far behind. Instead, you predict what is coming (often making multiple predictions, each with its own estimated likelihood), and you use your own predictions as evidence, alongside the incoming images and sounds.
Sometimes, especially when rushed, you give your predictions even more evidentiary weight than the incoming percepts that may take too long to process thoroughly. If you're a fast talker but a slow reader, you may feel the need to glance at the first letter or two, match it against your predictions of what you would probably say next, pick the most likely candidate, and move on in an effort to maintain a rate of speech you consider acceptable. We all do this mixing of predictions and percepts, but those of us who read faster don't have to rely as much on our predictions to keep up with the speed of speech and can put more weight on the visual evidence, so it isn't obvious.
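To make that weighting idea concrete, here is a toy Python sketch (all names and numbers are made up): the reader scores candidate next words by blending a prediction from context with a quick glance at the print, and says whichever candidate scores highest. When rushed, the prediction weight goes up and the glance counts for less.

```python
def choose_word(prediction_scores, percept_scores, w_prediction=0.7):
    """Blend context-based predictions with the visual evidence from a glance.

    A higher w_prediction means the reader leans more on what they expected
    to see and less on what is actually on the page.
    """
    candidates = set(prediction_scores) | set(percept_scores)
    combined = {
        word: w_prediction * prediction_scores.get(word, 0.0)
              + (1 - w_prediction) * percept_scores.get(word, 0.0)
        for word in candidates
    }
    return max(combined, key=combined.get)

# "He was ___": context favors 'walking'; a glance at "walk..." fits two candidates.
prediction = {"walking": 0.6, "walked": 0.3, "winning": 0.1}
percept    = {"walking": 0.5, "walked": 0.5}
print(choose_word(prediction, percept))   # -> 'walking'
```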
Since I read faster than I speak in English, you wouldn't notice me making many mistakes when reading aloud, and I would catch almost all that I made. But in Japanese, I speak faster than I read, and I make these sorts of mistakes all the time. And in a language that I speak poorly, I have so little ability to predict that I am forced to slow way down and read only what I see. (Second language learners always go through this last one, but native speakers start off speaking at a normal pace before they learn to read, so their situation is different.)
The solution? 1) Reading fluency: get faster at reading than you are at speaking, and 2) Speaking fluency: train yourself in the patterns of the language so you make better predictions, which require less evidence from the writing for your choices to be correct.
I don't know much about the teaching reading part, but I know I do this with my writing. While proofreading (assuming, of course, that I take the time to proofread), I'll often miss a mistake because I read the word I meant to type, not the one that's actually there.
...so... this kind of thing feels very, very common. But I hadn't thought about the mechanism behind it. That is very fascinating! The ability to predict what's coming next is huge...
~Luke
Yes, absolutely, the human brain is built to predict, not process (not exactly -- obviously we **can** process).
As I understand it, the reason AI never worked out is that people kept building computers that could process more and more stuff. Brains don't do that. Brains process as little as possible, then make a rapid prediction.
Jeff Hawkins is trying to create a computer that predicts ... which I actually find a bit scary!
(I don't know that there's any reason to find it scary -- haven't thought it through)
Still waiting on palisadesk....(hint hint)
And Erica & Jen....
btw, "reward prediction error" probably explains a lot of the weirdness with the Times reporting on the July jobs report.
The number was greater than expected, which produces a reward prediction error inside the brain, which produces a big, juicy burst of dopamine.
Unexpected good news is always great news.
You probably have to let all that extra dopamine get sopped up by your receptors before you can realize the jobs report was the same news as always.
"As I understand it, the reason AI never worked out is that people kept building computers that could process more and more stuff. Brains don't do that. Brains process as little as possible, then make a rapid prediction."
There is a *LOT* more to it than that :-)
One (of many!) aspects is that it is a lot easier to do something when you know how ... so, as an example, I'll ask a question: How powerful a computer do you need for a program to play Master level chess?
It turns out that the answer is: not very much! You can purchase, today, machines that play Master level chess that have less compute ability than the original IBM PC, which came out in 1981.
But you could *not* get this software in 1981. In 1983, the fastest chess computer in the world was Belle, a machine built by Bell Labs only to play chess. It was about as good as the programs you can run today on a 1981 IBM PC.
This isn't/wasn't because the guys didn't know how to look ahead and make predictions. It is/was because figuring out how to do something very new takes a long time!
Another issue (again, one of many) is that once we know how to do something, it is no longer considered artificial intelligence :-)
So the chess playing software *was* AI in the 1950s - 1980s, but somewhere in the 1990s playing chess was no longer AI.
-Mark Roulo
There is a *LOT* more to it than that :-)
No doubt!
That's Hawkins's argument, btw. The model AI people had of intelligence - or of cognition, I guess I should say - is that the brain is a 'processor.'
But it's not, really.
The brain is a predictor.
This all goes back to precision teaching & fluency -- one of the things you achieve via fluency aims is rapid, nonconscious prediction.
Rapid and accurate nonconscious prediction, I should have said.
Chess playing software works via processing, not predicting.
That's not the way the brain does it.
I don't know this particular student, nor much of anything about him except I did infer from Catherine's message to me that he has been a struggling reader for some time. This likely rules out the type of "prediction" that others here are discussing -- predictions based on sophisticated language competency and rapid ideation. You do find occasional strong readers whose oral reading contains significant errors because they are reading so far ahead of the printed words and they can't keep up, so to speak... but this is relatively rare. In controlled studies, competent university-level students read complex text aloud with near-perfect accuracy -- the modal number of errors is zero (so much for "miscues" -- a flawed hypothesis accepted as an axiom throughout elementary education).
What we see with struggling readers is something different, but although there is topographical similarity in the behavior, the variables involved are not always the same. Young readers who make multiple errors of this sort often have one or more of the following issues:
(1) weak phonemic decoding skills. The more they are dependent on memorized "sight words" the less likely they are to get variations on the word accurately (like, liking, likable, unlikely etc.) When tested, some of these students have very poor word-attack skills and phonic knowledge.
(2) limited working memory -- thus they are tripped up in reading longer sentences or clauses by an inability to hold all the components in mind, so that changes in verb endings, singular/plural or temporal indicators don't tweak their "wait a minute, that doesn't make sense" response.
(3) visual-spatial issues. MANY children have difficulty tracking print easily and automatically. These visual efficiency skills are developmental and improve with age, but need to be nurtured and reinforced. Lack of close monitoring of what children do when reading in class encourages a number of students to look all over the place for "clues" for "word-solving" -- at the pictures, at the beginning or ending of a word, searching for "little words" in the "big word," literally, eyes jumping all over the place. Smooth l-r progression is not established.
(4) impulsivity. Impulsivity doesn't cause this problem, but it can impede its solution, because the student is anxious to do the task quickly, often non-reflectively (thus not picking up on obvious errors)
(5) lack of automaticity in component skills, so that the reader is not able to juggle awareness of meaning at the same time s/he is reading the text. This often relates to some of the issues already mentioned.
(6) real visual processing issues. Unfortunately, the whole field of "vision therapy" is rife with quackery and a lack of empirically validated data, but there is some evidence to support the contention that in an indeterminate number of cases, skills such as accommodation, convergence, tracking, etc. can be significantly improved by focused practice. Regular ophthalmologic eye exams do not test for these skills, and we have no real data on how many kids would benefit from this type of intervention (which is expensive and not covered by most medical plans). However, I have seen 2 cases where excellent teaching and well-above-average language skills and effort were not getting results until the student underwent some remedial exercises for visual efficiency skills under the direction of a developmental optometrist. Unfortunately it's definitely a "buyer beware" area. There is some useful general information at childrensvision.com.
Now, WHAT to do about it is a separate issue.... a more practical one. Suggestions forthcoming.....
So glad to see palisadesk has weighed in.
Haven't read yet, but wanted to pull out this observation:
In controlled studies, competent university-level students read complex text aloud with near-perfect accuracy
Right --- and that, I believe, is what people are talking about here.
The reason a good reader can read so rapidly is that he/she can rapidly predict what word comes next, and be correct.
As far as I can tell, fluent performance in every realm is based in rapid, nonconscious prediction.
The famous Wayne Gretzky line about playing where the puck is going to be is true of all proficient performance. (His line about merely good skaters playing where the puck **is** also requires rapid, nonconscious prediction.)
I have a theory that this goes hand-in-hand with spelling. If a reader is more careful reading words, they are likely to see the spelling and have it sink in--these are more likely to be natural spellers.
On the other end, are readers who quickly jump to conclusions about a word and move on to the next, without actually looking at the letters and seeing the spelling. These are the people for whom spelling would come with much more difficulty.
My nephew (who's 10) and I are definitely in the latter camp. When we were working on teaching him to read, there was a long period of time when his biggest problem was jumping to a conclusion about a word too quickly, and not actually reading the word on the page. He's now turned out to be a lousy speller. I'm a lousy speller, but I have no idea if I was quick to assume I knew the word or not.
limited working memory -- thus they are tripped up in reading longer sentences or clauses by an inability to hold all the components in mind, so that changes in verb endings, singular/plural or temporal indicators don't tweak their "wait a minute, that doesn't make sense" response....
impulsivity. Impulsivity doesn't cause this problem, but it can impede its solution, because the student is anxious to do the task quickly, often non-reflectively (thus not picking up on obvious errors)
My 2nd child has been diagnosed with ADHD, and when he's reading aloud he makes these kinds of mistakes even though he's an excellent decoder. I absolutely attribute them to impulsivity and not such a great working memory. When this happens, I make him stop and slowly re-read the sentence.
I'd love to do some CogMed training with him but it's not covered by our insurance and isn't in our budget to pay for out-of-pocket.
"The reason a good reader can read so rapidly is that he/she can rapidly predict what word comes next, and be correct."
This is not what the research has shown.
Gough and Wren found that competent university students were able to predict upcoming content words only 10% of the time.
Stanovich and West set out to confirm Frank Smith's theory that good readers are good because of their excellent use of context. Yet these researchers found that the opposite is true: it is the poor readers who rely on context and the good readers who decode so quickly and automatically that they do not need to use context.
Also, the good readers tend to be much better at reading words out of context (e.g. in lists) than poor readers.
Automatic decoding is crucial to reading speed... and to comprehension of what is read. Context is used by skilled readers to determine the meaning of unfamiliar words, not to determine what the words are.
This is not what the research has shown.
Gough and Wren found that competent university students were able to predict upcoming content words only 10% of the time.
10% is 10%. If you can't predict any words in a text, you're going to read slower -- and I suspect you're going to use up more energy stores, but that is just a guess.
I think Stanovich and West's research has probably been superseded by the NYU study:
Pelli and Tillman's results show that letter-by-letter decoding, or phonics, is the dominant reading process, accounting for 62 percent of reading speed. However, both holistic word recognition (16 percent) and whole-language processes (22 percent) do contribute substantially to reading speed. Remarkably, the results show that the contributions of these three processes to reading speed are additive. The contribution of each process to reading speed is the same whether the other processes are working or not.
<a href="http://www.sciencedaily.com/releases/2007/08/070801091500.htm">Phonics, Whole-Word And Whole-Language Processes Add Up To Determine Reading Speed, Study Shows</a>
As I recall, "whole-language processes" are things like knowing that the next word after a preposition has to be a noun --- I **think** 'whole-language processes' essentially means knowing phrase & clause constituents.
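The study's "additive" claim is easy to illustrate with a few lines of Python using the percentages quoted above (the words-per-minute figure is hypothetical, just to make the arithmetic visible): knock out any one process and, per Pelli and Tillman, reading speed drops by exactly that process's share, while the other two contribute the same as before.

```python
full_speed = 300  # hypothetical skilled-adult reading speed, words per minute

# Shares of reading speed reported by Pelli and Tillman
contributions = {
    "letter-by-letter decoding (phonics)": 0.62 * full_speed,
    "holistic word recognition":           0.16 * full_speed,
    "whole-language (sentence context)":   0.22 * full_speed,
}

for name, share in contributions.items():
    print(f"without {name}: ~{full_speed - share:.0f} wpm")
print(f"all three together: {sum(contributions.values()):.0f} wpm")  # 300 wpm
```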
What I'm interested in with readers like the one I described is .... well, two things, I guess:
* why they make the wrong prediction (for instance, coming up with 'she' when the person being referred to is male)
* why they don't notice they're wrong
++++++
I should add that I'm pretty sure I've seen good readers put their own words into a text, but in their case the words, while wrong, make sense & don't change the meaning.
I think Stanovich and West's research has probably been superseded by the NYU study:
Be very very careful with that line of reasoning. Pelli and Tillman's study was on the reading skills of adults, while most of Stanovich and West's work (they did several important studies) was on children.
A key feature of Whole Language ideology is that we project onto very young children the characteristics of mature, accomplished readers, writers and speakers. We want Kindergarten children who can't hold a pencil properly to draft, edit, revise and "publish" as if they were college students. There is a clear cognitive error here.
We cannot use the demonstrated behaviors and repertoires of skilled adults as a guide to what young learners may or may not be doing.
Very young poor readers are doing very little "predicting" -- they are stumbling through the text with a mix of half-baked skills and guesswork. In fact in the early days Engelmann referred to children who made these kinds of errors as "stumblebum readers," and blamed their instruction for this behavior, which Corrective Reading was developed to, well, correct. Later on many different levels of CR were added but the first ones addressed this specific problem. Effectively I might add.
What I'm interested in with readers like the one I described is .... well, two things, I guess:
* why they make the wrong prediction (for instance, coming up with 'she' when the person being referred to is male)
The student in question is not "predicting" -- I'll bet the bank on that. He's doing something else altogether.
With respect to the efficacy of prediction in lieu of decoding (prediction is for MEANING, not word identification), consider this from http://early-reading.com/30-years-of-research/seven-key-principles/:
Text that is less decodable requires the children to use prediction or context to figure out words. Much research has evaluated the effectiveness of prediction as a strategy for word recognition. Though prediction is valuable in comprehension for predicting the next event or predicting an outcome, the research indicates that it is not useful in word recognition. The following passage is a sample of authentic text (from Jack London). The parts of the text that are omitted are the parts that a child was unable to decode accurately. The child was able to decode approximately 80% of the text. If prediction is a useful strategy, a good reader should be able to read this easily with understanding:
He had never seen dogs fight as these w__ish c__ f__t, and his first ex____ t____t him an unf______able l_____n. It is true, it was a vi___ ex______, else he would not have lived to pr__it by it. Curly was the v____. They were camped near the log store, where she, in her friend_ way, made ad_____ to a husky dog the size of a full-____ wolf, th____ not half so large as _he. __ere was no w__ing, only a leap in like a flash, a met__ clip of teeth, a leap out equal_ swift, and Curly's face was ripped open from eye to jaw.
It was the wolf manner of fight__, to st___ and leap away; but here was more to it than this. Th____ or forty huskies ran -o the spot and not com___d that s____t circle. But did not com____d that s____t in_____, not the e___ way with which they were licking their chops. Curly rushed her ant_____,who struck again and leaped aside. He met her next rush with his chest, in a p_____ fash___ that tum___ed her off her feet. She never re____ed them. This was __at the on____ing huskies had w_____ for.
The use of predictable text, rather than this authentic text, might allow children to use prediction to figure out a passage. However, this strategy would not transfer to real reading, as the above passage demonstrates. Predictable text gives children false success. While this false success may be motivating for many children, ultimately they will not be successful readers if they rely on text predictability to read.
Be very very careful with that line of reasoning. Pelli and Tillman's study was on the reading skills of adults
Right.
I'm talking about the reading skills of adults.
As I understand the basal ganglia & proficient performance in any realm, prediction becomes possible with fluency; it's the result of fluency.
btw, the work on experts, on how experts make decisions, is relevant here.
People used to think that experts, faced with a problem they needed to solve, generated options and ran a little comparison-and-contrast algorithm. Chess players generated options & compared; firefighters generated options & compared; etc.
Now researchers believe that's not the case.
Experts generate just ONE possibility. That's it. (Then I think the idea is that their brains **do** run a little simulation...and if the simulation pans out, that's it. That's the decision they make.)
Basically, brains are built to automate everything they possibly can automate, and automation results in the brain functioning as a prediction machine rather than a processing machine.
At least, that's the way I understand it at the moment.
Prediction is for words, too. Pretty sure. I just found the study...this is interesting:
Readers can predict the next word in a passage 20 to 35% of the time, depending on their reading experience [19,22].
They're citing Stanovich & Gough .... I have the Stanovich. I'll try to see what he says.
20% to 35% of the time is what I would have predicted, not 10%. (Didn't someone score 35% on SAT reading just reading the questions? Was it Phil Keller?)
"However, both holistic word recognition (16 percent) and whole-language processes (22 percent) do contribute substantially to reading speed."
The article you reference does not give me enough information about the three "adult" readers involved in the study. Adult readers range in skill. Were all three readers in the study excellent readers? Or were some simply "decent"? If these were "decent" readers, perhaps they use context more than excellent readers do. I feel certain that this is the case with the K-6 students I work with.
I would imagine their subjects were NYU students, so they'd be very good readers.
Context just means grammar, and I'm 100% positive really good readers have a really good sense of syntax, certainly good enough to know that, say, a participle comes after an auxiliary verb.
These authors are looking at object perception: how do people know, very rapidly, what the thing they're looking at is?
That's all.
They're not looking at comprehension.
They found that proficient readers use knowledge of sentence structure to read faster.
btw, I think grammar is one of the radically underappreciated -- and under-researched -- aspects of reading.
The phrase "whole language processes" in this study means grammar.
I have a theory that this goes hand-in-hand with spelling. If a reader is more careful reading words, they are likely to see the spelling and have it sink in--these are more likely to be natural spellers.
I believe it!
Very interesting.
Must read all the rest of the comments closely ---- & get palisadesk's pulled up front ---
I wonder whether fast, accurate reading coincides with good spelling coincides with good copy editing.
I'm a pretty decent copy editor (I think!); tiny little errors in spelling & punctuation tend to jump out at me (though my online copy editing skills are declining, I fear...EYESTRAIN)
I'm looking through Stanovich now, and I don't find a single reference to grammar!
It's really pretty remarkable when you realize that language is grammar .... not words per se.
Autistic kids with severe language disorders have words.
What they don't have is grammar.
I get the sense that grammar has simply been assumed in a lot of reading research.
Arthur Whimbey & David Mulroy are the only people I've seen who show students failing to comprehend text because they can't parse the grammar.
If prediction is a useful strategy, a good reader should be able to read this easily
I don't see how the premise implies the conclusion. It sounds like the claim that if drills were useful, a good carpenter would be able to build a birdhouse easily with just a drill.
But maybe what you mean is just that explicitly teaching kids to deal with unrecognized words by guessing at them without attempting to phonetically parse them is a terrible approach. If so, I totally agree with you.
On the off chance that you mean that the unconscious, emergent ability to predict words is not useful, I would disagree, even though it won't single-handedly fill all gaps. Some ability to predict underlies the superior fluency of good readers. You cn read this sentecne withhot too much trebble despit mispsellings that dan't even match the pronunciation, becuse strategies other than phonetic parsing contribute usefully to reading.
I went to college with a lot of folks who were fast, accurate readers and lousy spellers. The common thread appeared to be that these people were very early readers, and they learned to read by shape rather than really phonetically. They also often had really good memories, so the "here are ten random words learn them" method of spelling teaching never stuck. They'd remember the words for the test and forget them a week later. The student I've talked about before who had trouble with "gly" words in biochem fit this pattern.
Which is why I started doing phonics with my three year old once she started to follow this pattern, and why I really wish I could find a school that used All About Spelling. (and Singapore Math, but that's really dreaming in my neck of the woods).
Auntie Ann's theory may be true in her case, but it is not valid for the general case. Nor, ironically, does good phonics instruction for reading necessarily carry over to spelling.
I have had students (and this phenomenon is well reported on in the literature) who were taught to read through a code-based approach, were proficient readers, yet whose spelling was horrific. IQ was not a significant variable, either (my best speller of all time had an IQ of 45). In middle school, some of these students were still regularly writing "uv" and "enuf" for words they had seen tens of thousands of times.
It turns out that reading/decoding and spelling/encoding call on different skill sets and types of memory, and employ different neurological circuits. Good natural spellers have at least two things in common: an excellent visual symbol memory (not the same as memory for images or designs) and very well developed auditory discrimination and retrieval. Curiously, some excellent spellers can't read what they spell! I've had several students like that, too. So much more than just "paying attention" or knowledge of phonic skills is involved in spelling competence.
While we have a good handle on how to teach basic reading/decoding skills to most (not all) students, the same is not true for spelling. A systematic structured approach is the most successful, but for many students spelling is MUCH harder to master than reading.
palisadesk - what curricula do you like for spelling?
My brother is a good reader and a horrible speller, but his spelling mistakes always make phonetic sense. Phonics rules don't help a student remember the correct vowel to use in an unstressed syllable or which vowel team to use or which words have silent letters, etc.
I'm helping my oldest child prep for the Macy's spelling bee next month, and again her mistakes always make sense from a phonetic standpoint.
Following up on Glen's second comment, prediction isn't a 'strategy,' exactly --- at least, not in the sense I'm using the term. It's not something you do consciously, and it's not something you're taught to do.
It's what your brain does when you've reached automaticity. Brains predict.
I'll find some good passages in the Hawkins/Blakeslee book. I think I remember him saying that we can tell that a cat is a cat when we see just a bit of a cat's ear poking out from behind a sofa --- and that no one's been able to teach a computer to do this yet. We predict 'cat': we lay a bet that the part we see belongs to a cat.
Computers don't predict. Not yet. (I'll check that story.)
Of course, mulling this over got me to wondering whether it would ever be a good idea to teach perceptual prediction explicitly .... ??? I have no idea. I've been taking tennis lessons for ages now, and I'm constantly told to keep my eye on the ball, but my teacher never tells me to predict where the ball is going to end up, although that is what I actually need to do & what keeping my eye on the ball achieves.
He does tell me about blind spots, those points in the ball's trajectory when you actually can't see the ball whether your eye is on it or not.
He'll tell me "Keep your eye on the ball" and then follow up with the observation that it's actually not possible to keep your eye on the ball throughout the entire trajectory of the ball....
In any event, I got much better as soon as I realized I could tell from the way my teacher moved his tennis arm which way he was going to hit the ball and where it was going to come.
This is all related, I think, to the phenomenon of people not being able to perceive two the's in a row.
We see what we expect to see, and what we expect to see is a prediction.
Lots of good stuff at the MRC Cognition and Brain Sciences Unit.
Catherine, it is false that computers don't predict. False. I don't have a clear idea of what you mean, or think you mean, but computers use a whole variety of models to predict.
I have written countless prediction algorithms. Kalman filters, hidden Markov models, Bayesian belief nets, all of these and more are prediction models. Using them, computers predict the future location of a missile or tank, the likely price of soybeans tomorrow, the load on a server, the next word to be spoken or written, and far more complex things like the elements in a social network likely to commit a terrorist act, the type of RNA segment that will build a desired protein shape, and more.
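To give a concrete flavor of what "predict" means here, this is a minimal one-dimensional Kalman filter sketch in Python (the noise values are made up): it tracks an object's position from noisy measurements, making a prediction before each new reading arrives and then correcting it.

```python
def kalman_step(x, p, z, q=1.0, r=4.0):
    """One predict/update cycle of a 1-D Kalman filter.

    x, p -- current estimate and its variance
    z    -- new noisy measurement
    q, r -- process noise and measurement noise (assumed values)
    """
    # Predict: carry the state forward; uncertainty grows by q.
    x_pred, p_pred = x, p + q
    # Update: blend the prediction with the measurement, weighted by uncertainty.
    k = p_pred / (p_pred + r)          # Kalman gain
    x_new = x_pred + k * (z - x_pred)  # corrected estimate
    p_new = (1 - k) * p_pred           # reduced uncertainty
    return x_new, p_new

x, p = 0.0, 10.0                       # initial guess and its variance
for z in [1.2, 1.9, 3.1, 4.0]:         # noisy position measurements
    x, p = kalman_step(x, p, z)
    print(round(x, 2), round(p, 2))
```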
In grad school, I did research on human vs. computer recognition of degraded text. (I have the first published paper on using degraded text for CAPTCHAs, in fact, but as my husband reminds me, did not think to apply for a patent....)
This is something computers do poorly compared to humans (which is why they make good CAPTCHAs).
During my research I showed that humans could recognize text just from seeing the top line the characters are bounded by and what rises above it: the top of an a, the stick and top curve of a b, etc. With just 1 mm of 32-point font, humans could recognize well-known words. With no context at all.
But the "well known"-ness of the words mattered. Grad students could recognize "home" or "school", but only the grad students in my subfield would recognize "entangle" because that was a common word in quantum computing and in nothing else.
I was wondering whether Allison was going to get to this before I did. It's not that "AI never worked," it's that treating human cognition as formal logical reasoning didn't go very far and required a change of approach. It seems that the human brain can simulate formal logic, but that's not its native mode of operation. But formal logic doesn't have to be the approach taken by a computer, either, and these days it often isn't.
For the past twenty years or so, the emphasis in AI research has been the sort of statistical approaches Allison mentioned. These approaches take incomplete and inconsistent data and quickly reach conclusions (or "predictions") based on likelihood, which is what the brain is so good at.
Almost all handwriting recognition, voice recognition, language translation, web page finding, and Jeopardy winning is done by some combination of these statistical methods that learn by "processing lots of data," which is what the brain does, too. Instead of being programmed with logical rules, they learn patterns by processing lots of messy data, and they use their learning to make quick predictions from other messy, partial data.
Using such methods, modern computers certainly can recognize a cat. Forget cats, they can now recognize Allison's CAPTCHAs better than most kids and, soon, better than any human. CAPTCHAs, a great idea at the time, will become a historical curiosity.
Computers can predict so well it's sometimes scary. Have you ever had your credit card temporarily blocked until the card company verified that it was really you? A computer had been watching you live your life, silently predicting what you--the individual it had come to know--would and wouldn't do that day. It had been predicting your behavior so well that you rarely surprised it. That day was just an exception--it could happen to anyone. That computer knows what kind of cat you are and can predict some of your actions better than most of your friends can.
Have you ever had your credit card temporarily blocked until the card company verified that it was really you?
But....that's probably an example of what's different about computers versus brains, isn't it?
That 'prediction' was wrong -- and it was wrong in a significant way; it cost you time and nuisance.
The brain seems to make an awful lot of correct predictions.
Offhand, I don't think that "predictions" made on the basis of massive quantities of data are the same thing as brain predictions.
Brain predictions, as far as I can tell, are Bayesian (assuming I'm using that term correctly).
Brain predictions aren't made on the basis of huge quantities of data. Just a couple of data points produce predictions.
No, that's pretty typical of statistical decision making, whether done by machines or humans. You make quick evaluations of the meaning of a bit of new data, which are usually right but not always. You usually recognize the handwriting correctly but occasionally misread it. And if it's the doctor's handwriting you misread, you could be wrong in a very significant way.
In fact, the credit card computer may well be using a bayesian algorithm. Given its prior beliefs about you and some new information (what your card is being used for right now), it compares two hypotheses: this is Catherine (likely) behaving oddly (unlikely behavior for Catherine) vs. this is a thief (unlikely) behaving typically (likely behavior for thief). There will be some degree of belief in each, and if belief in the latter hypothesis exceeds a certain threshold, you'll be flagged. The computer isn't deciding that you are a thief; it's just suspicious. Humans do this all the time.
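Here is that comparison written out as a toy Bayes' rule calculation in Python (every number is hypothetical): the computer weighs "this is the cardholder behaving oddly" against "this is a thief behaving typically," and flags the card only when its suspicion crosses a threshold.

```python
# Priors: the cardholder is almost always the one using the card.
p_cardholder = 0.999
p_thief      = 0.001

# Likelihood of this particular purchase under each hypothesis.
p_purchase_given_cardholder = 0.01   # unusual behavior for the cardholder
p_purchase_given_thief      = 0.60   # typical behavior for a thief

# Bayes' rule: posterior probability that the purchase is fraudulent.
posterior_thief = (p_thief * p_purchase_given_thief) / (
    p_thief * p_purchase_given_thief
    + p_cardholder * p_purchase_given_cardholder
)
print(f"P(thief | purchase) = {posterior_thief:.2f}")   # about 0.06

# The computer isn't deciding you're a thief; it's just suspicious enough to check.
SUSPICION_THRESHOLD = 0.05
if posterior_thief > SUSPICION_THRESHOLD:
    print("flag the card for verification")
```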
And brain predictions ARE based on huge quantities of data. That's where the bayesian priors come from; that's what wires up the neurons. You can't predict the next word in an English utterance before you learn English, learn about the topic being written about, learn about people such as the writer, and so on, all of which require processing huge quantities of data.
But you're also right about the brain making predictions based on little data. Statistical inference methods, whether brain- or computer-based, use data in two ways: in training and in use. The former typically requires a lot of data so the latter won't. You have to observe your tennis instructor swing thousands of times before you can predict where the ball is likely to go with just a glimpse of a shift in weight or change in head posture.
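A toy next-word predictor makes the train-on-lots, use-a-little split easy to see (the tiny corpus here is obviously a stand-in for the years of language exposure a reader brings to the page): the counting step chews through a lot of data once, and after that a single word of context is enough to produce a prediction.

```python
from collections import Counter, defaultdict

# Training: process a (toy) corpus and count which word follows which.
corpus = "the dog chased the cat and the cat chased the mouse".split()
bigrams = defaultdict(Counter)
for w1, w2 in zip(corpus, corpus[1:]):
    bigrams[w1][w2] += 1

def predict_next(word):
    """Use: one word of context is all the input needed once the counts are learned."""
    following = bigrams.get(word)
    return following.most_common(1)[0][0] if following else None

print(predict_next("the"))      # -> 'cat' (the most frequent follower in this corpus)
print(predict_next("chased"))   # -> 'the'
```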
I'm glad Allison and Glen got to this first. There is no doubt that machine learning methods (which include Bayesian techniques and many other ways of generalizing from examples) make predictions all the time.
Unlike Catherine, I've had the credit card company correctly recognize that my credit card number had been misappropriated -- before the charges were posted. I believe that their false positive rate has been about 75% (that is, 3 false positives for each true positive) in my tiny sample. Given the rather small cost of dealing with a false positive and the huge cost of a real identity theft, I'm very glad they are making those predictions, even if they aren't making them as well as we might like.
I don't mean for this thread to be "let's gang up on Catherine" but I really hope you take what we're saying to change your conception of inference in both humans and machines.
Actually, human brains are probably worse at predicting than machines are for a whole host of problems. Automated cars drive with fewer errors than humans, because their predictions don't worsen with fatigue or distraction. Machine learning algorithms are better at predicting if a slide of cells is cancerous than human brains are, probably for similar reasons. Machine learning algorithms are much better than people at predicting fraudulent transactions because the algorithms aren't swayed by emotions that relate to trust, compassion, and embarrassment the way human brains are.
The brain makes incorrect predictions all the time, too -- and sometimes they are fatal. But we have other mechanisms in our employ to limit the severity of the effect.
Ever use too much force to lift something you expected to be heavy but was light? Or end up late because you misjudged how long something would take you? Humor is all about incorrect predictions.
I'm not sure what you mean by little data, either. Brains are predicting from the time they are in utero, and they are taking in data from then too. That's an enormous amount of data, and it wasn't a blank slate to start with.
This isn't ganging up. It's not a political dispute; it's an update on what's going on in an area of technology by people who work in it for someone who's clearly interested in it. For someone so interested in human learning, machine learning is a fascinating area to explore, and it has changed dramatically since the early rules-and-reasoning approaches to AI stalled 20-25 years ago.
And something just occurred to me. Hawkins's book, "On Intelligence," which I suspect you read, Catherine, might be interpreted to be saying what you're saying. In it, he was trying to make a distinction between traditional computer architectures and brain architecture and the differences in how they work when the traditional computer is programmed in a traditional way. Those differences are roughly as you describe and have often been referred to in popular literature, even by experts, as "computer-like" and "brain-like." It definitely makes it sound as though computers do one thing and brains do another categorically.
But electronic circuitry is so much faster than electrochemical circuitry (neurons) that you can use computers to simulate "brain-like" architectures and algorithms. Most software doesn't do so, for good reason. You wouldn't want your spreadsheet to get confused by similar-looking numbers. But some software (and even hardware) is "brain-like" and getting more powerful every year.
The book:
Artificial Intelligence: A Modern Approach
http://www.amazon.com/Artificial-Intelligence-Modern-Approach-Edition/dp/0136042597/ref=sr_1_1?ie=UTF8&qid=1344923254&sr=8-1&keywords=AIMA+norvig+russell
is 15 years old now, but is quite a good beginning textbook on how computers solve various problems. Its chapter on Bayesian inference is probably the best short technical intro you could find.
I'm a big fan of that one, too. One of the authors, Peter Norvig, treated me to lunch a couple of years ago. He's very involved in industrial computational linguistics (he's now Director of Research at Google) and has some opinions about Chomsky's ideas that are widely shared outside university linguistics departments. Readers who are interested in a "modern AI" response to Chomsky might enjoy reading this:
"On Chomsky and the Two Cultures of Statistical Learning"
http://norvig.com/chomsky.html
It's shorter and less technical than his book. The current (third) edition of Norvig's book is only a couple of years old and very up-to-date. It's a thousand-page undergrad textbook for a serious AI/machine learning class, though, so it is heavier reading. Even so, the book has a non-technical discussion/background section in each chapter. Reading just those chapter sections and skipping the tutorial and exercises would almost turn it into a popular science book.
I mean "Russell and Norvig's book," not just "Norvig's" book. I've never met Russell (I suspect you knew him at Berkeley, Allison), but I like his book and don't want to leave him out of the credits.
Glen, you're not at the Plex are you? I'd have loved to have met Peter! His LISP is so beautiful!
Yes, I did know Stuart. My path to grad school was more than circuitous, and I had no CS background while at MIT, so I took his ugrad AI class just after the book was first published. I also was dating one of his grad students a couple years before that, and he was social with his students, so I even had thanksgiving dinner with him once.
And since I'm pointing out odd connections and Small World effects, Sandra Blakeslee is a close friend of my mother in law, and her kids were childhood friends of my husband and his sister.
Having not read Hawkins' book, Glen, do you mean he meant traditional architectures meaning RAM models of computation? Or TMs? Or more low level hardware structures of real computers? I can see thinking "a RAM model or a Turing Machine isn't predicting." But neither is the neurotransmitter or mRNA. The system does so at a different level of abstraction.
No, I'm not at the Plex, but I've been doing (independent, industrial) research on cognitive models and second language learning at Stanford, and our paths crossed. He still loves Lisp (I do, too), but he's teaching his daughter Python instead (as I'm doing with my son, who's the same age as his daughter.) He takes a very business-like position on programming languages for production at Google, valuing maturity of total platform (implementation, libraries, tools) over any other factor. You mentioned once that Google wants to know if you are Python, Java, or C, and I had to laugh. I suspect that comes from him.
As for Hawkins, I read his book and put it in the garage about a decade ago, so I'll try my best to remember what it said. He wanted to make a clear distinction between the "computer way" and the "brain way" for a popular audience, where the former was the traditional, non-learning, non-inferencing, "data processing" approach that you might use to write your budgeting software in COBOL or whatever. I think he did talk about Turing Machines and von Neumann architectures (not RAM models specifically, that I recall) but it was more informal, as in, "You know the computer you use? THAT's what I mean." That sort of blends hardware and software together.
Then he contrasted that to the brain, which learns from data instead of being explicitly programmed, and which has emergent inferencing abilities due to its neural architecture. He probably said something along the lines of, "unlike computers, brains predict." Well, in the sense he meant it, yes. In the sense of what computers do these days, no, but he knows that.
He talked a lot about hierarchical processing, but I'm not sure he made the point that when you have levels of abstraction, the nature of the thing can differ from level to level. Neural hill climbing, for example, won't emerge naturally from a von Neumann architecture, but it can certainly be layered on top of it, making it computer-like at one level and brain-like at another.
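As a small illustration of that layering, here is a bare-bones (non-neural) hill-climbing loop in Python (the scoring function is an arbitrary example): the search behaves like iterative "guess, test, keep what's better," even though the hardware underneath is just executing ordinary sequential instructions.

```python
import random

def hill_climb(score, start, step=0.1, iters=1000):
    """Repeatedly nudge x at random and keep the nudge only if it scores higher."""
    x, best = start, score(start)
    for _ in range(iters):
        candidate = x + random.uniform(-step, step)
        s = score(candidate)
        if s > best:
            x, best = candidate, s
    return x, best

# Example: climb toward the peak of a simple bump centered at x = 2.
peak = lambda x: -(x - 2.0) ** 2
print(hill_climb(peak, start=0.0))   # ends up near (2.0, 0.0)
```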
Also, terms like memory and prediction don't mean the same thing at every level of abstraction. If you are multilingual and go for some time without using one of your languages, your brain will inhibit recall to keep it out of your way. Words that are remembered at the neural level (revealed by priming studies) are forgotten at the level of speech. While "memorizing" at the neural level provides a foundation for memorizing at the classroom level, they are not the same thing. Likewise, "prediction" at the level of reading may differ from the kind of "prediction" he's referring to at a lower level.
I liked the book a lot. It's just that, like most popular science, it had to omit a lot of things. That makes such books easier to understand and easier to misunderstand.