Why do computers still struggle with voice recognition?
I was hoping this would be covered in one of the chapters, but I didn’t see it there. I don’t know the exact answer, but I can hypothesize from what I have read in the chapters on language.
First, there are about 46 phonemes and about 600,000 words in the English language. Storing that much is a large task in and of itself: a computer would have to store the phonemic sounds for all 600,000 words.
Second, a computer records only a continuous speech stream, not the cognitive representation of words and breaks that we perceive. In this way, the computer faces the same obstacle that a baby does: how to decode a speech stream into individual words. Evidence shows that we use phonemic and syntactic cues to determine where one word ends and the next begins. So the computer has to store the phonemic and syntactic information that divides a speech stream into individual words, in addition to the words and phonemes themselves.
Third, there is the obstacle of context. As the text makes clear, much of what we understand when we read is inferred from context and secondary memory. So a computer would need to store not only all 46 phonemes in any variety of combinations, 600,000 words in any variety of combinations, and the phonemic and syntactic information that divides a speech stream into words, but also a working memory of everyday human life in order to infer context and situation. After reading this chapter, it is not hard to see why Microsoft Reader sounds so funny at times.
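To make the segmentation obstacle concrete, here is a minimal Python sketch. It rests on assumptions of mine rather than anything in the text: letters stand in for phonemes, the five-word lexicon is made up, and the function name `segmentations` is hypothetical. It is not how a real speech recognizer works; it only shows that a single unbroken stream can be split into valid words in more than one way.

```python
from functools import lru_cache

# Hypothetical toy lexicon (an assumption for illustration only);
# a real system would need entries for hundreds of thousands of words.
LEXICON = frozenset({"no", "now", "here", "where", "nowhere"})


def segmentations(stream, lexicon=LEXICON):
    """Return every way to split an unbroken stream into lexicon words."""

    @lru_cache(maxsize=None)
    def split_from(start):
        # Reaching the end of the stream means one complete split was found.
        if start == len(stream):
            return [[]]
        results = []
        # Try every possible word boundary after position `start`.
        for end in range(start + 1, len(stream) + 1):
            word = stream[start:end]
            if word in lexicon:
                for rest in split_from(end):
                    results.append([word] + rest)
        return results

    return split_from(0)


if __name__ == "__main__":
    # Letters stand in for phonemes; the stream contains no word breaks.
    for words in segmentations("nowhere"):
        print(" ".join(words))
```

With this toy lexicon, the stream "nowhere" can be read as "no where", "now here", or "nowhere". The lexicon alone cannot say which reading is right, which is where the syntactic cues and context described above would have to come in.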
Willingham, D. T. (2007). Cognition: The thinking animal. New York, NY: Pearson Prentice Hall.