What Is Voice Recognition?
By: Top • Research Paper • 3,833 Words • February 25, 2010 • 883 Views
First off, what is voice recognition technology?
Voice recognition is a computer application that lets people control a computer by speaking to it. In other words, rather than using a keyboard to communicate with the computer, the user speaks commands into a microphone (usually on a headset) that is connected to the computer. By speaking into the microphone, users can do two things. First, they can tell their computers to execute commands such as opening a document, saving changes, deleting a paragraph, or even moving the cursor, all without touching a key. Second, users can write using voice recognition in conjunction with a standard word processing program. When users speak into the microphone, their words appear on the computer screen in a word processing format, ready for revision and editing.
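The two uses described above, command and control versus dictation, can be illustrated with a minimal Python sketch. It assumes the speech engine has already turned audio into text; the command phrases, the action names, and the `Session` class are hypothetical and are not taken from any real product.

```python
from dataclasses import dataclass, field

# Hypothetical command phrases mapped to hypothetical action names.
# Real products define their own command vocabularies.
COMMANDS = {
    "open document": "OPEN",
    "save changes": "SAVE",
    "delete paragraph": "DELETE_PARAGRAPH",
}


@dataclass
class Session:
    """Routes each recognized utterance to a command or to dictated text."""

    actions: list = field(default_factory=list)
    text: list = field(default_factory=list)

    def handle(self, utterance: str) -> None:
        # Normalize case and spacing before looking up a command phrase.
        phrase = " ".join(utterance.lower().split())
        if phrase in COMMANDS:
            # Command and control: record the action to execute.
            self.actions.append(COMMANDS[phrase])
        else:
            # Dictation: keep the words for the word processor.
            self.text.append(utterance.strip())


session = Session()
for spoken in ["Open document", "Dear Mr. Smith", "save changes"]:
    session.handle(spoken)
```

In this sketch, anything that is not an exact command phrase is treated as dictation. An actual product would need some way to separate the two modes more reliably, which this example does not attempt.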
Voice recognition has gained tremendous popularity over the past few years. It has gone from imagination, to rumor, to reality, and this trend is not going to stop. Jeffrey C. Scott of Computer Shopper explains this in the passages quoted below.
“The ability to interact with computers by talking to them may help bridge the gap between human beings and machines. Once deemed a fantastical glimpse into a science-fiction-like future, voice-recognition technology has undergone several significant developments recently, and it now seems destined to move us toward a humanized computer interface. But that's for the future. At present, voice-recognition technology at its best lets us focus on our daily tasks and needs rather than on the computer's commands and syntax. Imagine inputting all your requests, numerous e-mail messages, routine correspondence and numerical data simply by telling them to the computer. Executives with limited typing skills, individuals who are physically challenged, and users who suffer from pain associated with the repetitive motion of typing can all benefit from voice-recognition applications. In fact, almost anyone can profit from this technology, though it may take some adjustment and a little patience to scale the learning curve.
Today's voice-recognition products are notably easier to use than those of the past. In addition, veterans of voice recognition should find that the new and improved dictation systems require substantially less training and customization.”
“The Virtues of Training: Training a voice-recognition system consists of a defined dictation session where you are prompted to say words, phrases, and sentences. This exercise, which can take anywhere from 20 minutes to an hour and a half, allows the computer to become accustomed to your pronunciation. After this session, the program calculates the results of the speech samples. When the entire process is complete, the speech engine is thoroughly tuned to your particular voice. You continually train the speech engine by correcting its recognition mistakes.
This review focuses on four voice-recognition programs for the Windows environment: DragonDictate for Windows 2.01, IBM VoiceType 3.0 for Windows 95, Kurzweil Voice for Windows 2.0, and Listen for Windows 95. These packages all have two basic functions--command and control, and dictation. DragonDictate, VoiceType, and Voice for Windows all seek control of the Windows operating system using dictation modes based on large vocabularies. Listen for Windows 95 has a more strictly defined dictation capability, focusing on number dictation and certain vocabulary words.
The packages were tested on a variety of Pentium and 486-equipped machines having from 8MB to 32MB of RAM. As voice-recognition packages are memory- and resource-intensive, superior performance was achieved by the machines having more RAM and faster microprocessors. To compare the programs, we dictated a variety of information, including sample phrases, letters, paragraphs, price lists, forms, and numerical data. And to measure the recognition engines' robustness, the test passages included groups of similar words, such as "there," "their," and "they're," or "to," "too," and "two." Also, a command and control obstacle course was set up to test the packages' ability to manipulate the Windows interface.
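One common way to score a dictation test like the one described above is the word error rate: the number of word substitutions, insertions, and deletions needed to turn the recognized text into the reference text, divided by the number of reference words. The review does not say which measure it used, so the Python sketch below is only an illustration, and the sample sentences are invented.

```python
def word_errors(reference: str, hypothesis: str) -> int:
    """Count word-level edits (substitute, insert, delete) between two texts."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # reference word was dropped
                            curr[j - 1] + 1,      # extra word was recognized
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word errors divided by the number of words in the reference."""
    total = len(reference.split())
    if total == 0:
        raise ValueError("reference text must contain at least one word")
    return word_errors(reference, hypothesis) / total


# Invented test passage built around the similar-sounding words in the review.
dictated = "they're going to take their two dogs there too"
recognized = "there going to take there to dogs there too"
errors = word_errors(dictated, recognized)
rate = word_error_rate(dictated, recognized)
```

Here the three confused words ("they're," "their," and "two") give three errors out of nine reference words. Homophones are a hard case for this kind of scoring because the engine must pick the right spelling from context rather than from sound alone.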
An examination of speech-to-text systems over the past several years reveals that prices have decreased dramatically, from thousands to merely hundreds of dollars. At the same time, usability and adaptability have greatly increased. These products have moved from being hardware-dependent to hardware-independent programs, no longer requiring a special or proprietary sound card. Many are also speaker-independent as well, meaning that the programs recognize with minimal training the speech of any user.
Voice-recognition software consists of two broad