INTEL RESEARCHERS TEACH COMPUTERS TO ‘READ LIPS’ TO IMPROVE ACCURACY OF SPEECH RECOGNITION SOFTWARE

INTEL DEVELOPER FORUM, Berlin, April 28, 2003 -- Intel Corporation researchers have released software under an open-source license that allows developers to build computers that see and “read lips” the way humans do to better understand spoken commands.

Today’s powerful speech recognition algorithms work well when background noise is eliminated or a well-tuned headset is used, but their accuracy rapidly degrades when applications have to cope with naturally noisy environments, such as public places. Combined with face detection algorithms from Intel’s OpenCV computer vision library, Audio Visual Speech Recognition (AVSR) software enables computers to detect a speaker’s face and track their mouth movements. Synchronizing video data with speech identification enables much more accurate speech recognition, enhancing a wide variety of speech applications in noisy environments. The AVSR software is part of Intel’s OpenCV computer vision library, a toolbox of more than 500 imaging functions that helps researchers develop computer vision applications.
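As a rough illustration of why a second, visual stream helps in noise, the sketch below combines per-word scores from an audio recognizer and a lip-reading recognizer with a simple weighted sum of log-likelihoods (often called late fusion). This is not the method used in Intel's AVSR software, which the release does not describe; the function names, the word list, the scores and the `audio_weight` parameter are all hypothetical and chosen only for illustration.

```python
import math


def fuse_scores(audio_logp, visual_logp, audio_weight):
    """Weighted late fusion of per-word log-likelihoods (illustrative only).

    audio_weight is a hypothetical reliability weight in [0, 1]:
    1.0 trusts only the audio stream, 0.0 trusts only the visual stream.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must be in [0, 1]")
    fused = {}
    # Only score words that both streams produced a hypothesis for.
    for word in audio_logp.keys() & visual_logp.keys():
        fused[word] = (audio_weight * audio_logp[word]
                       + (1.0 - audio_weight) * visual_logp[word])
    return fused


def recognize(audio_logp, visual_logp, audio_weight):
    """Return the word with the highest fused score."""
    fused = fuse_scores(audio_logp, visual_logp, audio_weight)
    return max(fused, key=fused.get)


# Made-up scores for a noisy utterance: the audio stream slightly prefers
# "start", while the mouth shape strongly suggests "stop".
audio = {"stop": math.log(0.40), "start": math.log(0.45), "open": math.log(0.15)}
visual = {"stop": math.log(0.70), "start": math.log(0.10), "open": math.log(0.20)}

audio_only = recognize(audio, visual, audio_weight=1.0)
audio_visual = recognize(audio, visual, audio_weight=0.5)
```

With these made-up numbers, the audio-only decision is "start", while weighting both streams equally yields "stop". In a real system the weight would typically be lowered as the acoustic noise level rises.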

“Intel wants to develop technology that allows computers to naturally interact with the world the way humans do. Human recognition is seldom based on a single type of information. We make decisions by combining information from a variety of sources,” said Justin Rattner, Intel senior fellow, Enterprise Platform Group and director of Intel’s Microprocessor Research Labs. “The addition of Audio/Video Speech Recognition code to Intel’s OpenCV library is certain to drive research and development in vision-assisted speech recognition.”

Accelerating Research Into New Uses
Faster microprocessors, falling camera prices and ten times more video capture bandwidth from technologies like USB2 are all enabling real-time computer vision algorithms to run on mainstream PCs. OpenCV is designed to increase innovation in this field by providing source code for a wide range of computer vision and imaging functions. Since its release in 2000, OpenCV has seen over 500,000 downloads of code and has attracted more than 5,000 registered members to its user group.

Developers are using OpenCV code in applications ranging from toys to industrial manufacturing. The software includes C source code for all of the library’s functionality and a royalty-free redistribution license. Information about AVSR can be found at http://www.intel.com/research/mrl/research/avcsr.htm. The OpenCV web site is located at http://www.intel.com/research/mrl/research/opencv/. Individuals interested in joining the user group can register at groups.yahoo.com and then subscribe by sending email to OpenCV at subscribe@yahoogroups.com.

Intel has developed a uniquely decentralized research model with more than 70 labs located around the world. The majority of the AVSR software team resides at Intel China Research Center in Beijing, China. Established in 1998, the center currently employs more than 40 computer research scientists and engineers working in research areas such as computer vision, media, Bayesian networks, compilers and tools.
