02-03-2013, 12:03 PM
Rethinking of computation for future-generation, knowledge-rich speech recognition and understanding
Abstract
A new trend is emerging in the semiconductor industry: future computation speedups will likely come more from parallelism than from faster individual computing elements. Most algorithm designers for the current HMM-based speech recognition systems, whose recognition performance is significantly lower than that of humans, have not embraced this trend. This is partly attributable to the state-of-the-art sequential algorithms, which involve extremely clever schemes for speeding up single-processor performance that have been developed and matured over many years. This invited presentation advances two arguments. First, much more powerful speech systems in future generations will likely approach human performance with new architectures that integrate rich knowledge sources and overcome the reasonably well understood limitations of the current HMM-based systems. Second, the success of this endeavor will require a complete rethinking of computation issues, likely discarding the traditional HMM-centric sequential processing and embracing parallel computing in new architectures that mimic key aspects of the human speech processing system. This paper provides four case studies, drawn from recent influential work, that may shape the foundation of this potentially active research area.
SCOPE
This guide reviews the literature in the collections of the Library of Congress on speech recognition. Speech recognition is a process by which the elements of spoken language can be recognized and analyzed, and the linguistic message it contains transposed into a meaningful form so that a machine can respond correctly to spoken commands. The earliest attempts to devise systems for automatic speech recognition by machine were made in the 1950s. Today, speech recognition research is interdisciplinary, drawing upon work in fields as diverse as biology, computer science, electrical engineering, linguistics, mathematics, physics, and psychology. Within these disciplines, pertinent work is being done in the areas of acoustics, artificial intelligence, computer algorithms, information theory, linear algebra, linear system theory, pattern recognition, phonetics, physiology, probability theory, signal processing, and syntactic theory.
Applications of speech recognition have been made in office or business systems, as well as in manufacturing, medicine, and telecommunications, and usually concern the recognition and retrieval of information, such as voice-activated data entry; the control, operation, and monitoring of various machines and devices; call-processing functions; and the automation of services normally requiring human beings. While speech recognition has many short-term applications, it also has the potential to change daily life profoundly as free communication between man and machine becomes a reality.