26-11-2012, 04:01 PM
Overview of Speech Recognition Technology
This thesis report gives an overview of speech recognition technology, the development of a speech recognition application, and its uses. The first section describes the speech recognition process, its applications in different sectors, its flaws, and the future of the technology. The later part of the report covers the speech recognition process in detail, along with the code for the software and how it works. Finally, the report concludes with the potential uses of the application and with further improvements and considerations.
Project Objective
To understand speech recognition and its fundamentals, its working and applications in different areas, and its implementation as a desktop application.
Development of software that can mainly be used for:
Speech Recognition
Speech Generation
Text Editing
Tool for operating a machine through voice
Project Scope
This project has speech recognition and speech synthesis capabilities. It is not a complete replacement for Notepad, but it is still a good text editor that can be operated by voice. The software can also open Windows-based applications such as Notepad, MS Paint, and more.
An Overview of Speech Recognition
Speech recognition is a technology that enables a computer to capture the words spoken by a human with the help of a microphone. These words are then recognized by a speech recognizer, and finally the system outputs the recognized words. The process of speech recognition consists of several steps, which are discussed one by one in the following sections.
History
The concept of speech recognition dates back to the 1940s. The first practical speech recognition program appeared in 1952 at Bell Labs, and it recognized a single digit in a noise-free environment.
• The 1940s and 1950s are considered the foundational period of speech recognition technology. In this period, work was done on the foundational paradigms of speech recognition, namely automation and information-theoretic models.
• In the 1960s, systems were able to recognize small vocabularies (on the order of 10-100 words) of isolated words, based on simple acoustic-phonetic properties of speech sounds. The key technologies developed during this decade were filter banks and time normalization methods.
• In the 1970s, medium vocabularies (on the order of 100-1000 words) were recognized using simple template-based pattern recognition methods.
• In the 1980s, large vocabularies (1000 words to unlimited) were used, and speech recognition problems were addressed with statistical methods and a wide range of networks for handling language structures. The key inventions of this era were the hidden Markov model (HMM) and the stochastic language model, which together enabled powerful new methods for handling the continuous speech recognition problem efficiently and with high performance.
• In the 1990s, the key technologies developed were methods for stochastic language understanding, statistical learning of acoustic and language models, and methods for implementing large-vocabulary speech understanding systems.
After five decades of research, speech recognition technology has finally entered the marketplace, benefiting users in a variety of ways. Designing a machine that truly functions like an intelligent human remains a major challenge going forward.
Types of Speech Recognition
Speech recognition systems can be divided into a number of classes based on the kind of utterances they are able to recognize and the vocabulary they have. A few of these classes are described below:
Isolated Speech
Isolated-speech recognizers usually require a pause between two utterances. This does not mean that the system accepts only a single word; rather, it requires one utterance at a time.
Connected Speech
Connected words, or connected speech, is similar to isolated speech but allows separate utterances with only a minimal pause between them.
Continuous Speech
Continuous speech allows the user to speak almost naturally; it is also called computer dictation.
Spontaneous Speech
At a basic level, spontaneous speech can be thought of as speech that is natural sounding and not rehearsed. An automatic speech recognition (ASR) system with spontaneous speech ability should be able to handle a variety of natural speech features such as words being run together, "ums" and "ahs", and even slight stutters.
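The practical difference between isolated and connected speech is how long a pause the system needs before it treats an utterance as finished. The following is only a minimal sketch of that idea and is not taken from the project's code: it assumes the audio has already been reduced to one energy value per frame, and the names split_utterances, threshold, and min_pause_frames are hypothetical, chosen for illustration.

```python
from typing import List, Sequence, Tuple


def split_utterances(
    energies: Sequence[float],
    threshold: float,
    min_pause_frames: int,
) -> List[Tuple[int, int]]:
    """Split per-frame energies into utterances separated by pauses.

    A frame counts as speech when its energy is >= threshold (assumed
    value, would be tuned per microphone). An utterance ends once at
    least min_pause_frames consecutive silent frames follow it.
    Returns (start, end) frame indices, with end exclusive.
    """
    utterances: List[Tuple[int, int]] = []
    start = None        # first frame of the utterance in progress
    last_voiced = None  # most recent speech frame of that utterance
    silence_run = 0     # consecutive silent frames seen so far

    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            last_voiced = i
            silence_run = 0
        elif start is not None:
            silence_run += 1
            if silence_run >= min_pause_frames:
                # The pause is long enough: close the utterance.
                utterances.append((start, last_voiced + 1))
                start = None
                silence_run = 0

    # Close an utterance that runs up to the end of the input.
    if start is not None:
        utterances.append((start, last_voiced + 1))
    return utterances


if __name__ == "__main__":
    frames = [0, 0, 5, 6, 0, 7, 0, 0, 0, 8, 9, 0]
    # Long required pause, in the style of isolated speech.
    print(split_utterances(frames, threshold=1, min_pause_frames=3))
    # Minimal required pause, in the style of connected speech.
    print(split_utterances(frames, threshold=1, min_pause_frames=1))
```

With a large min_pause_frames, the short gap inside the first burst is ignored and the speaker must pause clearly between utterances, as in isolated speech. With a small value, even a brief gap separates utterances, which is closer to connected speech. Continuous and spontaneous speech cannot be handled by pause detection alone, since words run together without any silence between them.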