18-06-2014, 02:05 PM
Artificial Intelligence for Speech Recognition
Artificial Intelligence (AI)
Definition: The study and design of intelligent agents. The term is also used to describe a property of machines or programs.
Among the traits researchers hope machines will exhibit are reasoning, knowledge, planning, learning, communication, perception, and the ability to move and manipulate objects.
Speech Recognition
Speech recognition converts spoken words to machine-readable input. It is also called voice recognition.
Applications of speech recognition include:
voice dialing
content-based spoken audio search
speech-to-text processing
Audio-visual speech recognition also exists; it uses lip reading in addition to the audio signal.
Speech Recognition in Cellphones
The caller's words are captured and digitized by the speech-recognition system.
The digitized voice is split into individual frequency components, called spectral representations.
The components are translated into phonemes.
Complex models and algorithms determine a likely translation.
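The second step above (splitting digitized voice into frequency components) can be sketched as follows. This is only an illustration under stated assumptions, not the method any particular phone uses: the function names frame_signal and magnitude_spectrum, the frame size, and the hop size are all made up for this example, and a naive discrete Fourier transform stands in for the optimized FFT and filter-bank front ends that real systems typically use.

```python
import cmath
import math


def frame_signal(samples, frame_size, hop):
    """Cut a list of samples into overlapping fixed-length frames (hypothetical helper)."""
    last_start = len(samples) - frame_size
    return [samples[i:i + frame_size] for i in range(0, last_start + 1, hop)]


def magnitude_spectrum(frame):
    """Naive DFT: magnitude of each frequency bin from 0 up to half the frame length."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        # Correlate the frame with a complex sinusoid of k cycles per frame.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        spectrum.append(abs(coeff))
    return spectrum


# Assumed test signal: a pure tone with exactly 4 cycles per 32 samples.
samples = [math.sin(2 * math.pi * 4 * t / 32) for t in range(64)]
frames = frame_signal(samples, frame_size=32, hop=16)
spectra = [magnitude_spectrum(f) for f in frames]

# The strongest bin in every frame should be the tone's bin.
peak_bins = [max(range(len(s)), key=s.__getitem__) for s in spectra]
print(peak_bins)
```

Each frame's spectrum is the kind of "spectral representation" that later stages would map to phonemes; that mapping is not shown here.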
Performance of speech recognition systems
Performance is usually specified in terms of accuracy and speed. Accuracy is usually rated with the word error rate (WER), whereas speed is measured with the real-time factor.
Dictation machines can achieve very high performance in controlled conditions and require only a short period of training.
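The two measures can be sketched in a few lines. This assumes the common definitions, which the slides do not spell out: WER is the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference, divided by the number of reference words, and the real-time factor is processing time divided by audio duration. The function names are illustrative only. The sample sentences are the misrecognition example from the "Failures of Speech Recognition" part of this post.

```python
def word_edit_distance(reference, hypothesis):
    """Return (word-level edit distance, number of reference words)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] = edits needed to turn the first j hypothesis words
    # into the reference words processed so far.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitute = prev[j - 1] + (r != h)
            delete = prev[j] + 1      # reference word missing from hypothesis
            insert = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1], len(ref)


def word_error_rate(reference, hypothesis):
    errors, n_ref = word_edit_distance(reference, hypothesis)
    if n_ref == 0:
        raise ValueError("reference must contain at least one word")
    return errors / n_ref


def real_time_factor(processing_seconds, audio_seconds):
    """Values below 1.0 mean the recognizer runs faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


said = "I sure look forward to seeing you"
heard = "Assure look forward to seen in you"
print(word_edit_distance(said, heard))  # (4, 7): 4 word errors over 7 reference words
print(word_error_rate(said, heard))
```

Note that WER can exceed 1.0 when the recognizer inserts many extra words, so it is not a percentage of "words wrong" in the strict sense.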
HMM-based Speech Recognition
Hidden Markov models (HMMs) are statistical models that output a sequence of symbols or quantities.
HMMs are popular for two main reasons:
A speech signal can be viewed as a piecewise stationary signal.
They can be trained automatically and are simple and computationally feasible to use.
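To make "a statistical model that outputs a sequence of symbols" concrete, here is a minimal sketch of the forward algorithm, which computes how likely a given HMM is to produce a given symbol sequence. The two-state model and all of its probabilities are invented for illustration; a real recognizer would train them from speech data and would work with acoustic feature vectors rather than two discrete symbols.

```python
def forward_likelihood(observations, start_prob, trans_prob, emit_prob):
    """Probability that the HMM emits `observations` (forward algorithm)."""
    if not observations:
        raise ValueError("need at least one observation")
    n_states = len(start_prob)
    # alpha[s] = probability of the observations so far, ending in state s.
    alpha = [start_prob[s] * emit_prob[s][observations[0]]
             for s in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[p] * trans_prob[p][s] for p in range(n_states))
            * emit_prob[s][obs]
            for s in range(n_states)
        ]
    return sum(alpha)


# Made-up toy model: 2 hidden states, 2 output symbols (0 and 1).
start_prob = [0.6, 0.4]
trans_prob = [[0.7, 0.3],   # from state 0 to states 0, 1
              [0.4, 0.6]]   # from state 1 to states 0, 1
emit_prob = [[0.5, 0.5],    # state 0 emits symbol 0 or 1
             [0.1, 0.9]]    # state 1 emits symbol 0 or 1

print(forward_likelihood([0, 1, 1], start_prob, trans_prob, emit_prob))
```

In recognition, one would typically score the same observations against several such models (for example, one per word or phoneme) and prefer the model with the highest likelihood. The cost grows only linearly with sequence length, which is part of why HMMs are computationally feasible.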
DTW-based Speech Recognition
Dynamic time warping (DTW) is an algorithm for measuring similarity between two sequences that may vary in time or speed. It is a historical approach to speech recognition, now largely displaced by the HMM-based approach.
Similarities between speaking patterns can be detected even when one speaker is faster or slower than another. DTW has been applied to video, audio, and graphics; indeed, any data that can be turned into a linear representation can be analyzed with DTW.
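A minimal sketch of the classic DTW recurrence on two lists of numbers is shown below. The function name dtw_distance and the use of absolute difference as the local cost are choices made for this example; speech systems would compare feature vectors per frame and often add constraints on how far the warping path may stray.

```python
def dtw_distance(a, b):
    """Cost of the best alignment between sequences a and b (classic DTW)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning the first i items of a
    # with the first j items of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its item
                cost[i][j - 1],      # b advances, a repeats its item
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


slow = [1, 1, 2, 2, 3, 3]   # same pattern, "spoken" at half speed
fast = [1, 2, 3]
print(dtw_distance(fast, slow))      # 0.0: identical shape despite different speed
print(dtw_distance([0, 0], [1, 1]))  # 2.0
```

The first result is zero because DTW lets one item in the fast sequence align with several items in the slow one, which is exactly the tolerance to speaking speed described above.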
Failures of Speech Recognition
The computer has trouble with "sound-alike" errors. It's hard to get mad at the computer for not recognizing mumbling. But it can be frustrating when you think you are speaking clearly, and it just isn't good enough.
For example, when I said: "I sure look forward to seeing you"
The computer heard: "Assure look forward to seen in you"
When I repeated the same words with better enunciation, the computer got it right.
Conclusion
This presentation covers speech recognition in artificial intelligence systems. It is important to consider the environment in which a speech recognition system has to work.
The grammar used by the speaker and accepted by the system, the noise level and type, the position of the microphone, and the speed and manner of the user's speech are some of the factors that may affect the quality of speech recognition.