
project maker

Artificial Intelligence For Speech Recognition


Abstract:

Artificial intelligence is behaviour of a machine which, if performed by a human being, would be called intelligent. It makes machines smarter and more useful, and is less expensive than natural intelligence. Natural language processing (NLP) refers to artificial intelligence methods of communicating with a computer in a natural language such as English. The main objective of an NLP program is to understand the input and initiate an action.
Artificial intelligence for speech recognition is the science and engineering of making intelligent machines, especially intelligent computer programs. Speech recognition (also known as automatic speech recognition (ASR) or computer speech recognition) is the process of converting a speech signal into a sequence of words by means of an algorithm implemented as a computer program. Speech recognition technology has made it possible for computers to follow human voice commands and understand human languages. The main goal of the speech recognition field is to develop techniques and systems that accept speech as input to a machine.
Speech is the primary means of communication between humans.
The design of a speech recognition system requires careful attention to the following issues: the definition of the various types of speech classes, speech representation, feature extraction techniques, speech classifiers, databases and performance evaluation. The problems existing in ASR, and the techniques constructed by various researchers to solve them, are presented in chronological order. The authors therefore hope that this work will be a contribution to the area of speech recognition. The objective of this review paper is to summarize and compare some of the well-known methods used in the various stages of a speech recognition system, and to identify research topics and applications that are at the forefront of this exciting and challenging field.
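To make the speech representation and feature extraction stage above concrete, here is a minimal sketch (not the method of any particular system described in this paper) that splits a waveform into short overlapping frames, applies a Hamming window, and computes log power-spectrum features with NumPy. The frame length, hop size and synthetic waveform are illustrative assumptions.

```python
import numpy as np

def frame_signal(signal, frame_len=400, hop=160):
    """Split a 1-D waveform into overlapping frames (e.g. 25 ms / 10 ms at 16 kHz)."""
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    return np.stack([signal[i * hop : i * hop + frame_len] for i in range(n_frames)])

def log_power_features(signal, frame_len=400, hop=160, n_fft=512):
    """Very simple spectral features: windowed frames -> log power spectrum."""
    frames = frame_signal(signal, frame_len, hop) * np.hamming(frame_len)
    spectrum = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2   # power spectrum per frame
    return np.log(spectrum + 1e-10)                        # log compression, avoid log(0)

if __name__ == "__main__":
    # A synthetic 1-second "utterance" at 16 kHz stands in for real recorded speech.
    sr = 16000
    t = np.arange(sr) / sr
    waveform = np.sin(2 * np.pi * 300 * t) + 0.05 * np.random.randn(sr)
    feats = log_power_features(waveform)
    print(feats.shape)   # (number of frames, n_fft // 2 + 1)
```

In a real system these frame-level features would then be passed to the speech classifier stage listed above.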




Introduction:

Artificial intelligence involves two basic ideas. First, it involves studying the thought processes of human beings. Second, it deals with representing those processes via machines (like computers, robots, etc.).
Speech recognition is one of the subfields of artificial intelligence; it translates the knowledge representation of the AI system so that it can communicate with the user. It can be described as the translation of speech, or spoken words, into text. Speech recognition draws on dialogue, semantics, syntax, the lexicon, morphology and phonetics to turn the input data into the knowledge representation of the AI system. The majority of current AI speech recognition is based on Hidden Markov Models, work done by Leonard Baum at Princeton in the late 1960s (Julia Hirschberg). Speech recognition can also be defined as translating an acoustic waveform of human speech into textual data. This task is complicated because speech contains intonation, accents and variations that are difficult to handle in a discrete system (Prof. Todd Austin, 2012). A Hidden Markov Model is a statistical model of a sequence of interrelated processes: a hidden chain of states is observed only indirectly through the sequence of outputs it emits, and the transition and emission probabilities are estimated as a model. Starting from an initial model, a better model can be reached iteratively, with the redundant calculations handled by a set of recursive equations (Phil Blunsom, 2004) (Juang and Rabiner, 2004).
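Since the paragraph above leans on Hidden Markov Models, the following is a minimal NumPy sketch of the standard HMM forward algorithm, which computes the probability of an observation sequence given a model. The tiny two-state model and the observation sequence are made-up illustrative values, not parameters of any real recognizer.

```python
import numpy as np

def forward(obs, start_prob, trans_prob, emit_prob):
    """HMM forward algorithm: P(observation sequence | model).

    obs        : sequence of observation symbol indices
    start_prob : (n_states,)            initial state probabilities
    trans_prob : (n_states, n_states)   trans_prob[i, j] = P(state j | state i)
    emit_prob  : (n_states, n_symbols)  emit_prob[i, k]  = P(symbol k | state i)
    """
    alpha = start_prob * emit_prob[:, obs[0]]                  # initialisation
    for symbol in obs[1:]:
        alpha = (alpha @ trans_prob) * emit_prob[:, symbol]    # recursion over time
    return alpha.sum()                                         # termination

if __name__ == "__main__":
    # Toy 2-state model with 2 observation symbols (values chosen only for illustration).
    start = np.array([0.6, 0.4])
    trans = np.array([[0.7, 0.3],
                      [0.4, 0.6]])
    emit  = np.array([[0.9, 0.1],
                      [0.2, 0.8]])
    print(forward([0, 1, 0], start, trans, emit))   # likelihood of the sequence 0, 1, 0
```

In an actual recognizer the observations would be acoustic feature vectors rather than discrete symbols, and the model parameters would be re-estimated iteratively as described above.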


Speaker independence:

The quality of speech varies from person to person, so it is difficult to build an electronic system that recognizes everyone's voice. By limiting the system to the voice of a single person, the system becomes not only simpler but also more reliable. The computer must be trained on the voice of that particular individual; such a system is called a speaker-dependent system.
Speaker-independent systems can be used by anybody and can recognize any voice, even though voice characteristics vary widely from one speaker to another. Most of these systems are costly and complex, and they have very limited vocabularies.
It is also important to consider the environment in which the speech recognition system has to work. The grammar used by the speaker and accepted by the system, the noise level and noise type, the position of the microphone, and the speed and manner of the user's speech are some factors that may affect the quality of speech recognition.
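One classical way to build the speaker-dependent system described above is simple template matching: the system stores feature sequences recorded from the single enrolled speaker and compares each new utterance against them. The sketch below uses dynamic time warping (DTW), a standard alignment technique not named in the text above, to score such comparisons; the random feature sequences are stand-ins for real acoustic features.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two feature sequences (frames x dims)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])      # local frame distance
            cost[i, j] = d + min(cost[i - 1, j],         # insertion
                                 cost[i, j - 1],         # deletion
                                 cost[i - 1, j - 1])     # match
    return cost[n, m]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    templates = {                       # utterances enrolled by the single speaker
        "yes": rng.normal(size=(40, 13)),
        "no":  rng.normal(size=(35, 13)),
    }
    test = rng.normal(size=(38, 13))    # a new utterance from the same speaker
    best = min(templates, key=lambda w: dtw_distance(test, templates[w]))
    print("recognised as:", best)
```

This enrollment-and-matching structure is also why such a system fails for other voices: the stored templates capture only one speaker's characteristics.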


Environmental influence:

Real applications demand that the performance of the recognition system be unaffected by changes in the environment. However, when a system is trained and tested under different conditions, the recognition rate drops unacceptably. We therefore need to account for the variability introduced when different microphones are used in training and testing, and to develop compensation procedures accordingly. Such care can significantly improve the accuracy of recognition systems that use desktop microphones.
Acoustical distortions can degrade the accuracy of recognition systems. Obstacles to robustness include additive noise from machinery, competing talkers, reverberation from surface reflections in a room, and spectral shaping by microphones and by the vocal tracts of individual speakers. These sources of distortion fall into two complementary classes: additive noise, and distortions resulting from the convolution of the speech signal with an unknown linear system.
Although relatively successful, these compensation methods depend on the assumption that the spectral estimates are independent across frequencies. Improved performance can be obtained with an MMSE estimator in which the correlation among frequencies is modeled explicitly.
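To make the additive-noise case concrete, here is a minimal sketch of plain spectral subtraction, a basic noise-compensation technique. It is not the MMSE estimator mentioned above, and the assumption that the first few frames contain only noise is an illustrative simplification.

```python
import numpy as np

def spectral_subtraction(noisy, frame_len=512, noise_frames=5, floor=0.01):
    """Very simple additive-noise suppression on non-overlapping frames.

    The average magnitude spectrum of the first `noise_frames` frames is taken
    as the noise estimate and subtracted from every frame; the phase of the
    noisy signal is kept unchanged.
    """
    n_frames = len(noisy) // frame_len
    frames = noisy[: n_frames * frame_len].reshape(n_frames, frame_len)
    spectra = np.fft.rfft(frames, axis=1)
    noise_mag = np.abs(spectra[:noise_frames]).mean(axis=0)       # noise estimate
    clean_mag = np.maximum(np.abs(spectra) - noise_mag,           # subtract noise
                           floor * np.abs(spectra))               # spectral floor
    cleaned = clean_mag * np.exp(1j * np.angle(spectra))          # reuse noisy phase
    return np.fft.irfft(cleaned, n=frame_len, axis=1).reshape(-1)

if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    speech_like = np.sin(2 * np.pi * 220 * t)                     # stand-in for speech
    noisy = speech_like + 0.3 * np.random.randn(sr)
    print(spectral_subtraction(noisy).shape)                      # enhanced signal
```

Spectral subtraction only addresses the additive-noise class; convolutional distortions such as microphone or channel effects call for other compensation methods, as noted above.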



Potential and Limitations in Speech Recognition:

In the real world, human-to-human communication is a highly effective interaction: among native speakers of a language there is essentially no limitation, apart from the language barrier faced by non-native speakers who have not mastered that particular language. In an AI system, however, speech recognition has many limitations when applied to human-computer interaction. Speech is an audio waveform that must be converted into textual data, and speech recognition is particularly useful in hands-busy, eyes-busy settings such as telephone-based services. The input must be accurate enough for the speech recognition system to understand it. Some of the limitations are: the user's input is not accurate; the processor lacks the execution power for its programs to interpret the user's input or voice command; the AI knowledge representation has a limited vocabulary (a vocabulary bank of around 200,000 words); and design problems or wrong algorithms prevent the system from accepting, understanding, interpreting and responding to the voice command. Another limitation is that the system cannot cope with variation in the voice itself: for example, after jogging the body is tired and the user's voice changes with that tiredness, so the system may not understand the voice command (Ben Shneiderman, 2000) (Ira Greenberg and Andrew Bate, 1999) (Young and Mihailidis, n.d.) (Bilmes, 1999) (Rabiner, 1997).
One main limitation in speech recognition concerns the system's handling of acoustic memory and its appreciation of prosody. The system has to access the knowledge representation in memory, load the data and return the action requested by the user. A key observation is that speech is slow for presenting information and interferes with cognitive thoughts and tasks; this is difficult to handle in a system, because spoken input is hard to review or edit. The system therefore faces several problems when it has to interact with human acoustic memory and processing. A human can type textual data or move the mouse pointer to command the system, but the system struggles when it has to build a cognitive model of spoken interaction (Ben Shneiderman, 2005).




Conclusion:

Speech recognition is a wide research area, with development ongoing in both academia and business. There are currently some limitations in neural-network-based speech recognition. Several speech APIs are currently available for development. In a neural network, each node holds the information to be passed on to the output node for processing; this means a hidden node can hold incomplete data, and a node may not process well because of a poor implementation. A recent development comes from Google Inc., which is building an Android-based voice recognition system that lets users supply voice input and have functions or tasks carried out based on it. Speech recognition is a long-term development effort, and the research area remains wide open. Accuracy is the most important factor in the input data with which the user interacts with the system, and in the hidden nodes the knowledge representation of the speech recognizer is important: the implementation of the algorithms must be accurate enough to turn the audio data into output.
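As a pointer to the "several speech APIs" mentioned above, the following is a minimal sketch assuming the third-party Python package `speech_recognition` and its Google Web Speech backend; the WAV file name is a placeholder. It is only one example of such an off-the-shelf API, not the system discussed in this paper.

```python
# Requires: pip install SpeechRecognition (and an internet connection for the Google backend).
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.AudioFile("command.wav") as source:      # placeholder WAV file of recorded speech
    audio = recognizer.record(source)            # read the whole file into an AudioData object

try:
    text = recognizer.recognize_google(audio)    # send the audio to the Google Web Speech API
    print("You said:", text)
except sr.UnknownValueError:
    print("Speech was unintelligible")           # the recognizer could not understand the audio
except sr.RequestError as err:
    print("API request failed:", err)
```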

This speech recognition technology has many uses. It helps physically challenged but skilled people, who can do their work with it without pushing any buttons. ASR technology is also used in military weapons and in research centres. Nowadays the technology is also used by CID officers, who use it to track criminal activities.