14-07-2014, 10:55 AM
Voice Verification System Using Wavelets
Abstract
This paper presents a novel voice verification system using wavelet transforms. Conventional signal processing techniques assume the signal to be stationary and are ineffective at recognizing non-stationary signals such as voice signals. Voice signals, which are more dynamic, can be analyzed with far better accuracy using the wavelet transform. The developed system is a word-dependent voice verification system combining RASTA and LPC. The voice signal is first filtered with a special-purpose voice signal filter based on the Relative Spectral Algorithm (RASTA). The signal is then denoised and decomposed to derive the wavelet coefficients, on which a statistical computation is carried out. Further, the formants, or resonances, of the voice signal are detected using Linear Predictive Coding (LPC). With the statistical computation on the coefficients alone, the accuracy of verifying an individual's sample voice against his or her own voice is quite high (around 75% to 80%). The reliability of the verification is strengthened by combining the results from these two completely different aspects of the individual's voice. For voice comparison purposes, four out of five individuals are verified and the results show a higher percentage of accuracy. The accuracy of the system can be improved by incorporating advanced pattern recognition techniques such as the Hidden Markov Model (HMM).
INTRODUCTION
Speech is a very basic way for humans to convey information to one another. With a bandwidth of only 4 kHz, speech can convey information with the emotion of a human voice. People want to be able to hear someone’s voice from anywhere in the world, as if the person were in the same room. As a result, greater emphasis is being placed on the design of new and efficient speech coders for voice communication and transmission.
Today, applications of speech coding and compression are very numerous. Many involve the real-time coding of speech signals for use in mobile satellite communications, cellular telephony, and audio for videophones or video teleconferencing systems. Other applications include the storage of speech for speech synthesis and playback, or for the transmission of voice at a later time. Examples include voice mail systems, voice memo wristwatches, voice logging recorders and interactive PC software. Traditionally, speech coders are classified into two categories: waveform coders and analysis/synthesis vocoders (from "voice coders"). Waveform coders attempt to copy the actual shape of the signal produced by the microphone and its associated analogue circuits. A popular waveform coding technique is pulse code modulation (PCM).
WAVELETS
The fundamental idea behind wavelets is to analyse according to scale. The wavelet analysis procedure adopts a wavelet prototype function, called an analysing wavelet or mother wavelet. Any signal can then be represented by translated and scaled versions of the mother wavelet. Wavelet analysis is capable of revealing aspects of data that other signal analysis techniques, such as Fourier analysis, miss: trends, breakdown points, discontinuities in higher derivatives, and self-similarity. Furthermore, because it affords a different view of data from that presented by traditional techniques, it can compress or de-noise a signal without appreciable degradation.
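To make the idea of analysing "according to scale" concrete, the following is a minimal sketch of a multi-level discrete wavelet decomposition. It uses the Haar wavelet purely because its filters are simple enough to write out by hand; this is a substitution for illustration, not the wavelet used by the system described here. The function names `haar_step` and `haar_decompose` and the sample values are hypothetical.

```python
import math

def haar_step(signal):
    """One level of the Haar DWT: returns (approximation, detail)."""
    samples = list(signal)
    if len(samples) % 2:
        # Assumption: pad odd-length inputs by repeating the last sample.
        samples.append(samples[-1])
    scale = 1.0 / math.sqrt(2.0)
    approx = [(samples[i] + samples[i + 1]) * scale
              for i in range(0, len(samples), 2)]
    detail = [(samples[i] - samples[i + 1]) * scale
              for i in range(0, len(samples), 2)]
    return approx, detail

def haar_decompose(signal, levels):
    """Return [cA_n, cD_n, ..., cD_1] for an n-level decomposition."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.insert(0, detail)  # coarsest detail first
    return [approx] + details

# Hypothetical 8-sample signal, decomposed to level two.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
cA2, cD2, cD1 = haar_decompose(signal, 2)
print("level-2 approximation:", cA2)
print("level-2 detail:       ", cD2)
print("level-1 detail:       ", cD1)
```

Each level halves the number of approximation coefficients, so the approximation captures the slow trend while the detail coefficients capture changes at progressively coarser scales. Because the Haar transform is orthonormal, the total energy of the coefficients equals that of the input signal.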
VOICE SIGNAL ANALYSIS
RASTA, the Relative Spectral Algorithm, is a technique developed as the initial stage of voice recognition. The method applies a band-pass filter to the energy in each frequency sub-band in order to smooth over short-term noise variations and to remove any constant offset. Stationary noises are often detected in voice signals. These are noises that are present for the full duration of a signal and do not diminish; their properties do not change over time. The assumption that needs to be made is that the noise varies slowly with respect to speech. This makes RASTA well suited to the initial stages of voice signal filtering, where it removes stationary noises. The stationary noises that are identified lie in the frequency range of 1 Hz to 100 Hz.
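The band-pass step described above can be sketched as a small recursive filter applied to the energy trajectory of a single sub-band. The coefficients below are the ones commonly cited for the RASTA filter (a five-tap numerator whose taps sum to zero, plus a single pole), but they are an assumption here since the text does not list them; published descriptions use pole values of roughly 0.94 to 0.98. The function name `rasta_filter`, the 100 frames-per-second rate, and the test trajectories are hypothetical.

```python
import math

# Commonly cited RASTA numerator taps (assumed; not given in the text).
# They sum to zero, so a constant offset is removed.
RASTA_NUMERATOR = [0.2, 0.1, 0.0, -0.1, -0.2]

def rasta_filter(trajectory, pole=0.94):
    """Band-pass filter the energy trajectory of one frequency sub-band.

    Implements y[n] = sum_k b[k] * x[n-k] + pole * y[n-1].
    """
    output = []
    previous = 0.0
    for n in range(len(trajectory)):
        acc = 0.0
        for k, tap in enumerate(RASTA_NUMERATOR):
            if n - k >= 0:
                acc += tap * trajectory[n - k]
        previous = acc + pole * previous
        output.append(previous)
    return output

frame_rate = 100.0  # frames per second (assumed)

# A constant offset stands in for a stationary component.
constant_offset = [5.0] * 600
# A 4 Hz modulation stands in for speech-rate energy changes.
speech_like = [math.sin(2 * math.pi * 4.0 * n / frame_rate)
               for n in range(600)]

offset_out = rasta_filter(constant_offset)
speech_out = rasta_filter(speech_like)
print("constant offset after filtering:", offset_out[-1])
print("peak of 4 Hz modulation after filtering:",
      max(abs(v) for v in speech_out[-100:]))
```

In this sketch the constant trajectory decays towards zero once the start-up transient has passed, while the slowly modulated trajectory passes through largely intact, which is the behaviour the paragraph attributes to RASTA.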
SYSTEM IMPLEMENTATION
To implement the system, the voice signal is decomposed into its approximation and detail components. The approximation and detail coefficients extracted in this way are then used to carry out the recognition.
The statistical calculations that are carried out are the mean, standard deviation, variance and mean absolute deviation. The wavelet used for the system is the symlet 7 wavelet, as this wavelet correlates very closely with the voice signal. This was determined through extensive trial and error. The coefficients extracted from the wavelet decomposition are the second-level coefficients, as these contain most of the correlated data of the voice signal. The higher levels contain very little data, which makes them unusable for the recognition phase. Hence, for the initial system implementation, the level-two coefficients are used. The coefficients are further thresholded to remove the low-correlation values, and using these coefficients the statistical computation is carried out.
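The four statistics listed above can be sketched as follows for a list of already extracted coefficients. Several details are assumptions, since the text does not specify them: thresholding is shown as hard thresholding on coefficient magnitude, the variance and standard deviation use the sample (n - 1) form, and the mean absolute deviation is taken about the mean. The names `hard_threshold` and `coefficient_features`, the threshold value, and the coefficient values are hypothetical.

```python
import statistics

def hard_threshold(coeffs, limit):
    """Zero out coefficients whose magnitude is below `limit` (assumed rule)."""
    return [c if abs(c) >= limit else 0.0 for c in coeffs]

def coefficient_features(coeffs):
    """Mean, standard deviation, variance and mean absolute deviation."""
    mean = statistics.mean(coeffs)
    variance = statistics.variance(coeffs, mean)  # sample variance (n - 1)
    return {
        "mean": mean,
        "std": variance ** 0.5,
        "variance": variance,
        "mean_abs_dev": sum(abs(c - mean) for c in coeffs) / len(coeffs),
    }

# Hypothetical level-two coefficients, for illustration only.
level2_coeffs = [-6.0, 2.0, 0.5, -0.25, 4.0, -3.0]
kept = hard_threshold(level2_coeffs, limit=1.0)
features = coefficient_features(kept)
for name, value in features.items():
    print(f"{name}: {value:.4f}")
```

A feature vector of this kind, computed for a reference recording and for a test recording of the same word, could then be compared to decide whether the two come from the same speaker; the comparison rule itself is not described in the text.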
CONCLUSION
The Voice Recognition Using Wavelet Feature Extraction system employs wavelets in voice recognition to study the dynamic properties and characteristics of the voice signal. This is carried out by estimating the formants and detecting the pitch of the voice signal using LPC (Linear Predictive Coding). The system that is developed is a word-dependent voice verification system, used to verify the identity of an individual from his or her own voice signal using statistical computation, formant estimation and wavelet energy. A GUI is built to make it easier for the user to observe the step-by-step process that takes place in the wavelet transform. Using fifty preloaded voice signals from five individuals, the verification tests have been carried out and an accuracy rate of approximately 80% has been achieved. The system can be enhanced further by using advanced pattern recognition techniques such as a Neural Network or a Hidden Markov Model (HMM) for a more accurate and efficient system.