13-04-2012, 10:57 AM
Design of a Speaker Recognition Code using MATLAB
INTRODUCTION
Development of speaker identification systems began as early as the 1960s with
exploration into voiceprint analysis, where characteristics of an individual’s voice were
thought to identify that individual much as a fingerprint does.
The early systems had many flaws, and research ensued to derive a more reliable method
of measuring the correlation between two sets of speech utterances.
APPROACH
This multifaceted design project can be divided into the following sections:
speech editing, speech degradation, speech enhancement, pitch analysis, formant analysis
and waveform comparison. The discussion that follows is organized along these
divisions.
SPEECH DEGRADATION
The file recorded with my faster speech (a18.wav) was located in the ordered list
of speakers. Speech degradation was performed by adding Gaussian noise, generated by
the MATLAB function randn(), to this file.
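
The degradation step can be sketched as follows. This is a minimal illustration, not the project's actual code: the synthetic tone stands in for a18.wav, and the sampling rate, the target signal-to-noise ratio (snr_db) and the fixed random seed are all assumptions, since the report excerpt does not state the noise level used.

```matlab
% Stand-in signal: a 1-second, 200 Hz tone at an assumed 8 kHz sampling rate.
% With the real recording one would load the file instead, e.g.
%   [x, fs] = audioread('a18.wav');
fs = 8000;
t  = (0:fs-1)' / fs;
x  = 0.5 * sin(2*pi*200*t);

% Assumed target signal-to-noise ratio in dB (not given in the source).
snr_db = 10;

rng(0);                                        % fixed seed so the run is repeatable
sig_power   = mean(x.^2);                      % average power of the clean signal
noise_power = sig_power / 10^(snr_db/10);      % noise power needed for the target SNR
noise = sqrt(noise_power) * randn(size(x));    % zero-mean Gaussian noise
y = x + noise;                                 % degraded signal

% SNR actually obtained for this particular noise realization.
measured_snr_db = 10*log10(sig_power / mean(noise.^2));
```

Scaling randn() by the square root of the desired noise power is what sets the noise level; without the scaling the noise has unit variance regardless of how loud the recording is.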
SPEECH ENHANCEMENT
The file recorded with my slower speech and noise in the background (a71.wav)
was located in the ordered list of speakers. A plot of this file is shown in Figure (2).
This signal was then converted to the frequency domain using a shifted FFT
and a correctly scaled frequency vector.
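
A shifted FFT with a matching frequency axis might look like the sketch below. The 1000 Hz test tone, the 8 kHz sampling rate and the length N are assumptions chosen so the result is easy to check; the real input would be the samples of a71.wav. The frequency vector shown is valid for even N only.

```matlab
% Stand-in signal: a 1000 Hz tone (assumed values, not from the report).
fs = 8000;                      % sampling rate in Hz
N  = 1024;                      % number of samples (even)
t  = (0:N-1)' / fs;
x  = sin(2*pi*1000*t);

% fftshift moves the zero-frequency bin to the centre of the spectrum.
X = fftshift(fft(x));

% Frequency vector in Hz that lines up with the shifted spectrum (even N):
% it runs from -fs/2 up to fs/2 - fs/N in steps of fs/N.
f = (-N/2 : N/2-1)' * (fs/N);

% Locate the strongest component; the spectrum of a real signal is
% symmetric, so take the absolute value of the frequency found.
[~, k]  = max(abs(X));
peak_hz = abs(f(k));

% plot(f, abs(X)) would then show peaks at -1000 Hz and +1000 Hz.
```

The scaling factor fs/N is what converts bin indices into hertz; plotting against the raw index would leave the spectrum unlabeled in physical units.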
WAVEFORM COMPARISON
Using the results and information learned from pitch and formant analysis, a
waveform comparison code was written. Speech waveform files can be characterized
based on various criteria.
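
As one illustration of such a criterion, the sketch below compares two waveforms by their estimated pitch. It is an assumption about how a comparison could be built, not the comparison code written for this project: the two synthetic signals stand in for recordings of two speakers, the 60 to 400 Hz search range is an assumed pitch range, and pitch is estimated by a plain autocorrelation peak search.

```matlab
% Two stand-in "speakers": harmonic signals with 125 Hz and 200 Hz pitch.
fs = 8000;
t  = (0:2047)' / fs;
a  = sin(2*pi*125*t) + 0.5*sin(2*pi*250*t);
b  = sin(2*pi*200*t) + 0.5*sin(2*pi*400*t);

% Lags (in samples) covering the assumed 60-400 Hz pitch range.
min_lag = floor(fs/400);
max_lag = ceil(fs/60);
lags    = min_lag:max_lag;

signals = {a, b};
pitch   = zeros(1, 2);
for s = 1:2
    x = signals{s};
    r = zeros(size(lags));
    for i = 1:numel(lags)
        L = lags(i);
        % Autocorrelation at lag L: overlap the signal with a delayed copy.
        r(i) = sum(x(1:end-L) .* x(1+L:end));
    end
    [~, idx]  = max(r);              % strongest lag is taken as the pitch period
    pitch(s)  = fs / lags(idx);      % convert period in samples to Hz
end

% A small difference would suggest similar pitch; a large one, different speakers.
pitch_diff = abs(pitch(1) - pitch(2));
```

Pitch alone is a weak identifier, which is presumably why the project also draws on formant analysis; a fuller comparison would combine several such measures.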
RESULTS
Results of speech editing are shown in Figure (5). The phrase “ECE-310,” which
occupies the second half of the first plot, has clearly been moved to the front of the
waveform in the second plot.
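
The kind of edit described here amounts to reordering segments of the sample vector. The sketch below shows the idea on a short stand-in vector; splitting at the midpoint is an assumption made for illustration, since in the actual recording the split point would be the sample index where the phrase begins.

```matlab
% Stand-in "waveform": ten samples, so the reordering is easy to see.
x = (1:10)';

% Assumed split point: the midpoint of the signal.
split = floor(numel(x) / 2);

% Move the second segment to the front and append the first segment.
y = [x(split+1:end); x(1:split)];
```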