21-12-2012, 06:42 PM
MODULATION SPECTRUM EQUALIZATION FOR ROBUST SPEECH RECOGNITION
Introduction
The performance of speech recognition systems is very often degraded due to the mismatch between the acoustic conditions of the training and testing environments.
In this paper, we propose a new approach for modulation spectrum equalization in which the modulation spectra of noisy speech utterances are equalized to those of clean speech. The approach uses two techniques.
The first equalizes the cumulative distribution functions (CDFs) of the modulation spectra of clean and noisy speech, so that the differences between them are reduced.
The second equalizes the magnitude ratio of lower to higher components in the modulation spectrum.
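Both techniques operate on the modulation spectrum, which is commonly taken to be the Fourier transform of a feature's time trajectory over an utterance. The slides do not show how it is computed, so the following is only a minimal sketch under that assumption; the function name, the toy trajectory, and the frame rate of 100 frames per second are illustrative choices, not taken from the source.

```python
import numpy as np

def modulation_spectrum(trajectory):
    """Return the complex modulation spectrum of one feature trajectory.

    trajectory: 1-D array with one value per frame of a single feature
    coefficient (for example one cepstral coefficient) over an utterance.
    """
    trajectory = np.asarray(trajectory, dtype=float)
    # Real-input FFT along the time (frame) axis
    return np.fft.rfft(trajectory)

# Toy trajectory (assumption): 100 frames at 100 frames/s with a 4 Hz modulation
frame_rate = 100.0
t = np.arange(100) / frame_rate
trajectory = np.sin(2 * np.pi * 4.0 * t)

spectrum = modulation_spectrum(trajectory)
magnitudes = np.abs(spectrum)
# Modulation frequency (in Hz) associated with each spectral bin
mod_freqs = np.fft.rfftfreq(len(trajectory), d=1.0 / frame_rate)
peak_freq = mod_freqs[np.argmax(magnitudes)]
```

Here `magnitudes` is the quantity that the two equalization techniques would modify, and `peak_freq` locates the dominant modulation frequency of the toy signal.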
Spectral Histogram Equalization
We first calculate the cumulative distribution function (CDF) of the modulation spectrum magnitudes over all utterances in the clean training data of AURORA 2; this serves as the reference CDF.
For any test utterance, the CDF of its modulation spectrum magnitudes is obtained in the same way.
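The mapping formula itself did not survive in this copy of the slides. As a rough illustration, standard histogram equalization passes each test magnitude through the test CDF and then through the inverse of the reference CDF. The sketch below assumes that standard form, uses empirical CDFs, and keeps the original phase; the function names and the synthetic Rayleigh-distributed data are hypothetical.

```python
import numpy as np

def equalize_magnitudes(test_mags, ref_mags):
    """Map test magnitudes so their empirical CDF follows the reference CDF."""
    test_mags = np.asarray(test_mags, dtype=float)
    ref_sorted = np.sort(np.asarray(ref_mags, dtype=float))
    # Empirical CDF value of each test magnitude within its own utterance
    ranks = np.argsort(np.argsort(test_mags))
    cdf_test = (ranks + 0.5) / len(test_mags)
    # Inverse reference CDF, approximated by interpolating reference quantiles
    ref_cdf = (np.arange(len(ref_sorted)) + 0.5) / len(ref_sorted)
    return np.interp(cdf_test, ref_cdf, ref_sorted)

def equalize_spectrum(spectrum, ref_mags):
    """Replace the magnitudes of a complex spectrum, keeping its phase (assumption)."""
    new_mags = equalize_magnitudes(np.abs(spectrum), ref_mags)
    return new_mags * np.exp(1j * np.angle(spectrum))

# Synthetic stand-ins (assumption): pooled clean magnitudes and one noisy utterance
rng = np.random.default_rng(0)
ref_mags = rng.rayleigh(1.0, size=5000)
noisy_mags = rng.rayleigh(3.0, size=200)
equalized = equalize_magnitudes(noisy_mags, ref_mags)

noisy_spectrum = noisy_mags * np.exp(1j * rng.uniform(-np.pi, np.pi, size=200))
equalized_spectrum = equalize_spectrum(noisy_spectrum, ref_mags)
```

The mapping is monotonic, so the ordering of the magnitudes within the utterance is preserved while their distribution is pulled toward the clean reference.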
Magnitude Ratio
The figure shows that the mean magnitude ratio degrades as the SNR degrades, and is thus highly correlated with SNR.
It is therefore reasonable to equalize the magnitude ratio of a noisy utterance to a reference value obtained from clean training data.
Magnitude Ratio Equalization
We first calculate the average magnitude ratio over all utterances in the clean training data of AURORA 2 and use it as the reference value.
We then calculate the magnitude ratio for each test utterance.
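The defining equations are missing from this copy, so the following is only a sketch of one plausible reading: the magnitude ratio is taken as the sum of magnitudes below a cutoff bin divided by the sum at and above it, and equalization rescales the higher components until the ratio matches the reference. The cutoff bin, the choice to scale the higher rather than the lower components, and all names and numbers here are assumptions.

```python
import numpy as np

def magnitude_ratio(mags, cutoff_bin):
    """Ratio of summed lower-band to summed higher-band magnitudes."""
    mags = np.asarray(mags, dtype=float)
    return mags[:cutoff_bin].sum() / mags[cutoff_bin:].sum()

def equalize_magnitude_ratio(mags, ref_ratio, cutoff_bin):
    """Scale the higher-band magnitudes so the ratio equals ref_ratio."""
    mags = np.asarray(mags, dtype=float).copy()
    # Scaling the higher band by (test ratio / reference ratio) makes the
    # new ratio equal to the reference ratio
    scale = magnitude_ratio(mags, cutoff_bin) / ref_ratio
    mags[cutoff_bin:] *= scale
    return mags

cutoff_bin = 4  # hypothetical boundary between lower and higher components

# Reference value: average ratio over (toy) clean training utterances
clean_utterances = [
    np.array([8.0, 6.0, 4.0, 2.0, 2.0, 2.0, 2.0, 2.0]),  # ratio 20 / 8 = 2.5
    np.array([6.0, 6.0, 4.0, 4.0, 2.0, 2.0, 2.0, 2.0]),  # ratio 20 / 8 = 2.5
]
ref_ratio = np.mean([magnitude_ratio(u, cutoff_bin) for u in clean_utterances])

# Toy noisy test utterance: noise has raised the higher components
test_mags = np.array([4.0, 3.0, 2.0, 1.0, 2.0, 2.0, 2.0, 2.0])  # ratio 10 / 8
test_ratio = magnitude_ratio(test_mags, cutoff_bin)
equalized_mags = equalize_magnitude_ratio(test_mags, ref_ratio, cutoff_bin)
equalized_ratio = magnitude_ratio(equalized_mags, cutoff_bin)
```

In this toy case the higher components are halved, which restores the clean reference ratio while leaving the lower components untouched.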