04-05-2013, 04:04 PM
Speech Compression and Enhancement using Wavelet Coders
Abstract
In communication systems there is always a trade-off
between compression ratio (CR) and signal quality when a
lossy compression technique is used. In digital telephony,
both speech compression and high speech quality are crucial,
because service must be provided to a huge number of users
over a very limited bandwidth. An effective speech coder is
therefore required to improve bandwidth efficiency and to
achieve a higher SNR. In recent years the discrete wavelet
transform (DWT) has emerged as a very effective technique
for image and speech signal analysis. This paper presents a
wavelet compression technique by which the compression
ratio and the signal-to-noise ratio of a speech signal can
be balanced. The results, simulated in MATLAB 7.10, show
the values of SNR and CR for male and female speech signals
using the Haar and db6 wavelets.
INTRODUCTION
Speech coding has always been, and still is, a major issue
in terms of balancing signal-to-noise ratio against
compression. The discrete wavelet transform has been used
successfully for image compression, but very little
attention has been paid to applying this technique to
speech coding. In telecommunications, speech compression
is a concern of the service provider rather than a direct
problem for users; users want a higher SNR. Linear
predictive coding (LPC) is the technique most often used
for speech coding and yields a higher (fixed) compression
ratio, but the wavelet technique gives a higher SNR than
LPC and a variable compression ratio. Speech compression
using the discrete wavelet transform [1] is very useful in
military applications and in digital cellular telephony,
where bandwidth is limited (a lower bit rate is required)
and high speech quality is essential, especially for
military purposes. The technique is also very useful in
video-conferencing [9], where much of the bandwidth is
already occupied by the video itself. Other applications
include storage of speech signals and transmission of
voice at a later time.
DISCRETE WAVELET TRANSFORM
The wavelet transform [1] is a new technique for analyzing
and compressing a speech signal. Its advantage is that it
retains both the time and the frequency aspects of the
signal and allows localized analysis of a larger signal.
The basic concept behind wavelets is to analyze a signal
according to scale. The first step is to choose a mother
wavelet; any signal can then be represented by translated
and scaled versions of it. The wavelet transform
concentrates a speech signal into a few coefficients, so
many of the resulting coefficients are zero or have very
small values. Data compression is done by treating
small-valued coefficients [12] as insignificant and
discarding them.
The discrete wavelet transform splits the signal into
high-frequency and low-frequency components. The output of
the high-pass filter gives the detail coefficients and the
output of the low-pass filter gives the approximation
coefficients. Approximation coefficients are the
high-scale, low-frequency components, while detail
coefficients are the low-scale, high-frequency components.
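
The filter-bank split described above can be sketched for the simplest case, the Haar wavelet. This is a minimal illustration in plain Python rather than the MATLAB toolbox used for the paper's results. The function names (haar_dwt, haar_idwt, haar_wavedec) and the sample values are assumptions made for this example, and the sign convention of the detail branch may differ from MATLAB's.

```python
import math

def haar_dwt(signal):
    """One level of the orthonormal Haar DWT on an even-length sequence."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    idx = range(0, len(signal), 2)
    # Low-pass branch: scaled pairwise sums -> approximation coefficients.
    approx = [(signal[k] + signal[k + 1]) * s for k in idx]
    # High-pass branch: scaled pairwise differences -> detail coefficients.
    detail = [(signal[k] - signal[k + 1]) * s for k in idx]
    return approx, detail

def haar_idwt(approx, detail):
    """Invert one Haar level by undoing the sums and differences."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * s)
        out.append((a - d) * s)
    return out

def haar_wavedec(signal, levels):
    """Repeat the split on the approximation branch `levels` times.

    Returns [cA_n, cD_n, ..., cD_1], coarsest band first.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        details.append(detail)
    return [approx] + details[::-1]

# Hypothetical sample values, not taken from the paper's recordings.
x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
cA, cD = haar_dwt(x)
x_rec = haar_idwt(cA, cD)
coeffs = haar_wavedec(x, 3)
```

Here cA holds the smooth, low-frequency trend and cD the sample-to-sample changes. The last detail coefficient is zero because the final two samples are equal, which is the kind of coefficient that compression discards.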
CHOICE OF WAVELET
The choice of wavelet is very important when the wavelet
compression technique is used to design a high-quality
speech coder. No wavelet can be considered better than
another in general, because each wavelet is unique and has
its own significance. Different kinds of wavelets make
different trade-offs between how compactly the basis
functions are localized in space and how smooth they are.
Wavelets are mostly classified by their number of vanishing
moments, and this number is important for a wavelet speech
compressor. Wavelets with a high number of vanishing
moments are not practical for real-time applications [9],
because the computational complexity of the discrete
wavelet transform is proportional to the number of
vanishing moments.
In speech compression our objective is also to improve
SNR, so the wavelet can be selected on the basis of how
well it conserves energy in the approximation
coefficients. With the Daubechies D20, D12, D10 or D8
wavelets, the level 1 approximation coefficients contain
96% of the signal energy. The even number after D gives
the number of filter coefficients, and the number of
vanishing moments is half of that number. In this paper we
have tested speech signals with the Haar wavelet and db6.
In the D notation, a wavelet with 6 coefficients and 3
vanishing moments is D6; MATLAB's dbN names count
vanishing moments instead, so MATLAB's db6 corresponds to
D12 (12 coefficients and 6 vanishing moments).
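
The energy criterion can be checked numerically. The sketch below measures the fraction of signal energy that falls in the level 1 approximation coefficients for the Haar wavelet only. The test frame is a synthetic stand-in (a strong 200 Hz tone plus a weaker 3 kHz component at an assumed 8 kHz sampling rate), not the paper's speech data, so the printed percentage illustrates the method and does not reproduce the 96% figure quoted for the Daubechies wavelets.

```python
import math

def haar_level1(signal):
    """Level 1 orthonormal Haar split into (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    idx = range(0, len(signal) - 1, 2)
    approx = [(signal[k] + signal[k + 1]) * s for k in idx]
    detail = [(signal[k] - signal[k + 1]) * s for k in idx]
    return approx, detail

def energy(values):
    """Sum of squared values."""
    return sum(v * v for v in values)

def approx_energy_fraction(signal):
    """Share of the coefficient energy held by the approximation band."""
    approx, detail = haar_level1(signal)
    return energy(approx) / (energy(approx) + energy(detail))

# Assumed sampling rate and synthetic frame, chosen only for illustration.
fs = 8000
frame = [
    math.sin(2 * math.pi * 200 * n / fs)
    + 0.2 * math.sin(2 * math.pi * 3000 * n / fs)
    for n in range(512)
]
fraction = approx_energy_fraction(frame)
print(f"{100 * fraction:.1f}% of the energy is in the level 1 approximation")
```

Running the same measurement with several candidate wavelets on real speech frames is one way to compare them under this criterion.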
THRESHOLDING OF COEFFICIENTS
When the signal is decomposed into approximation and
detail coefficients, we truncate the small-valued
coefficients, treating them as a non-essential part of the
signal. There are two types of thresholding for truncating
the coefficients: hard and soft thresholding. Hard
thresholding is used to compress the signal, while soft
thresholding is used to de-noise the speech signal. The
MATLAB function wdencmp performs wavelet coefficient
thresholding for both compression and de-noising. wdencmp
lets us choose between global thresholding and
level-dependent thresholding. Level-dependent thresholds
are calculated using the Birgé-Massart strategy [3].
According to this strategy, all the approximation
coefficients at the decomposition level j are kept. The
number of detail coefficients to be kept at level i, for i
from 1 to j, is given by the formula.
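
The difference between the two thresholding rules can be sketched in a few lines. This is plain Python, not a reimplementation of wdencmp. The helper names, the coefficient values and the threshold of 0.5 are assumptions chosen for illustration.

```python
def hard_threshold(coeffs, thr):
    """Keep coefficients whose magnitude exceeds thr; zero the rest."""
    return [c if abs(c) > thr else 0.0 for c in coeffs]

def soft_threshold(coeffs, thr):
    """Zero small coefficients and shrink the survivors toward zero by thr."""
    out = []
    for c in coeffs:
        mag = abs(c) - thr
        if mag <= 0.0:
            out.append(0.0)
        else:
            out.append(mag if c > 0 else -mag)
    return out

# Hypothetical detail coefficients at one decomposition level.
detail = [0.05, -1.8, 0.3, 2.4, -0.1, 0.0, -0.7, 1.1]
hard = hard_threshold(detail, 0.5)
soft = soft_threshold(detail, 0.5)
```

Both rules zero the same coefficients. Hard thresholding leaves the surviving values untouched, which preserves more of the signal for compression, while soft thresholding also reduces each survivor by the threshold, which tends to give a smoother de-noised result.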
HUFFMAN ENCODING
The quantized data produced by the quantization process
contain many repeated values, and storing them directly
would waste memory. Huffman coding is applied to remove
this redundancy [5]. After Huffman coding, we obtain the
final compressed signal.
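
To illustrate why repeated values compress well, the sketch below builds a Huffman code for a short run of quantized coefficients. It uses only the Python standard library. The function names and the sample data are assumptions for this example, and the paper's actual quantizer and bitstream format are not specified in this excerpt.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code {symbol: bitstring} from a sequence of symbols."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: partial_code}).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

def encode(symbols, code):
    """Concatenate the codewords of all symbols."""
    return "".join(code[s] for s in symbols)

def decode(bits, code):
    """Walk the bitstring and emit a symbol whenever a codeword completes."""
    inverse = {b: s for s, b in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out

# Hypothetical quantized coefficients: mostly zeros after thresholding.
quantized = [0, 0, 0, 0, 0, 0, 3, 0, 0, -2, 0, 0, 3, 0, 0, 1]
code = huffman_code(quantized)
bits = encode(quantized, code)
```

Because zero dominates the data, it receives a one-bit codeword, and the 16 symbols fit in fewer bits than the 32 that a fixed two-bit code for four distinct values would need.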
CONCLUSION
Compression of speech signals using wavelets enables us to
vary the compression ratio as required. This paper
confirms the inverse relation between CR and SNR and
applies the discrete wavelet transform technique to manage
that trade-off. We used the ‘Haar’ and ‘db6’ wavelets to
decompose, at level 6, two speech signals spoken by a male
and a female. The compressed and de-noised signals shown
above clearly display the difference from the original
speech signal, while the compression and de-noising
performance of the wavelet technique is shown in the
results table.
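
The CR versus SNR trade-off can be reproduced on a toy signal. The sketch below uses a one-level Haar transform with hard thresholding of the detail coefficients. SNR is computed in the usual way, as 10 log10 of signal energy over error energy. The CR definition used here (number of samples divided by number of nonzero coefficients retained) is an assumption, since the excerpt does not state the paper's exact definition. The sample values and thresholds are also hypothetical.

```python
import math

S = 1.0 / math.sqrt(2.0)

def haar_dwt(signal):
    """One-level orthonormal Haar split of an even-length sequence."""
    idx = range(0, len(signal) - 1, 2)
    approx = [(signal[k] + signal[k + 1]) * S for k in idx]
    detail = [(signal[k] - signal[k + 1]) * S for k in idx]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) * S, (a - d) * S))
    return out

def snr_db(original, reconstructed):
    """Signal-to-noise ratio in dB between a signal and its reconstruction."""
    signal_power = sum(x * x for x in original)
    noise_power = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    if noise_power == 0:
        return math.inf
    return 10.0 * math.log10(signal_power / noise_power)

def compress(signal, thr):
    """Hard-threshold the detail band; return (CR, SNR in dB)."""
    approx, detail = haar_dwt(signal)
    kept_detail = [d if abs(d) > thr else 0.0 for d in detail]
    kept = sum(1 for c in approx + kept_detail if c != 0.0)
    # Assumed CR definition: samples in / nonzero coefficients kept.
    cr = len(signal) / kept if kept else math.inf
    return cr, snr_db(signal, haar_idwt(approx, kept_detail))

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
cr_low, snr_high = compress(x, 0.0)   # keep every nonzero coefficient
cr_high, snr_low = compress(x, 2.0)   # discard all detail coefficients
```

Raising the threshold increases CR and lowers SNR, which is the inverse relation the conclusion describes. A multi-level decomposition with per-level thresholds would give finer control over where on that curve the coder operates.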