10-09-2014, 02:34 PM
A NOVEL METHOD OF COMPRESSING SPEECH WITH HIGHER BANDWIDTH EFFICIENCY
Abstract:
This paper presents a novel method of speech compression and transmission that considerably reduces the transmission bandwidth required for a speech signal. The scheme exploits the low-pass nature of speech. It applies equally well to any signal that is low pass in nature; speech is highlighted here because it is the signal most widely used in real-time communication.
In this method, the low-pass signal (speech) at the transmitter is divided into packets, each containing N samples. Of the N values per packet, only a smaller number, αN, are transmitted. Since α is less than unity, compression is achieved. Each packet of N samples is subjected to an N-point DFT. Because only low-pass signals are considered here, the number of significant values among the DFT samples is very limited, and transmitting these significant samples alone suffices for reliable transmission. The number of samples transmitted is determined by the parameter α.
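The transmitter side described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the helper names (`dft`, `split_into_packets`, `encode_packet`) are hypothetical, the "significant" coefficients are assumed to be the lowest-frequency DFT bins, and the test signal is an invented slow cosine rather than real speech.

```python
import cmath
import math

def dft(x):
    """Direct N-point DFT, O(N^2); adequate for a small illustrative packet."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def split_into_packets(signal, n):
    """Cut the signal into consecutive packets of n samples (a short tail is dropped)."""
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]

def encode_packet(samples, alpha):
    """Return the first round(alpha * N) DFT coefficients of one packet.

    Assumption: for a low-pass signal the significant coefficients are
    the lowest-frequency bins, so the rest are simply discarded.
    """
    n = len(samples)
    keep = max(1, round(alpha * n))
    return dft(samples)[:keep]

# Hypothetical low-pass test signal: two cycles of a cosine per 32 samples.
signal = [math.cos(2 * math.pi * 2 * t / 32) for t in range(64)]
packets = split_into_packets(signal, 32)            # N = 32
coded = [encode_packet(p, alpha=0.25) for p in packets]
```

With N = 32 and α = 0.25, each packet is reduced to 8 coefficients, and all the energy of this test signal sits in bin 2, which lies inside the retained range.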
The parameter α is almost independent of the source of the speech signal. Other methods of speech compression, by contrast, depend on specific characteristics of the source, such as pitch, for the algorithm to work.
The exact reverse process at the receiver reconstructs the samples. The receiver performs the N-point IDFT of the received signal after the necessary zero padding. Zero padding is needed because only αN of the N samples are transmitted, whereas the receiver again needs N samples to reconstruct the signal faithfully.
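A minimal sketch of the receiver step might look like the following. The abstract mentions only zero padding followed by the IDFT; restoring the conjugate-symmetric upper half of the spectrum is an assumption added here so that the reconstructed packet of a real-valued signal comes out real with the correct amplitude. The names `idft_real` and `decode_packet` are hypothetical.

```python
import cmath

def idft_real(spectrum):
    """Direct N-point inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def decode_packet(received, n):
    """Zero-pad the received low-frequency bins to n points and invert.

    Assumption: 'received' holds bins 0..len(received)-1 of a real signal,
    so bin n-k is filled with the conjugate of bin k before the IDFT.
    """
    if len(received) > n // 2 + 1:
        raise ValueError("expected at most n // 2 + 1 low-frequency bins")
    spectrum = [0j] * n                      # zero padding
    for k, c in enumerate(received):
        spectrum[k] = c
        if 0 < k < n - k:                    # skip DC and the Nyquist bin
            spectrum[n - k] = c.conjugate()  # mirror bin
    return idft_real(spectrum)

# Two received bins for an 8-sample packet: DC = 0, bin 1 = 4.
decoded = decode_packet([0j, 4 + 0j], 8)
```

Here the two received bins expand to one cycle of a unit-amplitude cosine over 8 samples.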
Hence this method is efficient: only a portion of the total number of samples is transmitted, which saves bandwidth. Since frequency samples are transmitted, the phase information must be transmitted as well. Here again, a property of signals and their spectra is exploited: the PHASE INFORMATION CAN BE EMBEDDED WITHIN THE MAGNITUDE SPECTRUM using simple mathematics, without heavy computation and without increasing the bandwidth.
The simulation results also show that the smaller the packet, the more faithful the reproduction of the received signal. This is a further advantage, because a small packet also reduces the overall processing delay: the transmitter has to wait until N samples are obtained before it can start transmitting. If N is small, the transmitter waits for a shorter time, and the smaller value of N also achieves a better reconstruction at the receiver.
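The waiting time mentioned above is simply N divided by the sampling rate. The sketch below works this out for a few packet sizes; the 8 kHz sampling rate is an assumption (a common telephone-speech rate, not stated in the abstract), and `packet_delay_ms` is a hypothetical helper.

```python
def packet_delay_ms(n, fs_hz):
    """Time the transmitter waits to collect n samples, in milliseconds."""
    return 1000.0 * n / fs_hz

# Assumed sampling rate of 8 kHz; packet sizes chosen only for illustration.
for n in (64, 256, 1024):
    print(n, "samples ->", packet_delay_ms(n, 8000.0), "ms")
```

At 8 kHz a 64-sample packet is ready after 8 ms, whereas a 1024-sample packet takes 128 ms to fill.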
Thus this scheme provides a more efficient method of speech compression, and it is also very easy to implement with available high-speed processors.
INTRODUCTION:
Today, rapid speech transmission has become critical in many applications. With end users demanding higher quality and bandwidth usage increasing, the on-demand delivery of audio and allied applications cannot be left behind.
In this paper, we present a new algorithm for speech compression using a frequency-domain approach.
The same method has also been used for the compression of static images.
Many schemes exist for transmitting a speech signal digitally:
• Sampling the signal in the time domain (PCM, DPCM, ADPCM, DM)
• Dividing the signal into number of sub-bands and encoding them separately (Adaptive sub-band coding)
• Encoding information about how the speech signal was produced by the human vocal system (Vocoders, RELP, CELP, LPC)
We introduce another scheme, which exploits the properties of speech signals to transmit at a lower bit rate and to reconstruct the signal with less distortion.
ADVANTAGES:
The method described above is advantageous chiefly because it reduces the transmission bandwidth.
• Since only ‘αN’ samples are transmitted, the minimum required bandwidth (Nyquist bandwidth) is reduced by a factor of 1/α, that is, to α times its original value.
• Since ‘N’ is small, the computation time of the FFT is also small. Successive samples therefore need not be queued in a buffer (memory), provided the computing time (of the order of N log N operations per packet) is less than ‘N times the sampling period’. The N-point DFT can be computed on high-speed processors with very little delay.
• This method requires no computation involving adjacent samples to make any decision; it simply collects the samples and computes the Fourier transform. It can therefore be implemented in real time without any delay between adjacent packets.
• This method of speech compression is speaker independent. It requires neither a speaker model nor a psychoacoustic model of the ear to make any decision, which keeps the method very simple.
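The first two points can be checked with a little arithmetic. The sketch below is illustrative only: it counts each transmitted coefficient as one sample, as the text does, uses the usual N·log2(N) estimate for a radix-2 FFT, and assumes an 8 kHz sampling rate and a processor sustaining one million such operations per second. All three function names and all numeric values are hypothetical.

```python
import math

def transmitted_rate(fs_hz, alpha):
    """Samples per second actually sent when only alpha*N of N are kept."""
    return alpha * fs_hz

def fft_ops(n):
    """Rough operation count for an N-point radix-2 FFT: N * log2(N)."""
    return n * math.log2(n)

def meets_real_time(n, fs_hz, ops_per_second):
    """True if one packet's FFT finishes before the next packet has filled."""
    compute_time = fft_ops(n) / ops_per_second
    packet_time = n / fs_hz                  # N times the sampling period
    return compute_time < packet_time

print(transmitted_rate(8000.0, 0.25))        # assumed fs and alpha
print(meets_real_time(256, 8000.0, 1e6))     # assumed processor speed
```

Under these assumptions the transmitted rate falls from 8000 to 2000 samples per second, and a 256-point FFT (about 2048 operations, roughly 2 ms) finishes well within the 32 ms it takes to collect the next packet.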
CONCLUSION:
Given the conceptual simplicity of the method and its design, as well as the advantages listed above, we conclude that the new algorithm lends itself to direct use. Its general applicability (it has also been tried successfully on static images) makes it all the more appealing.