21-08-2014, 04:43 PM
The JPEG Standard Seminar Report
Abstract
JPEG (Joint Photographic Experts Group) is an international compression standard for continuous-tone still images, both grayscale and color. The standard is designed to support a wide variety of applications for continuous-tone images. Because these applications have distinct requirements, the JPEG standard has two basic compression methods: the DCT-based method is specified for lossy compression, and the predictive method is specified for lossless compression. A simple lossy technique called baseline, which is a DCT-based method, is widely used today and is sufficient for a large number of applications. In this paper, we briefly introduce the JPEG standard and focus on the baseline method.
Introduction
The JPEG standard is a collaboration among the International Telecommunication Union (ITU), the International Organization for Standardization (ISO), and the International Electrotechnical Commission (IEC). Its official names are "ISO/IEC 10918-1: Digital compression and coding of continuous-tone still images" and "ITU-T Recommendation T.81". JPEG has the following modes of operation:
(a) Lossless mode: The image is encoded so as to guarantee exact recovery of every pixel of the original image, although the compression ratio is lower than in the lossy modes.
(b) Sequential mode: The image is compressed in a single left-to-right, top-to-bottom scan.
(c) Progressive mode: The image is compressed in multiple scans. When transmission time is long, the displayed image is refined progressively from indistinct to clear.
(d) Hierarchical mode: The image is compressed at multiple resolutions, so that a lower-resolution version can be accessed first without decompressing the image at full resolution.
The last three modes (b, c, and d) are DCT-based and lossy, because the limited precision of the DCT computation and the quantization process introduce distortion in the reconstructed image. The lossless mode uses a predictive method and has no quantization process. The hierarchical mode can optionally use either DCT-based coding or predictive coding. The mode most widely used in practice is the baseline JPEG system, which is based on the sequential mode, DCT-based coding, and Huffman coding for entropy encoding. Fig. 1 is the block diagram of the baseline system.
The JPEG standard defines only the syntax of the compressed bitstream. It does not specify anything about the file format. Another specification called JFIF (JPEG File Interchange Format), popularized by the software of the IJG (Independent JPEG Group), describes how to wrap a JPEG stream in a file that is suitable for storage or transmission on a computer.
2 Color Space Conversion and Downsampling
In order to achieve good compression performance, the correlation between the color components is first reduced by converting the RGB color space into a decorrelated color space. In baseline JPEG, an RGB image is first transformed into a luminance-chrominance color space such as YCbCr. The advantage of this conversion is that the luminance and chrominance components are largely decorrelated from each other. Moreover, the chrominance channels contain much redundant information and can be subsampled with little visible loss of quality in the reconstructed image. The transformation from RGB to YCbCr is a linear combination of the R, G, and B values.
The value Y = 0.299R + 0.587G + 0.114B is called the luminance. It is the value used by monochrome monitors to represent an RGB color. Physiologically, it represents the intensity of an RGB color as perceived by the eye. The formula acts like a weighted filter with a different weight for each spectral component: the eye is most sensitive to the green component, followed by the red component, and least sensitive to the blue component. The values Cb and Cr are called chrominance values and represent two coordinates in a system that measures the hue and saturation of the color. These values indicate how much blue and how much red are in that color, respectively. The inverse transformation from YCbCr to RGB is obtained by inverting this linear mapping.
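The conversion and the chrominance subsampling can be sketched in Python as follows. This is a minimal illustration, not the report's own formula: the Y row is the one quoted in the text, while the Cb and Cr coefficients and the offset of 128 are assumed to follow the usual JFIF convention for 8-bit samples. The 2x2 averaging filter in `subsample_2x2` is likewise only one common choice, and all function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one RGB pixel (0..255) to YCbCr, assuming JFIF-style coefficients."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 128
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr under the same assumed coefficients."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return r, g, b


def subsample_2x2(plane):
    """Halve a chrominance plane in both directions by averaging 2x2 groups.

    Assumes the plane (a list of rows) has even width and height.
    """
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]
```

A gray pixel such as (255, 255, 255) maps to a luminance of about 255 with both chrominance values at about 128, which shows why the chrominance planes of a typical image carry comparatively little information.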
3 Discrete Cosine Transform
To apply the DCT, the image is divided into 8×8 blocks of pixels. If the width or height of the original image is not divisible by 8, the encoder must pad the image so that it is. The 8×8 blocks are processed from left to right and from top to bottom.
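The block partitioning can be illustrated with a short Python sketch. The text does not say how the encoder should pad the image, so replicating the last column and the last row is an assumption here (a common choice), and the function names are hypothetical.

```python
def pad_to_multiple_of_8(plane):
    """Pad a plane (list of rows) by edge replication so both sides are multiples of 8."""
    height, width = len(plane), len(plane[0])
    pad_rows = (-height) % 8
    pad_cols = (-width) % 8
    # Repeat the last sample of every row, then repeat the last row.
    rows = [row + [row[-1]] * pad_cols for row in plane]
    rows += [list(rows[-1]) for _ in range(pad_rows)]
    return rows


def split_into_blocks(plane):
    """Return the 8x8 blocks in left-to-right, top-to-bottom order.

    Assumes the plane dimensions are already multiples of 8.
    """
    height, width = len(plane), len(plane[0])
    return [
        [row[x:x + 8] for row in plane[y:y + 8]]
        for y in range(0, height, 8)
        for x in range(0, width, 8)
    ]
```

For example, a plane of 9 rows by 10 columns is padded to 16×16 and yields four blocks, the second of which is the top-right block.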
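The purpose of the DCT is to transform pixel values into spatial frequencies. These spatial frequencies are closely related to the level of detail present in an image: high spatial frequencies correspond to high levels of detail, while lower frequencies correspond to lower levels of detail. The mathematical definition of the DCT is:
$$F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16} \qquad (3)$$
where f(x,y) is the pixel value at position (x,y) of the block, C(k) = 1/√2 for k = 0, and C(k) = 1 otherwise.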
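A direct, unoptimized evaluation of the 8×8 forward DCT might look like the sketch below. It assumes the usual normalization (1/4)C(u)C(v) with C(0) = 1/√2, and it omits the level shift by 128 that a JPEG encoder applies to 8-bit samples before the transform. Practical encoders use fast factorizations rather than this quadruple loop; the function name is hypothetical.

```python
import math


def dct_8x8(block):
    """Forward 2-D DCT of an 8x8 block (list of 8 rows of 8 numbers)."""

    def c(k):
        # Normalization factor: 1/sqrt(2) for the zero frequency, 1 otherwise.
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coeffs = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            total = 0.0
            for x in range(8):
                for y in range(8):
                    total += (
                        block[x][y]
                        * math.cos((2 * x + 1) * u * math.pi / 16)
                        * math.cos((2 * y + 1) * v * math.pi / 16)
                    )
            coeffs[u][v] = 0.25 * c(u) * c(v) * total
    return coeffs
```

For a flat block in which every pixel equals 100, only the zero-frequency coefficient is nonzero, and it equals 64 × 100 / 8 = 800. A block without detail therefore has no high-frequency content.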
4 Quantization
The transformed 8×8 block now consists of 64 DCT coefficients. The first coefficient F(0,0) is the DC component and the other 63 coefficients are the AC components. The DC component F(0,0) is essentially the sum of the 64 pixels in the input 8×8 pixel block multiplied by the scaling factor (1/4)C(0)C(0) = 1/8, as shown in equation 3 for F(u,v).
The next step in the compression process is to quantize the transformed coefficients. Each of the 64 DCT coefficients is uniformly quantized. The 64 quantization step-size parameters for uniform quantization of the 64 DCT coefficients form an 8×8 quantization matrix. Each element in the quantization matrix is an integer between 1 and 255. Each DCT coefficient F(u,v) is divided by the corresponding quantizer step-size parameter Q(u,v) in the quantization matrix and rounded to the nearest integer:
$$F_q(u,v) = \mathrm{round}\left(\frac{F(u,v)}{Q(u,v)}\right)$$
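The quantization step can be sketched as follows. The table `TOY_QTABLE` is a made-up example whose step sizes grow with frequency; it is not one of the tables suggested by the standard. Rounding ties away from zero is also an assumption, since the text only says "rounded to the nearest integer". All names are hypothetical.

```python
import math

# Illustrative table only: steps from 8 (DC) up to 64 (highest frequency).
TOY_QTABLE = [[8 + 4 * (u + v) for v in range(8)] for u in range(8)]


def round_half_away(x):
    """Round to the nearest integer, with ties going away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def quantize(coeffs, qtable):
    """Divide each DCT coefficient by its step size and round."""
    return [
        [round_half_away(coeffs[u][v] / qtable[u][v]) for v in range(8)]
        for u in range(8)
    ]


def dequantize(levels, qtable):
    """Decoder side: multiply each quantized level by its step size."""
    return [[levels[u][v] * qtable[u][v] for v in range(8)] for u in range(8)]
```

With this table, a coefficient of 20 at position (7,7) is divided by 64 and becomes 0, while a DC value of 800 is divided by 8 and survives as 100. This is the only lossy step: the decoder gets back multiples of the step size, not the original coefficients.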
6 Zero Run Length Coding of AC Coefficients
Now we have a quantized vector with many consecutive zeroes, which we can exploit by run-length coding the zeroes. Let's first consider the 63 AC coefficients of the 64-element quantized vector. For example, we have:
57, 45, 0, 0, 0, 0, 23, 0, -30, -16, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,..., 0
For each nonzero value, we emit a pair consisting of the number of consecutive zeroes preceding that value and the value itself. The trailing run of zeroes is replaced by an end-of-block (EOB) marker. The RLC (run length coding) is:
(0,57) ; (0,45) ; (4,23) ; (1,-30) ; (0,-16) ; (2,1) ; EOB
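The pairing rule can be sketched in Python as below. This is a simplified illustration: it ignores the limit of 15 zeroes per run that the baseline format imposes (longer runs need a special symbol), and the `EOB` constant and function names are hypothetical.

```python
EOB = "EOB"  # stand-in for the end-of-block symbol


def run_length_encode(ac):
    """Turn a list of AC coefficients into (zero_run, value) pairs plus EOB."""
    symbols = []
    run = 0
    for value in ac:
        if value == 0:
            run += 1
        else:
            symbols.append((run, value))
            run = 0
    if run > 0:
        # Everything after the last nonzero value is zero.
        symbols.append(EOB)
    return symbols


def run_length_decode(symbols, length=63):
    """Rebuild the AC coefficient list, padding with zeroes after EOB."""
    ac = []
    for symbol in symbols:
        if symbol == EOB:
            break
        run, value = symbol
        ac.extend([0] * run)
        ac.append(value)
    ac.extend([0] * (length - len(ac)))
    return ac


ac = [57, 45, 0, 0, 0, 0, 23, 0, -30, -16, 0, 0, 1] + [0] * 50
print(run_length_encode(ac))
# [(0, 57), (0, 45), (4, 23), (1, -30), (0, -16), (2, 1), 'EOB']
```

The 63 coefficients collapse to six pairs and one marker, and `run_length_decode` restores the original list exactly, so this stage is lossless.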
7 Difference Coding of DC Coefficients
Because the DC coefficient contains a large share of the block's energy, it usually has a much larger value than the AC coefficients, and the DC coefficients of adjacent blocks are closely correlated. The JPEG standard therefore encodes the difference between the DC coefficients of consecutive 8×8 blocks rather than their true values. The mathematical representation of the difference is:
$$\mathrm{Diff}_i = DC_i - DC_{i-1} \qquad (8)$$
and we set $DC_0 = 0$. The DC coefficient of the current block, $DC_i$, is then equal to $DC_{i-1} + \mathrm{Diff}_i$. So, in the JPEG file, the first coefficient of each block is actually the difference of DCs, as shown in Fig. 8. The difference is then Huffman encoded together with the AC coefficients.
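The difference coding of the DC coefficients amounts to a running subtraction on the encoder side and a running sum on the decoder side. The sketch below assumes the predictor starts at 0 for the first block, as stated above; the function names and the sample values are hypothetical.

```python
def dc_differences(dc_values):
    """Encoder side: replace each DC value by its difference from the previous one."""
    diffs = []
    previous = 0  # predictor for the first block
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs


def dc_reconstruct(diffs):
    """Decoder side: accumulate the differences to recover the DC values."""
    dc_values = []
    previous = 0
    for diff in diffs:
        previous += diff
        dc_values.append(previous)
    return dc_values


print(dc_differences([100, 104, 98, 98]))  # [100, 4, -6, 0]
```

Apart from the first block, the differences are small numbers, which take fewer bits to entropy-code than the DC values themselves.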
Conclusions
We have introduced the basic compression methods of the JPEG standard. Although this standard has become the most popular image format, it still has room for improvement. For example, the newer JPEG 2000 standard uses a wavelet-based compression method and can operate at higher compression ratios without generating the characteristic 'blocky and blurry' artifacts of the original DCT-based JPEG standard.