Analysis and Comparison of Two Important Compression Methods
Abstract
As the world evolves to rely more and more on computers and data storage, data compression becomes an increasingly useful and important tool. Most data used by humans contains a high amount of redundancy. Compression algorithms aim to eliminate this redundancy, while still preserving all the information contained in the original data. The end result is a reduction in storage requirements, in exchange for increased computation.
One of the most well-known and fundamental compression algorithms is the technique known as Huffman Coding, invented in 1952 by David Huffman. It forms the basis for innumerable theoretical and real-world compression schemes, and serves to prove several important properties of compression in general. Unfortunately, Huffman Coding has shortcomings. It always requires an integral number of bits to represent a given symbol; as a result, inefficiencies arise when the optimal number of bits would be fractional. A technique known as Arithmetic Coding has gained prominence as a replacement for Huffman Coding. It avoids the pitfalls of integral bit counts by representing the compressed data conceptually as a single fractional number in the interval [0, 1).
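To illustrate the fractional-bit problem concretely: information theory gives the optimal code length for a symbol of probability p as -log2 p bits. For a symbol with probability 0.9 (an illustrative value, not taken from the paper),

\[
\ell_{\text{opt}} = -\log_2 0.9 \approx 0.152 \text{ bits},
\]

yet Huffman Coding must still spend at least one whole bit on that symbol.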
This paper aims to present and analyze both algorithms.
Introduction
Overview of compression’s purpose and history
Compression is used in a wide-ranging variety of applications, from the transmission of science data collected on board NASA space probes, to the storage of digital music on personal computers. Almost as long as there has been digital data, there has been compression of that data.
The basic aim of compression is to sacrifice time and/or processing power in exchange for a reduction in storage requirements. This paper concerns itself in particular with lossless compression, which requires that the compressed data, when decompressed, restore an exact copy of the original.
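As a quick illustration of the lossless requirement, the following Python sketch shows a round trip that restores the input exactly (the standard-library zlib module and the sample input are chosen here purely for demonstration; the paper itself does not discuss zlib):

    import zlib

    original = b"abracadabra " * 20          # deliberately redundant input
    compressed = zlib.compress(original)
    restored = zlib.decompress(compressed)

    assert restored == original              # lossless: exact round trip
    print(len(original), "->", len(compressed), "bytes")

The assertion passing is the defining property of lossless compression; the size reduction comes from the redundancy in the input.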
Summary of significant principles in information theory
It is worth noting that there are very strict limits to lossless compression. In particular, no compressor is capable of compressing all data. If it compresses some input streams into smaller output streams, it is mathematically provable that there must exist other input streams which actually get larger when fed through the compressor. This follows from a simple counting (pigeonhole) argument: a lossless compressor must map distinct inputs to distinct outputs, and there are fewer short strings than long ones.
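In symbols, the counting goes as follows: the number of distinct binary strings strictly shorter than n bits is

\[
\sum_{k=0}^{n-1} 2^k = 2^n - 1 < 2^n,
\]

so the 2^n possible inputs of length n cannot all be mapped to distinct shorter outputs, and at least one input must grow.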
Thus lossless compression schemes attempt to compress “typical” data streams, a particular subset of all possible input streams. Just what constitutes “typical” depends on the application; hence the abundance of different compression methods.
Huffman Coding
Overview
Huffman Coding is a variable-length encoding which uses the frequency of symbols in an input string to determine the optimal number of bits for representing each symbol in the output. It requires an initial scan of the entire input stream to determine the symbol frequency distribution. From these frequencies it builds a tree representing an optimal prefix-free code, in which no codeword is a prefix of any other. The actual compression is then performed by simply applying the translation given by the prefix code tree.
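As a concrete sketch of this overview, here is a minimal Python implementation (the function name huffman_code, the heapq-based merging, and the sample input are illustrative choices, not taken from the paper):

    import heapq
    from collections import Counter

    def huffman_code(data):
        """Build a prefix code table {symbol: bitstring} for data.
        Minimal sketch; assumes at least two distinct symbols."""
        freq = Counter(data)
        # Heap entries are (weight, tiebreaker, tree); a tree is either
        # a leaf symbol or a (left, right) pair of subtrees.
        heap = [(w, i, sym) for i, (sym, w) in enumerate(freq.items())]
        heapq.heapify(heap)
        next_id = len(heap)
        while len(heap) > 1:
            w1, _, left = heapq.heappop(heap)     # pop the two lightest trees
            w2, _, right = heapq.heappop(heap)
            heapq.heappush(heap, (w1 + w2, next_id, (left, right)))  # merge them
            next_id += 1
        _, _, tree = heap[0]
        code = {}
        def walk(node, prefix):
            if isinstance(node, tuple):           # internal node
                walk(node[0], prefix + "0")
                walk(node[1], prefix + "1")
            else:                                 # leaf symbol
                code[node] = prefix
        walk(tree, "")
        return code

    print(huffman_code("abracadabra"))

Frequent symbols such as 'a' receive the shortest codewords, which is exactly the variable-length property described above.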
Precise algorithm description
There are two stages to the static Huffman Coding algorithm: the first is an analysis stage that determines the coding scheme; the second is the actual encoding stage.
The first step in determining the coding scheme is to make a single pass over the entire input stream, recording the frequency of each input symbol encountered. Each symbol is then formed into a trivial tree with a single node. The number of times a particular symbol occurs in the input stream will be referred to as its tree’s weight. One must then perform the following iterative algorithm: