An Efficient Compression Algorithm (ECA) for Text Data
Abstract—Data compression algorithms are used to reduce redundancy and the storage requirements of data. Data compression is also an efficient way to reduce communication costs by using the available bandwidth effectively. Over the last decade we have seen an unprecedented explosion in the amount of digital data transmitted via the Internet in the form of text, images, video, sound, computer programs, etc. If this trend continues, it will be necessary to develop compression algorithms that make the most effective use of available network bandwidth by compressing data to the greatest possible degree. Along with this, it will also be important to consider the security of compressed data transmitted over the Internet, since most of the text data sent over the Internet is highly vulnerable to attack. We therefore present an intelligent, reversible transformation technique that can be applied to source text to improve a compression algorithm's ability to compress it, while also offering a sufficient level of security for the transmitted data.
Index Terms—ECA, Data Compression, Dictionary-Based Encoding, Lossless Compression.
I. INTRODUCTION
Over the last decade we have seen an unprecedented explosion in the amount of text data transmitted via the Internet, in the form of digital libraries, search engines, etc. Text data accounts for about 45% of total Internet traffic, so data needs to be compressed to reduce that traffic and allow a larger amount of information to be transmitted. A number of sophisticated algorithms have been proposed for lossless text compression, such as Huffman encoding, arithmetic encoding, the Lempel-Ziv (LZ) family, Dynamic Markov Compression (DMC), Prediction by Partial Matching (PPM), and Burrows-Wheeler Transform (BWT) based algorithms. However, none of these algorithms achieves the best possible compression ratio in every case.
Michael Burrows and David Wheeler introduced the BWT, a transformation function that opened the door to some revolutionary new data compression techniques. The BWT is performed on an entire block of data at once and transforms it into a format that is extremely well suited for compression.
The block-sorting algorithm they developed works by applying a reversible transformation to a block of input text. The transformation does not itself compress the data, but reorders it to make it easy to compress with simple algorithms such as move-to-front encoding. The basic philosophy of secure compression is to preprocess the text and transform it into some intermediate form which can be compressed with better efficiency, and which exploits the natural redundancy of the language in making the transformation.
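To make the transform concrete, the following is a minimal Python sketch (not the authors' implementation) of a naive BWT, its inverse, and move-to-front encoding; a practical implementation would use suffix sorting rather than materializing every rotation of the block.

    def bwt_transform(block):
        # Form every rotation of the block, sort them, and emit the
        # last column plus the row index of the original block.
        rotations = sorted(block[i:] + block[:i] for i in range(len(block)))
        last_column = "".join(rot[-1] for rot in rotations)
        return last_column, rotations.index(block)

    def bwt_inverse(last_column, index):
        # Rebuild the sorted rotation table by repeatedly prepending
        # the last column and re-sorting; row `index` is the original.
        table = [""] * len(last_column)
        for _ in range(len(last_column)):
            table = sorted(last_column[i] + table[i]
                           for i in range(len(last_column)))
        return table[index]

    def mtf_encode(text, alphabet):
        # Move-to-front: each symbol is coded as its current position
        # in the list, then moved to the front, so the character runs
        # the BWT produces become runs of small numbers and zeros.
        symbols = list(alphabet)
        codes = []
        for ch in text:
            i = symbols.index(ch)
            codes.append(i)
            symbols.insert(0, symbols.pop(i))
        return codes

    last, idx = bwt_transform("banana")        # -> ("nnbaaa", 3)
    assert bwt_inverse(last, idx) == "banana"  # reversible, as claimed
    print(mtf_encode(last, "abn"))             # -> [2, 0, 2, 2, 0, 0]

Note how the transform output "nnbaaa" groups like characters together, and the move-to-front codes contain the runs of zeros that the later stages compress well.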
Most of today's familiar lossless compression algorithms operate in streaming mode, reading a single byte or a few bytes at a time. But with this new transform, we want to operate on the largest chunks of data possible. Since the BWT operates on data in memory, you may encounter files too big to process in one fell swoop. In these cases, the file must be split up and processed a block at a time. The output of the BWT is usually piped through a move-to-front stage, then a run-length encoder stage, and finally an entropy encoder, normally arithmetic or Huffman coding. A command line to perform this sequence would take the following form (bwt, mtf, rle, and ari here are hypothetical stand-ins for programs implementing the four stages):
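    bwt < input-file | mtf | rle | ari > compressed-file

Decompression applies the inverse of each stage in the reverse order.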