01-01-2013, 01:53 PM
Information Theory and Coding
Information, entropy
Summary:
Redundancy and relative redundancy of the source
High redundancy of a source is a hint that compression methods will
be beneficial.
E.g., Fax transmission:
~90% white pixels
low entropy (as compared to the “best” source)
high redundancy of the source
redundancy is lowered by run-length encoding
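The fax example above can be sketched as a minimal run-length encoder; the pixel values and line layout here are hypothetical, chosen only to mimic a mostly-white scan line:

```python
def run_length_encode(pixels):
    """Encode a sequence of pixel values as (value, run_length) pairs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([p, 1])       # start a new run
    return [(v, n) for v, n in runs]

# A fax-like scan line: mostly white (0) with a short black (1) stroke
line = [0] * 45 + [1] * 3 + [0] * 52
encoded = run_length_encode(line)
# 100 pixels collapse into 3 (value, length) pairs
```

Because long runs of identical symbols dominate, the encoded form needs far fewer entries than the raw pixel sequence, which is exactly how the source's redundancy is exploited.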
Information, entropy
Extended (memoryless) sources:
Combine n primary symbols from the source alphabet
into one block (a secondary symbol from the extended alphabet)
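For a memoryless source, the entropy of the n-th extension is H(Xⁿ) = n·H(X). A small Python sketch illustrates this for n = 2; the two-symbol alphabet and its probabilities are hypothetical:

```python
from itertools import product
from math import log2

def entropy(probs):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Hypothetical primary (memoryless) source
p = {"a": 0.9, "b": 0.1}

# Second extension: blocks of 2 primary symbols; memoryless means
# the block probability factors as P(xy) = P(x) * P(y)
p2 = {x + y: p[x] * p[y] for x, y in product(p, repeat=2)}

H1 = entropy(p.values())
H2 = entropy(p2.values())   # equals 2 * H1 for a memoryless source
```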
Source Coding
Source coding theorem (Shannon)
Efficient representation (Coding) of data from a discrete source
Depends on the statistics of the source
short code words for frequent symbols
long code words for rare symbols
Code words must be uniquely decodable
An important group: prefix codes
Each symbol is assigned a code word whose length approximately
matches its information content
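A code is a prefix code when no code word is a prefix of another, which makes it instantaneously decodable; the Kraft inequality (sum of 2^(-length) ≤ 1 for binary codes) is a quick necessary check. A minimal sketch with example code words chosen for illustration:

```python
def is_prefix_code(codewords):
    """True if no code word is a prefix of another (instantaneously decodable)."""
    for c in codewords:
        for d in codewords:
            if c != d and d.startswith(c):
                return False
    return True

def kraft_sum(codewords):
    """Kraft inequality: this sum must be <= 1 for any binary prefix code."""
    return sum(2 ** -len(c) for c in codewords)

good = ["0", "10", "110", "111"]   # a valid binary prefix code
bad = ["0", "01"]                  # "0" is a prefix of "01"
```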
Huffman coding algorithm:
1. Sort the symbols in order of decreasing probability. Assign “0” and “1”
to the two symbols with the lowest probabilities
2. Combine these two symbols into a new symbol whose probability is the
sum of the two. Re-sort the symbols in decreasing order of probability.
3. Repeat until the code tree is complete
4. Read out each code word backwards along the tree, from the root to
the symbol’s leaf
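The steps above can be sketched in Python using a priority queue to find the two least probable symbols at each round; the example probabilities are hypothetical:

```python
import heapq

def huffman_code(probabilities):
    """Build a binary Huffman code from a {symbol: probability} dict."""
    # Heap entries: (probability, tie-breaker, {symbol: partial code word});
    # the tie-breaker keeps equal probabilities from comparing the dicts.
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # lowest probability (step 1)
        p2, _, c2 = heapq.heappop(heap)   # second lowest
        # Assign "0"/"1" and merge into a combined symbol (step 2)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1                       # repeat until one node is left (step 3)
    return heap[0][2]

probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(probs)
# Frequent symbols get short code words, rare symbols long ones
```

With these dyadic probabilities the average code-word length (0.5·1 + 0.25·2 + 2·0.125·3 = 1.75 bits) exactly equals the source entropy, which is the best case the source coding theorem allows.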