13-08-2012, 03:42 PM
Hide and Seek: An Introduction to Steganography
Steganography is the art and science of hiding
communication; a steganographic system
thus embeds hidden content in unremarkable
cover media so as not to arouse an eavesdropper’s
suspicion. In the past, people used hidden
tattoos or invisible ink to convey steganographic
content. Today, computer and network technologies
provide easy-to-use communication channels for
steganography.
Essentially, the information-hiding process in a
steganographic system starts by identifying a cover
medium’s redundant bits (those that can be modified
without destroying that medium’s integrity) [1]. The
embedding process creates a stego medium by replacing
these redundant bits with data from the hidden
message.
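The replacement step described above can be sketched in a few lines of Python. This is a minimal illustration, not the article's code: it assumes the cover is a plain byte string whose least-significant bits are all redundant, and the function names embed_lsb and extract_lsb are hypothetical.

```python
def embed_lsb(cover: bytes, message: bytes) -> bytes:
    """Replace the least-significant bit of successive cover bytes with message bits."""
    # Unpack the message into bits, most-significant bit of each byte first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("message too long for this cover")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        # Clear the redundant bit, then write one message bit into it.
        stego[i] = (stego[i] & 0xFE) | bit
    return bytes(stego)


def extract_lsb(stego: bytes, n_bytes: int) -> bytes:
    """Rebuild n_bytes of message from the least-significant bits of the stego medium."""
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for b in stego[8 * i: 8 * i + 8]:
            value = (value << 1) | (b & 1)
        out.append(value)
    return bytes(out)
```

Each cover byte changes by at most one, which is why the modification does not destroy the medium. The receiver in this sketch must already know the message length.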
The basics of embedding
Three different aspects in information-hiding systems
contend with each other: capacity, security, and robustness [4].
Capacity refers to the amount of information that
can be hidden in the cover medium, security to an eavesdropper’s
inability to detect hidden information, and robustness
to the amount of modification the stego
medium can withstand before an adversary can destroy
hidden information.
Information hiding generally relates to both watermarking
and steganography. A watermarking system’s
primary goal is to achieve a high level of robustness—that
is, it should be impossible to remove a watermark without
degrading the data object’s quality. Steganography, on
the other hand, strives for high security and capacity,
which often entails that the hidden information is fragile.
Even trivial modifications to the stego medium can destroy
it.
Hide and seek
Although steganography is applicable to all data objects
that contain redundancy, in this article, we consider
JPEG images only (although the techniques and methods
for steganography and steganalysis that we present
here apply to other data formats as well). People often
transmit digital pictures over email and other Internet
communication, and JPEG is one of the most common image formats.
Sequential
Derek Upham’s JSteg was the first publicly available
steganographic system for JPEG images. Its embedding
algorithm sequentially replaces the least-significant bit of
DCT coefficients with the message’s data (see Figure
3) [13]. The algorithm does not require a shared secret; as a
result, anyone who knows the steganographic system can
retrieve the message hidden by JSteg.
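A rough sketch of this sequential embedding follows. It assumes the quantized DCT coefficients are already available as a flat list of integers, and it skips coefficients equal to 0 or 1, as JSteg is generally described as doing. The names jsteg_embed and jsteg_extract are hypothetical, and JPEG decoding and encoding are left out.

```python
def jsteg_embed(coeffs, bits):
    """Overwrite the LSB of each usable coefficient, in order, with one message bit."""
    out = list(coeffs)
    pending = list(bits)
    pos = 0
    for i, c in enumerate(out):
        if pos == len(pending):
            break
        if c in (0, 1):
            continue  # coefficients 0 and 1 are never used
        # Clear the LSB and write the next message bit (works for negatives too).
        out[i] = (c & ~1) | pending[pos]
        pos += 1
    if pos < len(pending):
        raise ValueError("not enough usable coefficients")
    return out


def jsteg_extract(coeffs, n_bits):
    """Read the LSBs of the usable coefficients back in the same order."""
    return [c & 1 for c in coeffs if c not in (0, 1)][:n_bits]
```

No key appears anywhere in the sketch, which illustrates the point above: anyone who knows the procedure can run the extraction. Embedding can never produce a 0 or 1 from a usable coefficient, so sender and receiver skip the same positions.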
Andreas Westfeld and Andreas Pfitzmann noticed that
steganographic systems that change least-significant bits
sequentially cause distortions detectable by steganalysis [8].
They observed that for a given image, the embedding of
high-entropy data (often due to encryption) changed the
histogram of color frequencies in a predictable way.
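One way to make this predictable change concrete is a chi-square statistic over pairs of values that differ only in their least-significant bit. The sketch below assumes that overwriting LSBs with high-entropy data pushes the two frequencies in each pair (2i, 2i+1) toward their common mean. The function name pov_chi_square is hypothetical, and the conversion of the statistic into a probability is omitted.

```python
from collections import Counter


def pov_chi_square(values):
    """Chi-square statistic comparing each even value's count with its pair's mean."""
    hist = Counter(values)
    # Every pair is identified by its even member (LSB cleared).
    lows = {v & ~1 for v in hist}
    stat = 0.0
    pairs = 0
    for low in lows:
        observed = hist[low]
        expected = (hist[low] + hist[low + 1]) / 2
        if expected > 0:
            stat += (observed - expected) ** 2 / expected
            pairs += 1
    # Degrees of freedom: number of pairs minus one.
    return stat, pairs - 1
```

A statistic that is small relative to the degrees of freedom is what equalized pairs would produce, so it hints at embedded data. A large one indicates the pairs are still unbalanced, as expected in an unmodified image.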
Subtraction
Steganalysis successfully detects steganographic systems
that replace the least-significant bits of DCT coefficients.
Let’s turn now to Andreas Westfeld’s steganographic system,
F5 [17].
Instead of replacing the least-significant bit of a DCT
coefficient with message data, F5 decrements its absolute
value in a process called matrix encoding. As a result, there is
no coupling of any fixed pair of DCT coefficients, meaning
the χ²-test cannot detect F5.
Matrix encoding computes an appropriate (1, 2^k - 1, k)
Hamming code by calculating the message block size k
from the message length and the number of nonzero non-DC
coefficients. The Hamming code (1, 2^k - 1, k) encodes
a k-bit message word m into an n-bit code word a with
n = 2^k - 1. It can recover from a single bit error in the code
word [18].
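The effect of a (1, 2^k - 1, k) code can be sketched on abstract bits: k message bits are carried by n = 2^k - 1 least-significant bits while at most one of them changes. The sketch below uses the usual XOR-of-indices formulation. The names matrix_embed and matrix_extract are hypothetical, and the code does not model how F5 realizes a flip by decrementing a coefficient's absolute value.

```python
def matrix_extract(lsbs):
    """XOR together the 1-based positions of all set bits; the result is the message."""
    value = 0
    for index, bit in enumerate(lsbs, start=1):
        if bit:
            value ^= index
    return value


def matrix_embed(lsbs, message, k):
    """Embed a k-bit integer into n = 2^k - 1 bits, changing at most one of them."""
    n = (1 << k) - 1
    if len(lsbs) != n or not 0 <= message < (1 << k):
        raise ValueError("need exactly 2^k - 1 bits and a k-bit message")
    # Position to flip: 0 means the bits already encode the message.
    flip = matrix_extract(lsbs) ^ message
    out = list(lsbs)
    if flip:
        out[flip - 1] ^= 1
    return out
```

With k = 3, for example, three message bits are embedded in seven coefficients at a cost of at most one change, where plain LSB replacement would change about half of the three bits it writes.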
Statistics-aware embedding
So far, we have presented embedding algorithms that
overwrite image data without directly considering the distortions
that the embedding caused. Let’s look at a framework
for an embedding algorithm that uses global image
statistics to influence how coefficients should be changed.
To embed a single bit, we can either increment or
decrement a DCT coefficient’s value. This lets us change
a DCT coefficient’s least-significant bit in two different
ways. Additionally, we create groups of DCT coefficients
and use the parity [1] of their least-significant bits as message
bits to further increase the number of ways to embed a
single bit. For every DCT block, we search the space of all
possible changes to find a configuration that minimizes
the change to image statistics. Currently, we search for solutions
that maintain the blockiness, the block variance,
and the coefficient histogram.
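The idea of choosing among several equivalent changes can be sketched for a single group of coefficients. This is an illustration under simplifying assumptions, not the framework itself: it handles one group rather than a whole DCT block, and it uses the change in the group's variance as a stand-in for the image statistics named above. The names variance_change, embed_parity_bit, and extract_parity_bit are hypothetical.

```python
from statistics import pvariance


def variance_change(original, candidate):
    """Stand-in cost: how far the group's variance moves (an assumption of this sketch)."""
    return abs(pvariance(candidate) - pvariance(original))


def extract_parity_bit(group):
    """The message bit is the parity of the group's least-significant bits."""
    return sum(c & 1 for c in group) & 1


def embed_parity_bit(group, bit, cost=variance_change):
    """Make the group's LSB parity equal to bit, picking the cheapest +1 or -1 change."""
    if extract_parity_bit(group) == bit:
        return list(group)  # parity already matches, nothing to change
    best = None
    for i in range(len(group)):
        for delta in (-1, 1):
            # Adding or subtracting 1 always flips that coefficient's LSB.
            candidate = list(group)
            candidate[i] += delta
            score = cost(group, candidate)
            if best is None or score < best[0]:
                best = (score, candidate)
    return best[1]
```

A group of g coefficients offers 2g ways to flip the parity, so the search has room to pick the change that disturbs the chosen statistic least. Swapping in a different cost function changes which statistic is preserved.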