19-04-2012, 01:03 PM
Measures for Classification and Detection in Steganalysis
Introduction
The word steganography comes from Greek and means covered or hidden writing. It is the art and
science of secret communication, aiming to conceal the existence of the communication.
This differs from cryptography, where the existence of the communication is not
disguised but the message is obscured by scrambling it. The use of cryptography does not
stop a third party from knowing that some secret communication is going on. In steganography,
the message to be sent is concealed in such a way that an intruder would not know whether
any secret communication is going on or not. Hiding information inside digital carriers is
becoming popular [1, 13]. Rapid growth in the demand for and consumption of multimedia
has led to data hiding techniques for audio files (.wav) and image files (.bmp, .pnm,
.jpg). Digital images are the most common carriers for hidden messages. The process of
hiding information is called embedding. Least Significant Bit (LSB) embedding is the most
widely used steganographic technique. In LSB embedding, the LSBs of the pixels of an
uncompressed image are replaced with the message bits; Section 2.3.2 describes this in
detail. The amount of embedding (the number of bits embedded), referred to as the level,
is given as a percentage of the total number of pixels.
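As a minimal sketch of LSB replacement and of the level defined above, assume the cover image is a flat list of 8-bit intensity values and the message is a list of bits. The function names and the sample values are illustrative and do not come from the thesis.

```python
def embed_lsb(pixels, bits):
    """Replace the LSB of the first len(bits) pixels with the message bits."""
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the cover")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego


def embedding_level(num_bits, num_pixels):
    """Embedding level: embedded bits as a percentage of the pixel count."""
    return 100.0 * num_bits / num_pixels


cover = [200, 13, 77, 254, 90, 31, 128, 65]   # hypothetical pixel values
message = [1, 0, 1, 1]                        # hypothetical message bits
stego = embed_lsb(cover, message)
print(stego)                                       # [201, 12, 77, 255, 90, 31, 128, 65]
print(embedding_level(len(message), len(cover)))   # 50.0
```

Each modified pixel changes by at most one intensity level, which is why the change is hard to see.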
Image Steganography
This section shows how information can be camouflaged in images. We first describe
digital images.
Digital Images
To store an image on a computer, it is divided into small elements called pixels. For each
pixel, the intensity of each of three basic colors is stored. ‘.bmp’ and ‘.pnm’ are
examples of uncompressed file formats that store images this way. Such images contain
a lot of redundancy. Moreover, the human eye does not notice the loss of a small amount
of information in pixel intensity. Compression techniques such as JPEG exploit both
facts: they decorrelate the pixel data to remove redundancy and may also discard some
information.
LSB Steganography for Uncompressed images
One simple yet effective method of steganography is LSB replacement. As mentioned,
a small perturbation in pixel intensity is not detected by the eye; LSB techniques
take advantage of this by changing the LSBs of a few pixels. The embedding algorithm
decides which pixels of an image are modified. Some algorithms pick pixels at regular
intervals that depend on the image size and the message size. Sophisticated
steganographic software such as S-Tools [13] and CSA-Tool [18] can add further layers
of complexity, such as distributing the message bits in a pseudo-random way and
encrypting the message.
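The two pixel-selection strategies mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the method of S-Tools or CSA-Tool: the image is a flat list of pixel values, the key is any value shared by sender and receiver, and Python's `random` module stands in for a proper keyed generator (it is not cryptographically secure).

```python
import random


def regular_positions(num_pixels, num_bits):
    """Evenly spaced pixel indices, one per message bit."""
    if num_bits == 0:
        return []
    if num_bits > num_pixels:
        raise ValueError("message is longer than the cover")
    step = num_pixels // num_bits
    return [i * step for i in range(num_bits)]


def keyed_positions(num_pixels, num_bits, key):
    """Distinct pseudo-random pixel indices derived from a shared key."""
    rng = random.Random(key)
    return rng.sample(range(num_pixels), num_bits)


def embed(pixels, bits, positions):
    """Write each message bit into the LSB of its selected pixel."""
    stego = list(pixels)
    for pos, bit in zip(positions, bits):
        stego[pos] = (stego[pos] & ~1) | bit
    return stego


def extract(pixels, positions):
    """Read the message bits back from the same positions."""
    return [pixels[pos] & 1 for pos in positions]
```

The receiver recomputes the same positions from the image size, the message length, and (for the keyed variant) the key, then calls `extract`.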
Quantization Index Modulation (QIM) Steganography
A message can be embedded in the host medium through the choice of a scalar quantizer.
For example, consider a uniform quantizer of step size Δ, used on the host’s coefficients in
some transform domain. Let odd multiples of Δ be the reconstruction points that represent
a signature data bit ‘1’. Likewise, even multiples of Δ are used to embed ‘0’. Thus,
depending on the bit value to be embedded, one of two uniform quantizers of step size 2Δ
is chosen. Moreover, the quantizers can be pseudo-randomly dithered: the chosen quantizers
are shifted by a pseudo-random sequence available only to the encoder and decoder.
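A minimal sketch of the undithered scheme described above, assuming real-valued host coefficients and a step size called `delta` here. The function names and sample values are illustrative. Each coefficient is moved to the nearest multiple of the step size whose parity matches the bit, and decoding rounds to the nearest multiple and reads its parity.

```python
import math


def qim_embed(coeff, bit, delta):
    """Quantize coeff to the nearest even (bit 0) or odd (bit 1) multiple of delta."""
    # The quantizer for this bit has step 2*delta and is offset by bit*delta.
    k = math.floor((coeff - bit * delta) / (2 * delta) + 0.5)
    return 2 * delta * k + bit * delta


def qim_decode(coeff, delta):
    """Recover the bit from the parity of the nearest multiple of delta."""
    return math.floor(coeff / delta + 0.5) % 2


delta = 4.0                            # hypothetical step size
coeffs = [10.3, -7.9, 0.4, 21.0]       # hypothetical host coefficients
bits = [1, 0, 1, 0]
marked = [qim_embed(c, b, delta) for c, b in zip(coeffs, bits)]
print(marked)                                   # [12.0, -8.0, 4.0, 24.0]
print([qim_decode(m, delta) for m in marked])   # [1, 0, 1, 0]
```

In this sketch decoding still succeeds when a marked coefficient is perturbed by less than half the step size. Dithering is omitted; it would subtract a shared pseudo-random offset before quantizing and add it back afterwards.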
Classification based on statistical measures and SVM
Image steganography transforms a cover image by combining it with the embedded data.
The embedding operation perturbs the statistical properties of the image, and we try to
capture this perturbation. We use the statistics defined below as features of non-random
strings. As a first step, we establish the power of our feature vector of measures based
on statistical properties of bit strings in discriminating a variety of standard file
types (Section 4.2). Then we explore the possibility of discriminating images with
different levels of embedding (Section 4.3). Once the level of embedding is determined
to reasonable accuracy, we can proceed to the next step, locating the embedded bits by
other statistical and combinatorial techniques. For classification we use Support Vector
Machines (SVMs).
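To illustrate what a feature vector of bit-string statistics might look like, the sketch below computes three simple measures from a byte string. These three measures are stand-ins chosen for illustration and are not the measures defined in the thesis. The resulting vectors would then be passed to an SVM, a step left out here because it needs an external library.

```python
import math
from collections import Counter


def bits_of(data):
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def ones_fraction(bits):
    """Fraction of bits that are 1."""
    return sum(bits) / len(bits)


def runs_per_bit(bits):
    """Number of maximal runs of equal bits, divided by the string length."""
    runs = 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    return runs / len(bits)


def byte_entropy(data):
    """Shannon entropy of the byte histogram, in bits per byte (0 to 8)."""
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def feature_vector(data):
    """Illustrative feature vector for a non-empty byte string."""
    if not data:
        raise ValueError("data must be non-empty")
    bits = bits_of(data)
    return [ones_fraction(bits), runs_per_bit(bits), byte_entropy(data)]
```

For example, `feature_vector(bytes(range(256)))` gives an entropy component of 8 bits per byte, while a constant byte string gives 0.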
Conclusions and Future Work
We discussed two new approaches to the analysis of stego images for detecting the level
of embedding. Our approach of using wavelet coefficient perturbations holds promise.
We will also consider a modified wavelet coefficient based measure that takes into account
the numerical changes in the pixel values introduced by embedding. We plan to use
this measure in addition to the statistical measures to arrive at finer detection. For
finer detection of the level of embedding, we suggest a two-phase approach, similar to
the Expectation Maximization (EM) algorithm used by the machine learning community:
1. First, find a rough estimate of ‘p’ (defined in Eq. 5.1) using statistical measures.
2. Then, using Eq. 5.1 and the estimate p′, estimate k. Use this k to refine p, and
iterate until the estimate no longer changes much.