24-08-2013, 03:55 PM
Image segmentation based on the normalized cut framework
Motivation
Image segmentation is an important image processing task, and it appears almost everywhere we want to analyze what is inside an image. For example, to determine whether an indoor image contains a chair or a person, we may first use image segmentation to separate the objects and then analyze each object individually. Image segmentation usually serves as a pre-processing step before image pattern recognition, image feature extraction, and image compression. Research on it started around 1970, yet there is still no robust solution, so we want to find the reasons and see what we can do to improve it.
Our final project title differs slightly from the proposal. The proposed title was “Photo Labeling Based on Texture Feature and Image Segmentation”, but during execution we changed it to “Image segmentation based on the normalized cut framework”. The main reason is that we found many existing image segmentation techniques and methods; in order to gain enough background, we went through several surveys and decided to change the title to reflect a deeper look at image segmentation itself.
Introduction
Image segmentation is used to separate an image into several “meaningful” parts. It is an old research topic that started around 1970, but there is still no robust solution. There are two main reasons: first, the content variety of images is too large; second, there is no benchmark standard for judging performance. For example, figure 1.1 shows an original image and two segmented images produced by different image segmentation methods. The result in figure 1.1 (b) separates the sky into several parts, while figure 1.1 (c) misses some details of the original image. Every technique has its own advantages and disadvantages, so it is hard to tell which one is better.
Normalized cut framework
The normalized cut framework was proposed by J. Malik and J. Shi [8]. In their view, the image segmentation problem can be formulated as a graph theory problem. Graph theory is a branch of mathematics that models problems with nodes and edges (arcs). Although it is hard to cover graph theory fully in this project report, we give two practical examples to show what it can do. In figure 2.1, a graph model of the Taiwan map is presented, where we model each county as a node, and an edge between two nodes means the two counties are adjacent on the original map. This model can be used for coloring problems (assign each county a color such that adjacent counties have different colors) or, more advanced, for transportation-flow problems. Each edge in the model can carry a value (weight), which may represent its flow capacity or importance. This kind of graph is called a “weighted graph” and is frequently adopted by network researchers.
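To illustrate how an image becomes a weighted graph, here is a minimal Python sketch (not code from this report; the Gaussian affinity on intensity and spatial distance follows the usual normalized-cut formulation, and the parameter values sigma_i, sigma_x, and r are assumptions for illustration):

```python
import numpy as np

def affinity_matrix(img, sigma_i=0.1, sigma_x=4.0, r=5):
    """Build the weighted-graph matrix W for a grayscale image.

    Each pixel is a node; the weight w_ij combines intensity similarity
    and spatial proximity, and pixels farther apart than r get weight 0.
    """
    h, w = img.shape
    n = h * w
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    vals = img.ravel().astype(float)
    W = np.zeros((n, n))
    for i in range(n):
        d2 = ((coords - coords[i]) ** 2).sum(axis=1)   # squared spatial distance
        near = d2 <= r * r                             # spatial cutoff
        f2 = (vals - vals[i]) ** 2                     # squared intensity difference
        W[i, near] = np.exp(-f2[near] / sigma_i**2) * np.exp(-d2[near] / sigma_x**2)
    return W
```

The matrix D used later is simply the diagonal of row sums of W.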
Summary of the algorithms
Figure 2.3 is the flowchart of the normalized cut framework. First we model the image as a graph and obtain the matrices W and D. Then we solve equations (2.6) and (2.5) to get the rebuilt y and separate the image into two segments. The normalized cut can only split a segment into two parts in each iteration, so we recursively apply the separation process to each segment. The diamond-shaped block in the flowchart serves as the stopping mechanism for the recursion. For each segment, we check its area (the number of pixels inside) and the minimum normalized cut value of a further split; if the area is smaller than a defined threshold, or the minimum normalized cut value is larger than another defined threshold, further separation of this segment stops. The W and D for each segment can be extracted directly from the original W, so we do not have to rebuild them at each iteration. With this flowchart, we can solve the minimum normalized cut problem with Matlab programs and implement the image segmentation operation.
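The bipartition step above can be sketched in a few lines. This is a simplified Python illustration rather than the report's Matlab code: it solves the generalized eigenproblem (D - W) y = λ D y and thresholds the second-smallest eigenvector at zero, which is one common simplification of the threshold search:

```python
import numpy as np
from scipy.linalg import eigh

def ncut_bipartition(W):
    """One bipartition step of the normalized cut (simplified sketch).

    Solves (D - W) y = lambda * D y and splits pixels by the sign of the
    second-smallest eigenvector. Returns boolean labels and that eigenvalue,
    which upper-bounds the achievable normalized cut value.
    """
    d = W.sum(axis=1)
    D = np.diag(d)
    vals, vecs = eigh(D - W, D)   # generalized symmetric eigenproblem
    y = vecs[:, 1]                # second-smallest eigenvector
    return y >= 0, vals[1]
```

Recursive segmentation then reapplies this function to the submatrix of W for each segment, stopping when the segment area or the eigenvalue crosses the chosen thresholds.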
Texton and feature representation
The K centroids obtained from the K-means clustering operation in the previous subsection are called the K textons of the image. We compute the texton-index histogram in a sliding window (containing several pixels) around each pixel as the final feature representation of that pixel. So finally each pixel in the image has a histogram, with each bin giving the occurrence probability of a texton inside the sliding window; the number of bins equals the number of groups produced by the K-means clustering operation. Figure 3.19 shows this representation method.
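A minimal Python sketch of this texton-histogram representation (the filter-response input, the value of K, and the window size are assumptions for illustration, not values from the report):

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def texton_histograms(responses, k=8, win=5, seed=0):
    """Per-pixel texton-index histograms over a sliding window.

    responses: (H, W, F) array of filter-bank responses per pixel.
    Returns an (H, W, k) array where each pixel's vector is the
    occurrence probability of each of the k textons in its window.
    """
    h, w, f = responses.shape
    flat = responses.reshape(-1, f).astype(float)
    _, labels = kmeans2(flat, k, seed=seed, minit='++')  # cluster to K textons
    labels = labels.reshape(h, w)
    r = win // 2
    hist = np.zeros((h, w, k))
    for y in range(h):
        for x in range(w):
            patch = labels[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1]
            counts = np.bincount(patch.ravel(), minlength=k)
            hist[y, x] = counts / counts.sum()           # normalize to probabilities
    return hist
```

Two pixels in the same texture then tend to have similar histograms even when their raw intensities differ, which is exactly the property the discussion below relies on.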
Discussion and conclusion
From our experiments, we see that our methods achieve the image segmentation purpose. For simple images with little texture, the result is quite good, and the method also performs well on natural and landscape images. The above tests are based on luminance- or RGB-based similarity measurements. When we tested our algorithm on texture images with the same similarity measurement, the results became worse. The main reason is that the pixel value variance is very large inside a texture region, which means pixels in the same texture region may have low similarity and be segmented into different groups.