26-08-2014, 10:30 AM
Ant Colony Optimization and a New Particle Swarm
Optimization algorithm for Classification of
Microcalcifications in Mammograms
Ant Colony.pdf (Size: 253.66 KB / Downloads: 61)
Abstract—
Genetic Algorithm (GA), Ant Colony
Optimization (ACO) algorithm and Particle Swarm
Optimization (PSO) are proposed for feature selection, and
their performance is compared. The Spatial Gray Level
Dependence Method (SGLDM) is used for feature extraction.
The selected features are fed to a three-layer Backpropagation
Network hybrid with Ant Colony Optimization and Particle
Swarm Optimization (BPN-ACO-PSO) for classification, and
Receiver Operating Characteristic (ROC) analysis is
performed to evaluate the performance of the feature selection
methods through their classification results. The proposed
algorithms are tested on 114 abnormal images from the
Mammography Image Analysis Society (MIAS) database.
I. INTRODUCTION
In many western countries breast cancer is the most
common form of cancer among women. The World Health
Organization's International Agency for Research on Cancer
estimates that more than 250,000 women worldwide die of
breast cancer each year. Breast cancer is among the
top three cancers in American women. In the United States, the
American Cancer Society estimated that 315,990 new cases of
breast carcinoma would be diagnosed in 2007. It is the leading
cause of death due to cancer in women under the age of 65.
Thangavel et al. [16] presented a thorough review of various
methods for the detection of microcalcifications. It is of crucial
importance to design the classification method in such a way
to obtain a high level of True-Positive Fraction (TPF) while
maintaining the False-Positive Fraction (FPF) at its minimum
level. The texture features are extracted using Spatial Gray
Level Dependence Method (SGLDM) from the segmented
mammogram image [3, 8]. In order to reduce the complexity
and to increase the performance of the classifier, the redundant
and irrelevant features are removed from the original feature
set. In this paper, GA, ACO and PSO algorithms are proposed
to select the optimal features from the original feature set.
Only the optimal features are input to the classifier for
classification of microcalcifications. The following section
presents an overview of the work.
II. SEGMENTATION OF MICROCALCIFICATIONS
Before extracting the texture features, microcalcifications
should be segmented from the background of the
mammographic images [6, 7, 17]. In this paper,
microcalcifications are segmented using Particle Swarm
Optimization (PSO) algorithm hybrid with Markov Random
Field (MRF). The segmentation process with this method
consists of three steps. The first step enhances the
mammogram image using median filtering. In the second step,
cliques having similar arrangements of pixels are assigned a
unique label, and the Maximum a Posteriori (MAP) function
value is estimated for each clique using MRF.
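The first, enhancement step can be sketched with a plain numpy median filter; the 3x3 window and the toy patch below are illustrative assumptions, and the PSO-MRF labeling step itself is omitted:

```python
import numpy as np

def median_filter3(img):
    # pad edges, then take the median of each 3x3 neighbourhood
    padded = np.pad(img, 1, mode='edge')
    out = np.empty(img.shape, dtype=float)
    rows, cols = img.shape
    for i in range(rows):
        for j in range(cols):
            out[i, j] = np.median(padded[i:i + 3, j:j + 3])
    return out

# toy 3x3 patch with two impulse-noise pixels (200 and 250)
patch = np.array([[10, 200, 12],
                  [11,  13, 250],
                  [ 9,  12, 11]], dtype=float)
smoothed = median_filter3(patch)
```

The median, unlike a mean filter, suppresses isolated bright impulses without blurring edges, which is why it is a common enhancement choice before segmentation.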
III. FEATURE EXTRACTION
The texture of images refers to the appearance, structure and
arrangement of the parts of an object within the image. Images
used for diagnostic purposes in clinical practice are digital. A
two-dimensional digital image is made up of small rectangular
blocks or pixels (picture elements), each represented by a set
of coordinates in space and each having a value representing
the gray-level intensity of that picture element. A feature
value is a real number, which encodes some discriminatory
information about a property of an object. In this paper, the
Spatial Gray Level Dependence Method is used to extract the
features from the segmented mammogram image.
A. Spatial Gray Level Dependence Method (SGLDM)
In this method, a co-occurrence matrix is generated to extract
the texture features from the segmented mammogram image.
Many co-occurrence matrices may be computed for a single
image, one for each pair of distance and direction defined.
Normally a set of 20 co-occurrence matrices is computed, for
five different distances in the horizontal, vertical, and two
diagonal directions: the distances are 1, 3, 5, 7 and 9, and the
four angles 0°, 45°, 90° and 135° are defined for calculating
the matrix at each of the five distances.
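As a rough sketch of how one such co-occurrence matrix is built, the following counts gray-level pairs for a single displacement; the tiny 3-level image and the closing ASM (Angular Second Moment) computation are illustrative only:

```python
import numpy as np

def cooccurrence(img, dx, dy, levels):
    """Count pairs (g1, g2) of gray levels separated by (dx, dy)."""
    M = np.zeros((levels, levels), dtype=int)
    rows, cols = img.shape
    for i in range(rows):
        for j in range(cols):
            i2, j2 = i + dy, j + dx
            if 0 <= i2 < rows and 0 <= j2 < cols:
                M[img[i, j], img[i2, j2]] += 1
    return M

# tiny 3-level image; distance 1, angle 0 degrees (horizontal neighbour)
img = np.array([[0, 0, 1],
                [0, 1, 1],
                [2, 2, 2]])
M = cooccurrence(img, dx=1, dy=0, levels=3)
P = M / M.sum()              # normalized co-occurrence probabilities
asm = float((P ** 2).sum())  # one Haralick feature: Angular Second Moment
```

Repeating this for every (distance, angle) pair yields the 20 matrices described above, from which the Haralick features are computed.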
IV. FEATURE SELECTION
Feature selection is meant here to refer to the problem of
dimensionality reduction of data that initially contain a
high number of features. One hopes to choose optimal subsets
of the original features that still contain the information
essential for the classification task, while reducing the
computational burden imposed by using many features [5, 8,
10]. In this paper, the Genetic Algorithm (GA), Ant Colony
Optimization (ACO) algorithms and Particle Swarm
Optimization (PSO) are proposed for feature selection.
A. Feature Selection Using Genetic Algorithm
A GA is a heuristic search or optimization technique
for obtaining the best possible solution in a vast solution
space [2, 11]. In this paper, a total of 20 co-occurrence
matrices are created for each image, one for each pair of
distance and direction defined. The Haralick features are
extracted for all 114 images, and the features are grouped into
four categories as discussed in the earlier section. A single
feature value across all the images is taken as the initial
population for the genetic algorithm.
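The GA step can be illustrated with a minimal bit-mask GA; the per-feature discriminability scores and the size penalty in the fitness function are invented placeholders for the classifier feedback used in the paper:

```python
import random

# Hypothetical per-feature scores standing in for classifier feedback.
SCORES = {'ASM': 0.9, 'IDM': 0.8, 'ENT': 0.85, 'IMC1': 0.6, 'CON': 0.3, 'VAR': 0.2}
FEATURES = list(SCORES)

def fitness(mask):
    """Mean score of the chosen features, lightly penalizing subset size."""
    chosen = [SCORES[f] for f, bit in zip(FEATURES, mask) if bit]
    if not chosen:
        return 0.0
    return sum(chosen) / len(chosen) - 0.01 * len(chosen)

def select_features(pop_size=20, generations=40, p_mut=0.1):
    random.seed(0)
    pop = [[random.randint(0, 1) for _ in FEATURES] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]          # elitist selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, len(FEATURES))   # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (random.random() < p_mut) for bit in child]  # mutation
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return [f for f, bit in zip(FEATURES, best) if bit]

selected = select_features()
```

Because the parents are carried over unchanged each generation, the best subset found never degrades as the search proceeds.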
B. Feature Selection Using Ant Colony Optimization
Algorithm
The optimum feature is selected from each group, and only
those selected features are used further in the classification.
As a result, ASM, IDM, ENT and IMC2 are the features
selected by the ACO algorithm.
C. Feature Selection Using Particle Swarm Optimization
(PSO)
The optimum feature is selected from each group, and only
those selected features are used further in the classification.
As a result, ASM, IDM, ENT and IMC1 are the features
selected by the PSO algorithm.
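A binary PSO for the same task can be sketched with the common sigmoid transfer function that maps real-valued velocities to bit probabilities; the feature scores standing in for classifier accuracy are again invented for illustration:

```python
import math
import random

# Hypothetical per-feature scores in place of real classifier feedback.
SCORES = {'ASM': 0.9, 'IDM': 0.8, 'ENT': 0.85, 'IMC1': 0.7, 'IMC2': 0.65}
NAMES = list(SCORES)

def fitness(mask):
    chosen = [SCORES[n] for n, bit in zip(NAMES, mask) if bit]
    return sum(chosen) / len(chosen) if chosen else 0.0

def binary_pso(n_particles=12, iterations=60, w=0.7, c1=1.5, c2=1.5):
    random.seed(1)
    dim = len(NAMES)
    x = [[random.randint(0, 1) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [row[:] for row in x]          # each particle's best position
    gbest = max(x, key=fitness)[:]         # swarm's best position
    for _ in range(iterations):
        for i in range(n_particles):
            for j in range(dim):
                v[i][j] = (w * v[i][j]
                           + c1 * random.random() * (pbest[i][j] - x[i][j])
                           + c2 * random.random() * (gbest[j] - x[i][j]))
                # sigmoid transfer: velocity -> probability that the bit is 1
                x[i][j] = 1 if random.random() < 1 / (1 + math.exp(-v[i][j])) else 0
            if fitness(x[i]) > fitness(pbest[i]):
                pbest[i] = x[i][:]
        gbest = max(pbest, key=fitness)[:]
    return [n for n, bit in zip(NAMES, gbest) if bit]

chosen = binary_pso()
```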
V. CLASSIFICATION
Classification of objects is an important area of
research and of practical applications in a variety of fields,
including pattern recognition, artificial intelligence and vision
analysis. Classifier design can be performed with labeled or
unlabeled data. The Back Propagation learning algorithm is
widely used for multi-layer feed-forward networks. The
classifier employed in this paper is a three layer Back
Propagation Neural network. The Back Propagation Neural
network optimizes the net for correct responses to the training
input data set. More than one hidden layer may be beneficial
for some applications, but one hidden layer is sufficient if
enough hidden neurons are used.
A. Back Propagation Network Classifier Hybrid with Ant
Colony Optimization Algorithm
Back propagation is a learning algorithm for multi-layered
feed forward networks that uses the sigmoid function [4, 8].
In the back propagation algorithm, an error function is
calculated after the presentation of each input, and the error is
propagated back through the network, modifying the weights
before the presentation of the next pattern. This error function
is usually the Mean Square Error (MSE) of the difference
between the desired and the actual responses of the network
over all the output units. The new weights then remain fixed
while a new image is presented to the network, and this process
continues until all the images have been presented to the
network. The presentation of all the patterns is usually called
one epoch or a single iteration. In practice, many epochs are
needed before the error becomes acceptably small. The
number of hidden neurons is equal to the number of input
neurons, and there is only one output neuron. The initial
weights are extracted using the ACO algorithm.
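The training loop described above can be sketched as a small numpy three-layer network with sigmoid units and MSE backpropagation; the random initialization and the OR-gate toy data are stand-ins for the ACO-extracted weights and the real mammogram features:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_bpn(X, y, epochs=5000, lr=0.5, seed=0):
    """Three-layer BPN: hidden layer the same size as the input layer,
    one output neuron, batch MSE backpropagation."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    W1 = rng.uniform(-1.0, 1.0, (n_in, n_in))   # input -> hidden weights
    b1 = np.zeros(n_in)
    W2 = rng.uniform(-1.0, 1.0, (n_in, 1))      # hidden -> output weights
    b2 = np.zeros(1)
    for _ in range(epochs):
        h = sigmoid(X @ W1 + b1)            # hidden activations
        out = sigmoid(h @ W2 + b2)          # network response
        err = y - out                       # desired minus actual
        d_out = err * out * (1.0 - out)     # output-layer delta
        d_hid = (d_out @ W2.T) * h * (1.0 - h)
        W2 += lr * h.T @ d_out              # backpropagated weight updates
        b2 += lr * d_out.sum(axis=0)
        W1 += lr * X.T @ d_hid
        b1 += lr * d_hid.sum(axis=0)
    return W1, b1, W2, b2

# OR-gate toy data standing in for the 114-image feature vectors
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [1]], dtype=float)
W1, b1, W2, b2 = train_bpn(X, y)
pred = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
mse = float(np.mean((y - pred) ** 2))
```

Each pass over all four patterns is one epoch; the MSE shrinks across epochs exactly as the text describes.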
VI. RECEIVER OPERATING CHARACTERISTIC (ROC)
ANALYSIS
The receiver operating characteristic (ROC) curve is a
popular tool in medical and imaging research. It conveniently
displays diagnostic accuracy expressed in terms of sensitivity
(or true-positive rate) against (1-specificity) (or false-positive
rate) at all possible threshold values. Performance of each test
is characterized in terms of its ability to identify true positives
while rejecting false positives [14, 15].
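A minimal sketch of how the ROC points and the area under the curve can be computed from classifier scores follows; the scores and labels are made up for illustration:

```python
def roc_points(scores, labels):
    """(FPR, TPR) pairs obtained by sweeping the decision threshold."""
    P = sum(labels)            # number of positives
    N = len(labels) - P        # number of negatives
    pts = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 0)
        pts.append((fp / N, tp / P))
    return pts

# hypothetical classifier outputs and ground-truth labels
scores = [0.9, 0.8, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0]
pts = roc_points(scores, labels)
# area under the curve by the trapezoidal rule
auc = sum((x2 - x1) * (y1 + y2) / 2
          for (x1, y1), (x2, y2) in zip(pts, pts[1:]))
```

The curve always starts at (0, 0) and ends at (1, 1); a classifier that ranks every positive above every negative attains an area of 1.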
VII. RESULTS AND DISCUSSION
The images used in this work were taken from the
Mammography Image Analysis Society (MIAS) (2003). The
database consists of 322 images belonging to three categories:
normal, benign and malignant. There are 208 normal images,
63 benign and 51 malignant. In this paper, only the benign and
malignant images are considered for feature extraction. The
images also specify the locations of any abnormalities that
may be present. The classification results of the back
propagation neural network are tested using the jackknife
method, the round-robin method, and ten-fold cross-validation.
The results were analyzed using ROC curves.
Table 3: Classification based on the training and feature
selection algorithms
VIII. CONCLUSION
In this paper, SGLDM is used to extract the Haralick features
from the segmented mammogram image, and the features are
grouped into four categories based on visual texture
characteristics, statistics, information theory and information
measures of correlation. The Genetic Algorithm, Ant Colony
Optimization algorithm and Particle Swarm Optimization are
proposed for feature selection. Each algorithm selects the
optimum feature from each group, and the selected features are
used for classification. A three-layer Back Propagation
Neural Network hybrid with Ant Colony Optimization and
Particle Swarm Optimization is used for classification. The
ACO and PSO algorithms are used for weight extraction
during learning. ROC analysis is performed to compare the
classification results of the feature selection algorithms. The
results show that the PSO algorithm selects better features
than GA and ACO.