31-10-2016, 02:56 PM
1462522581-LungTumorDetectionandSegmentationinCTImagethesis.docx
Introduction
Lung cancer is one of the most serious cancers in the world and the second most commonly diagnosed cancer in both men and women. The lungs are the human organs of respiration; their main function is to allow oxygen from the air to enter the bloodstream for delivery to the rest of the body. Lung cancer arises from an abnormality in body cells that produces a growth called a tumor. Tumors can be benign or malignant. A benign tumor can be removed and stopped from spreading to other parts of the body, whereas a malignant tumor grows rapidly and cannot be stopped from spreading. Medical imaging has become very effective for tumor detection and for the treatment of lung cancer. In medical imaging, different types of images are used for the detection and diagnosis of lung cancer; Computed Tomography (CT) images are preferred because of their better clarity, low noise and low distortion. The main advantages of CT are its high resolution and short scanning time, which improve the reliability of treatment. The main purpose of the image processing techniques here is to find lung cancer at an early stage. For the proposed work, the CT images are obtained from the NIH/NCI Lung Image Database Consortium (LIDC), which provides the opportunity to do this research.
Background
In India, lung cancer was initially thought to be extremely rare, and few attempts were made to establish its exact frequency. Lung cancer constituted 14.4% of all cancers in a review of 9,210 consecutive autopsies by Banker in 1957. Sirsat (1958) reported that lung cancer formed one per cent of all cancers at Tata Cancer Hospital. Viswanathan and Sengupta (1961) collected sporadic information from different hospitals of the country and found that the incidence of lung cancer in the hospital population was 27.4 per million in 1950, rising to 78.6 per million in 1959. They also found an increase in the incidence of bronchogenic carcinoma following an analysis of the records of 15 teaching institutions in India over a period of 10 years: from 16.1 per 1,000 malignancies in 1950, it had increased to 26.9 in 1961. According to Wig et al. (1961), lung carcinoma is a frequent finding among all chest diseases. A survey conducted in Uttar Pradesh in 1966 by Misra and others showed an incidence of 4.2 per 10,000 hospital admissions and 2.1 per cent of all malignancies. The National Cancer Registry Programme of the Indian Council of Medical Research, which collected data from six different parts of the country covering both rural and urban areas, showed varying figures at different sites [10]. While cancer of the trachea, bronchus and lungs was the most common type of malignancy in males in 1989 in the Bombay, Delhi and Bhopal registries, it was the second most common in Madras, third in Bangalore, and most unusual in Barshi, a rural area. The disease was most uncommon in females: only in Bombay was it the sixth most common malignancy, and in Bhopal it ranked seventh. Hospital data from different parts of the country also showed different patterns. Behera and Kashyap analysed the pattern of malignancies in patients admitted to PGIMER, Chandigarh from 1973 to 1982 [11]. They found that of the 223,930 hospital admissions there were 863 lung cancer cases (0.38%).
Lung cancer was the fifth most common cancer after lympho-reticular malignancy, carcinoma of the cervix, oropharyngeal cancer and carcinoma of the breast. The total number of lung cancer admissions rose steadily from 1973. As of 1st July 2002, a total of 41,000 lung cancer cases would have been diagnosed as per data from the ICMR Cancer Registry. Males predominate, with a M:F ratio of 4.5:1; this ratio varies with age and smoking status, increasing progressively up to 51-60 years and then remaining constant. The smoker to non-smoker ratio is as high as 20:1 in various studies. Up to 40 years of age, the small-cell type predominates and has a weaker association with smoking. After the age of 40 years, the squamous cell type is commonest in smokers and adenocarcinoma in non-smokers. The demographic pattern of lung cancer in India is similar to that of Western countries 40 years ago.
Deaths due to lung cancer exceed those due to colorectal, breast and prostate cancers combined. The incidence of, and deaths from, lung cancer in females are rising, while in developed countries they are declining in males. Lung cancer is the single most damaging cause of cancer-related deaths, with approximately 1.5 million cases world-wide and more than 1.3 million cancer-related deaths in 2001. The five-year survival rate for lung cancer has improved only marginally, from 5% in the late 1950s to 14% by 1994; this contrasts with a five-year survival rate of 52% for some other cancers. Lung cancer is responsible for about one million deaths per year at present, projected to rise to three million per year by the year 2010. It is among the fastest growing cancers: every year 1.8 million people fall prey to it, and 1.6 million die of it. In India, the number of new cases increased from around 65,000 in 2009 to 90,000 in 2013, a 15-20% increase annually. An estimated 224,390 new cases of lung cancer are expected in 2016 in the United States, accounting for about 14% of all cancer diagnoses, and an estimated 158,080 deaths, accounting for about 1 in 4 cancer deaths.
1.2 Problems and challenges of Lung image segmentation
The goal of this thesis is to implement lung segmentation using the Marker Controlled Watershed Algorithm. Lung segmentation is a very challenging area attracting much research activity. The problems encountered are due to the inherent complexity of the images, the considerable normal variation across patients, and the imaging modality itself. Detecting lung nodules in chest X-ray and computed tomography (CT) images is an important step in computer-aided diagnosis applications such as tuberculosis or pneumoconiosis screening, and processing these images poses several challenges. In current medical diagnosis, treatment and surgery, medical imaging plays one of the most important roles, since imaging devices such as X-ray and CT yield a great deal of information about diseases and organs. However, radiologists have to analyse and evaluate many medical images comprehensively in a short time, which is a large burden. For example, in lung segmentation, the strong edges at the rib cage and clavicle region cause local minima for most minimization approaches. A lung segmentation method using the Marker Controlled Watershed Algorithm is proposed to overcome the limitations of existing lung segmentation methods. This study would be significant in the following ways:
1. Enhanced Accuracy,
2. Less Time Consuming, and
3. Lower Computational cost
1.3 Scopes and objectives of the thesis
Various medical imaging techniques, such as magnetic resonance imaging (MRI) and computed tomography (CT), provide different perspectives on the human lung. Many segmentation techniques, such as mean shift, region growing, watershed and fuzzy connectivity, are available for medical imaging and especially for lung CT. Marker Controlled Watershed is one of the leading segmentation techniques for efficiently solving a wide variety of lung tumor identification problems. Energy minimization along with image smoothing is performed with the filtering method. The aim of this thesis is to perform lung CT image segmentation by applying the marker controlled algorithm with a thresholding technique. In this regard, MATLAB simulations with the mentioned algorithm will be conducted to implement the thresholding technique.
1.4 Outline
The thesis is organized as follows. Chapter 1 provides the introductory part and background information on the thesis topic as well as identifying the research scope and goals. A literature review of various segmentation techniques and different kinds of algorithms is given in Chapter 2. Theoretical analysis of the thresholding technique, the problem formulation and the methodology are presented in Chapter 3, which is the foundation of this thesis work. Simulation results obtained by applying the algorithm in MATLAB are presented in Chapter 4. Detailed discussion and quantitative evaluation of the results are explored in Chapter 5. Finally, conclusions are drawn in Chapter 6, along with recommendations for future research work.
2. Literature Survey
All researchers aim to develop a system that can predict, detect and segment tumors in their early stages, and try to improve the accuracy of early detection systems through preprocessing, segmentation, feature extraction and classification of the collected databases. The major contributions of the research are summarized below.
Paper 1
In this paper, Jia Tong explained general pre-processing and enhancement techniques and used a rule-based classification approach for the extraction of multiple nodules. The techniques employed are adaptive thresholding and a region growing operation [2].
Paper 2
In this paper, Anjali Kulkarni explained and classified several segmentation techniques. Area of interest, classification, size and shape of the nodule, perimeter and eccentricity are some of the features considered for extraction [1].
Paper 3
In this paper, one more method for automatic detection of lung nodules is explained by Mokhled S. For enhancement, the Gabor filter gives better results, and for segmentation a controlled watershed segmentation approach is used to separate the regions.
Methodology
We propose an effective image processing methodology to detect tumors in the lungs. In this approach we apply a series of operations: first the image is enhanced, then it is filtered and thresholded, and finally a segmentation algorithm extracts the desired part of the image, in this case the tumor to be detected.
4. Database
The first stage is to use lung CT images of cancer patients. CT images have low noise compared to X-ray and MRI images. The CT images are taken from the LIDC/IDRI collection of the Lung Image Database Consortium and are in the DICOM (Digital Imaging and Communications in Medicine) format. In this work we include lung CT images of 5 patients from the LIDC database. Figure 2 shows a sample CT image of one lung cancer patient; the image has two lung nodules, with one tumor present in the left lung.
5. Image Enhancement
The image enhancement problem can be formulated as follows: given a low-quality input image, produce a high-quality output image for a specific application. Image enhancement is an active topic in medical imaging that has received much attention in recent years. The aim is to improve the visual appearance of the image, or to provide a "better" transform representation for subsequent automated image processing, such as analysis, detection, segmentation and recognition. Moreover, it helps in analysing background information that is essential to understand object behaviour without requiring expensive human visual inspection. Enhancing low-quality images is challenging for several reasons: due to low contrast, objects cannot be clearly extracted from a dark background, and most colour-based methods will fail when the colour of the objects and that of the background are similar. Existing image enhancement techniques can be classified into two broad categories: spatial domain image enhancement and frequency domain image enhancement. Spatial domain image enhancement operates directly on pixels. Its main advantages are that the techniques are conceptually simple and of low complexity, which favours real-time implementations; however, they generally lack adequate robustness and imperceptibility. Frequency domain image enhancement analyses the image with respect to frequency and operates directly on the transform coefficients of the image, such as the Fourier transform, discrete wavelet transform (DWT) and discrete cosine transform (DCT). The basic idea is to enhance the image by manipulating these transform coefficients.
The advantages of frequency-based image enhancement include low computational complexity, ease of viewing and manipulating the frequency composition of the image, and easy applicability of special transform-domain properties. Its basic limitations are that it cannot enhance all parts of an image equally well, and that it is difficult to automate the enhancement procedure. Spatial domain methods can in turn be classified into two broad categories: point processing operations and spatial filter operations. Traditional methods of image enhancement enhance the low-quality image itself without embedding any high-quality background information. The reason is that in a dark image some areas are so dark that all the information is already lost in those regions; no amount of illumination enhancement will bring back lost information. Frequency domain methods can be classified into three categories: image smoothing, image sharpening, and periodic noise reduction by frequency domain filtering. Here we focus on spatial domain enhancement techniques.
SPATIAL DOMAIN METHODS
Spatial domain techniques deal directly with the image pixels, whose values are manipulated to achieve the desired enhancement. Techniques such as logarithmic transforms, power-law transforms and histogram equalization are based on the direct manipulation of the pixels in the image. Spatial techniques are particularly useful for directly altering the gray level values of individual pixels and hence the overall contrast of the entire image. However, they usually enhance the whole image uniformly, which in many cases produces undesirable results; it is not possible to selectively enhance edges or other required information effectively. Techniques like histogram equalization are nevertheless effective for many images. The approaches can be classified into two categories: point processing operations (intensity transformation functions) and spatial filter operations. An overview of some well-known methods is given here. Point processing is the simplest spatial domain operation, as it is performed on single pixels only: each pixel value of the processed image depends on the corresponding pixel of the original image. It can be written as g(x, y) = T[f(x, y)], where T is a gray level transformation. Point processing approaches can be classified into four categories. The first is the image negative, in which the gray level values of the pixels are inverted. For an 8-bit digital image of size M x N, each pixel value is subtracted from 255, i.e. g(x, y) = 255 - f(x, y) for 0 ≤ x < M and 0 ≤ y < N; in a normalized gray scale, s = 1.0 - r. Negative images are useful for enhancing white or gray detail embedded in dark regions of an image.
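The image negative g(x, y) = 255 - f(x, y) can be sketched in a few lines of pure Python; this is an illustrative sketch (the thesis itself uses MATLAB), with the image represented as a list of rows of 8-bit values:

```python
# Image negative: invert each pixel of an 8-bit grayscale image,
# g(x, y) = 255 - f(x, y).

def negative(image):
    """Return the negative of an 8-bit grayscale image (list of rows)."""
    return [[255 - p for p in row] for row in image]

f = [[0, 64, 128],
     [192, 255, 10]]
print(negative(f))  # [[255, 191, 127], [63, 0, 245]]
```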
Another technique is the thresholding transformation. Let r_th be a threshold value in f(x, y); in a normalized gray scale, thresholding maps every pixel with r ≥ r_th to 1 and every other pixel to 0. Since the pixel values of the thresholded image are either 0 or 1, g(x, y) is also called a binary image. Thresholding is particularly useful in image segmentation to isolate an object of interest from the background; the lung can be isolated from the background in a binary image, as shown in Fig. 2.
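The thresholding transformation above amounts to one comparison per pixel; a minimal pure-Python sketch (illustrative only, not the thesis's MATLAB implementation):

```python
# Thresholding transformation: pixels at or above the threshold r_th map
# to 1 (object), the rest to 0 (background), producing a binary image.

def threshold(image, r_th):
    return [[1 if p >= r_th else 0 for p in row] for row in image]

f = [[10, 200], [130, 90]]
print(threshold(f, 128))  # [[0, 1], [1, 0]]
```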
The next kind of transformation is the log transformation, which maps a narrow range of low gray levels into a wider range of gray levels, i.e. it expands the values of dark pixels and compresses the values of bright pixels. If C is a scaling factor, the log transformation is s = C log(1 + |r|).
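The log transformation s = C log(1 + |r|) can be sketched as follows; choosing C = 255 / log(256) (an assumption for illustration) keeps an 8-bit input range mapped back onto 0..255:

```python
import math

# Log transformation: s = C * log(1 + |r|), expanding dark values and
# compressing bright ones. C is chosen so the output still spans 0..255.

def log_transform(image, max_val=255):
    C = max_val / math.log(1 + max_val)
    return [[C * math.log(1 + abs(p)) for p in row] for row in image]

row = log_transform([[0, 1, 255]])[0]
print([round(v) for v in row])  # [0, 32, 255]
```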
The next transformation, histogram equalization, is used for contrast adjustment using the image histogram. When the region of interest (ROI) is represented by close contrast values, histogram equalization enhances the image by increasing the global contrast: the intensities become well spread over the histogram, and a low-contrast region is converted into a region of higher contrast. This is achieved by taking the more frequently occurring intensity values and spreading them along the histogram. Histogram equalization plays a major role in images where both the ROI and the rest of the image are either darker or brighter. Its advantage is that it works well with images of high color depth, such as 16-bit gray-scale images or continuous data. The technique is widely used for images that are over-exposed or under-exposed, scientific images such as X-ray images in medical diagnosis, remote sensing images, and thermal images. It also has its own defects, such as unrealistic illusions in photographs and undesirable effects on low color depth images.
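Histogram equalization maps each gray level through the scaled cumulative distribution function (CDF) of the image histogram. A minimal pure-Python sketch for 8-bit images (illustrative only):

```python
# Histogram equalization: map each gray level through the (scaled)
# cumulative distribution function of the image histogram.

def equalize(image, levels=256):
    pixels = [p for row in image for p in row]
    n = len(pixels)
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution, scaled back onto 0..levels-1 as a lookup table.
    cdf, total = [], 0
    for h in hist:
        total += h
        cdf.append(total)
    lut = [round((c * (levels - 1)) / n) for c in cdf]
    return [[lut[p] for p in row] for row in image]

f = [[52, 55, 61], [59, 79, 61]]
print(equalize(f))  # intensities spread across the full 0..255 range
```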
Noise models
The main sources of noise in digital images are image acquisition (digitization) and image transmission. The performance of an image sensor is affected by a variety of factors, such as environmental conditions during image acquisition and the quality of the sensing elements themselves. For instance, when acquiring images with a CCD camera, the sensor temperature and the light level are major factors affecting the amount of noise in the resulting image. Images are also corrupted during transmission, principally because of interference in the channel used [3]. A noisy image can be modeled as C(x, y) = A(x, y) + B(x, y), where A(x, y) is the original image pixel value, B(x, y) is the noise, and C(x, y) is the resulting noisy image.
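The additive model C(x, y) = A(x, y) + B(x, y) can be sketched directly, with the sum clipped back to the 8-bit range (a minimal illustrative sketch; the Gaussian noise field here is just one example of B):

```python
import random

# Additive noise model C(x, y) = A(x, y) + B(x, y): the observed image is
# the original plus a noise field, clipped back to the 8-bit range.

def add_noise(image, noise):
    return [[max(0, min(255, a + b)) for a, b in zip(ra, rb)]
            for ra, rb in zip(image, noise)]

random.seed(0)
A = [[100, 150], [200, 50]]
B = [[round(random.gauss(0, 10)) for _ in row] for row in A]  # Gaussian noise field
C = add_noise(A, B)
print(C)
```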
Gaussian Noise or Amplifier Noise:
This noise has the probability density function (pdf) of the normal (Gaussian) distribution. It forms a major part of the read noise of an image sensor, i.e. the constant level of noise in the dark areas of the image.
Uniform Noise:
Uniform noise caused by quantizing the pixels of an image to a number of discrete levels is known as quantization noise; it has an approximately uniform distribution, with the gray values of the noise uniformly distributed across a specified range. Uniform noise can be used to generate many other noise distributions, and it is often used to degrade images for the evaluation of image restoration algorithms, since it provides the most neutral, unbiased noise.
Salt and Pepper Noise:
Salt-and-pepper noise, also called shot noise, impulse noise or spike noise, is usually caused by faulty memory locations, malfunctioning pixel elements in the camera sensor, or timing errors in the digitization process. In salt-and-pepper noise only two values, a and b, are possible, and the probability of each is typically less than 0.2; if the probabilities are greater than this, the noise will swamp the image. For an 8-bit image, the typical value is 255 for salt noise and 0 for pepper noise. Reasons for salt-and-pepper noise:
a. By memory cell failure.
b. By malfunctioning of camera’s sensor cells.
c. By synchronization errors in image digitizing or transmission.
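Salt-and-pepper corruption as described above can be simulated as follows (an illustrative pure-Python sketch; the corruption probability `p` is a parameter introduced here for the example):

```python
import random

# Salt-and-pepper noise: with probability p each pixel is replaced by
# 255 (salt) or 0 (pepper); all other pixels are left untouched.

def salt_and_pepper(image, p=0.05, rng=random):
    out = []
    for row in image:
        new_row = []
        for pixel in row:
            r = rng.random()
            if r < p / 2:
                new_row.append(0)        # pepper
            elif r < p:
                new_row.append(255)      # salt
            else:
                new_row.append(pixel)
        out.append(new_row)
    return out

random.seed(1)
noisy = salt_and_pepper([[128] * 10 for _ in range(10)], p=0.1)
```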
Rayleigh Noise: Radar range and velocity images typically contain noise that can be modeled by the Rayleigh distribution.
Gamma Noise:
This noise can be obtained by low-pass filtering of laser-based images.
Filtering
Filtering is a basic image processing function used for many tasks such as noise reduction, interpolation and re-sampling. Filtering image data is a standard process used in almost all image processing systems; the choice of filter is determined by the nature of the task and by the behaviour and type of the data. Filters that remove noise from a digital image while preserving its details are an essential part of image processing. Filters can be described in different categories:
Filtering without detection: a window mask is moved across the observed image. The mask is usually of size (2N+1) x (2N+1), where N is any positive integer, and its centre element is the pixel of concern. As the mask moves from the top-left corner to the bottom-right corner of the image, it performs arithmetic operations without discriminating between the pixels of the image.
Detection followed by filtering: this involves two steps. The first step identifies the noisy pixels of the image; the second step filters those pixels. Here also a mask is moved across the image and arithmetic operations are performed to detect the noisy pixels. The filtering operation is then performed only on those pixels found to be noisy in the first step, keeping the non-noisy pixels intact.
Hybrid filtering: two or more filters are used to filter a corrupted location of a noisy image. The decision to apply a particular filter is based on the noise level at the test pixel location and on the performance of the filter on the filtering mask.
Linear Filters: Linear filters are used to remove certain types of noise; Gaussian or averaging filters are suitable for this purpose. These filters also tend to blur sharp edges, destroy lines and other fine details of the image, and perform badly in the presence of signal-dependent noise [4].
Non-Linear Filters: In recent years, a variety of non-linear median-type filters, such as rank conditioned, weighted median, relaxed median and rank selection filters, have been developed to overcome the shortcomings of linear filters. Some common linear and non-linear filters are described below. Mean Filter: The mean filter is a simple spatial sliding-window filter that replaces the centre value in the window with the average (mean) of all the pixel values in the kernel or window. The window is usually square but can be of any shape.
Advantage:
a. Easy to implement
b. Used to remove the impulse noise
Disadvantage:
It does not preserve image details; some details of the image are removed when the mean filter is used.
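The mean filter's sliding-window averaging can be sketched for a 3x3 window as below (an illustrative pure-Python sketch; border pixels are simply left unchanged here, which is one of several possible border policies):

```python
# 3x3 mean (averaging) filter: slide a window over the image and replace
# the centre pixel with the average of its neighbourhood. Border pixels
# are left unchanged in this simple sketch.

def mean_filter(image):
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            s = sum(image[y + dy][x + dx]
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = s // 9   # integer average of the 9 window values
    return out

f = [[10, 10, 10],
     [10, 100, 10],
     [10, 10, 10]]
print(mean_filter(f)[1][1])  # 20 -- the impulse is smeared, not removed
```

Note how the outlier 100 is only attenuated to 20 rather than eliminated, which illustrates why the mean filter blurs details.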
Median Filter: The median filter is a simple and powerful non-linear filter based on order statistics and is an easy-to-implement method of smoothing images. It is used to reduce the amount of intensity variation between a pixel and its neighbours. Instead of replacing a pixel value with the mean of all neighbouring pixel values, the median filter replaces it with the median: all the pixel values in the window are first sorted into ascending order, and the pixel being processed is replaced with the middle value. If the neighbourhood under consideration contains an even number of pixels, the average of the two middle pixel values is used. The median filter gives the best results when the impulse noise percentage is less than 0.1%; as the amount of impulse noise increases, its performance deteriorates.
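The sort-and-pick-the-middle step described above can be sketched for a 3x3 window as follows (illustrative pure-Python sketch, same unchanged-border policy as before):

```python
# 3x3 median filter: replace the centre pixel with the median of its
# neighbourhood; isolated impulse values are discarded entirely.

def median_filter(image):
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = sorted(image[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = window[4]  # middle of the 9 sorted values
    return out

f = [[10, 10, 10],
     [10, 255, 10],   # a single salt pixel
     [10, 10, 10]]
print(median_filter(f)[1][1])  # 10 -- the impulse is removed completely
```

Unlike the mean filter, the impulse value never influences the output as long as it is not the median of its window, which is why the median filter handles salt-and-pepper noise so well.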
FREQUENCY DOMAIN TECHNIQUES
Frequency domain techniques are based on manipulating an orthogonal transform of the image rather than the image itself, and are suited to processing the image according to its frequency content. The principle behind frequency domain image enhancement is to compute a 2-D discrete unitary transform of the image, for instance the 2-D DFT, manipulate the transform coefficients with an operator M, and then perform the inverse transform. The orthogonal transform of the image has two components, magnitude and phase: the magnitude carries the frequency content of the image, while the phase is needed to restore the image to the spatial domain. The usual orthogonal transforms are the discrete cosine transform, the discrete Fourier transform, the Hartley transform, etc. The transform domain enables operations on the frequency content of the image, so high-frequency content such as edges and other subtle information can easily be enhanced. Edges and sharp transitions (e.g. noise) in an image contribute significantly to the high-frequency content of its Fourier transform, while low-frequency content is responsible for the general appearance of the image over smooth areas. The concept of filtering is easier to visualize in the frequency domain; therefore, enhancement of an image f(x, y) can be performed in the frequency domain based on the DFT. This is particularly useful when the spatial extent of the point spread sequence h(x, y) is large, by the convolution theorem: g(x, y) = h(x, y) * f(x, y), where g(x, y) is the enhanced image and * denotes convolution.
Medical imaging uses this for reducing noise and sharpening details to improve the visual representation of the image [4].
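The transform-manipulate-invert cycle is easiest to see in one dimension; the sketch below low-pass filters a 1-D signal by zeroing high-frequency DFT coefficients (an illustrative pure-Python sketch; the same idea extends to the 2-D DFT of an image):

```python
import cmath

# Frequency-domain filtering: take the DFT, zero the high-frequency
# coefficients, and invert.

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N)
                for k in range(N)).real / N for n in range(N)]

def lowpass(x, keep):
    X = dft(x)
    N = len(X)
    # Zero every coefficient whose frequency index exceeds `keep`,
    # preserving the symmetric negative frequencies as well.
    X = [X[k] if k <= keep or k >= N - keep else 0 for k in range(N)]
    return idft(X)

signal = [0, 1, 0, 1, 0, 1, 0, 1]      # pure high-frequency alternation
smooth = lowpass(signal, keep=1)
print([round(v, 3) for v in smooth])   # only the DC level 0.5 survives
```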
6. Image Segmentation
Image segmentation refers to the process of partitioning a digital image into multiple segments, i.e. sets of pixels such that the pixels in a region are similar according to some homogeneity criterion such as colour, intensity or texture, so as to locate and identify objects and boundaries in an image [1]. Practical applications of image segmentation range from filtering of noisy images and medical applications (locating tumors and other pathologies, measuring tissue volumes, computer-guided surgery, diagnosis, treatment planning, study of anatomical structure) to locating objects in satellite images (roads, forests, etc.), face recognition and fingerprint recognition. Many segmentation methods have been proposed in the literature; the choice of one technique over another, and the level of segmentation, are decided by the particular type of image and the characteristics of the problem being considered. In medical imaging, segmentation is an important analysis function for which many algorithms and methods have been developed. The variability of data is quite high in medical image processing, especially when analysing anatomical structures and tissue types; hence segmentation techniques that provide flexibility, accuracy and convenient automation are of paramount importance.
The role of segmentation is to subdivide the objects in an image; in case of medical image segmentation the aim is to:
• Study anatomical structure
• Identify Region of Interest i.e. locate tumor, lesion and other abnormalities
• Measure tissue volume to measure growth of tumor (also decrease in size of tumor with treatment)
• Help in treatment planning prior to radiation therapy; in radiation dose calculation
Automatic segmentation of medical images is a difficult task, as medical images are complex in nature and rarely have any simple linear feature. Further, the output of a segmentation algorithm is affected by:
• partial volume effect.
• intensity inhomogeneity
• presence of artifacts
• closeness in gray level of different soft tissue
In recent years, much research has been done in the field of image segmentation. There are currently thousands of algorithms, each doing the segmentation slightly differently, but there is still no single algorithm that is applicable to all types of digital image and fulfils every objective; an algorithm developed for one group of images may not apply to images of another class. Current image segmentation approaches, based on two properties of an image, are divided into two categories:
• Discontinuities based
In this category, subdivision of an image is carried out on the basis of abrupt changes in the intensity of the grey levels of the image. The focus is primarily on the identification of isolated points, lines and edges. This includes image segmentation algorithms such as edge detection.
• Similarities based
In this category, subdivision of an image is carried out on the basis of similarities in the intensity or grey levels of the image. The focus here is on the identification of similar points, lines and edges. This includes image segmentation algorithms such as thresholding, region growing, and region splitting and merging.
A. Segmentation Based on Edge Detection
This method attempts to solve image segmentation by detecting the edges, i.e. the pixels between different regions where the intensity changes rapidly; these pixels are extracted [1, 5] and linked to form closed object boundaries. The result is a binary image [2]. In theory there are two main edge-based segmentation methods: the gray histogram method and the gradient-based method [4].
1. Gray Histogram Technique
The result of an edge detection technique depends mainly on the selection of the threshold T, and it is difficult to search for the maximum and minimum gray level intensities because the gray histogram is uneven due to the impact of noise. We therefore approximate the curves of the object and the background with two conic Gaussian curves [4], whose intersection defines the valley of the histogram; the threshold T is the gray value at this intersection point.
2. Gradient Based Method
The gradient is the first derivative of the image f(x, y). When there is an abrupt change in intensity near an edge and there is little image noise, the gradient-based method works well. The method involves convolving gradient operators with the image; a high gradient magnitude marks a possible place of rapid transition between two different regions. These edge pixels then have to be linked to form closed region boundaries. Common edge detection operators used in the gradient-based method are the Sobel operator, Canny operator, Laplace operator, Laplacian of Gaussian (LoG) operator and so on; Canny is the most promising one, but it takes more time than the Sobel operator. In practice, edge detection methods require a balance between detection accuracy and noise immunity: if the detection accuracy is too high, noise may bring in fake edges, making the outline of the image unreasonable, and if the noise immunity is excessive [5], some parts of the image outline may go undetected and the position of objects may be mistaken. Thus, edge detection algorithms are suitable for simple, noise-free images, and often produce missing edges or extra edges on complex and noisy images [5].
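The convolve-and-combine step of the gradient-based method can be sketched with the Sobel kernels (an illustrative pure-Python sketch; interior pixels only, with the border left at zero):

```python
import math

# Sobel gradient magnitude: convolve with the horizontal and vertical
# Sobel kernels and combine; large values mark candidate edge pixels.

GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(image):
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(GX[dy + 1][dx + 1] * image[y + dy][x + dx]
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            gy = sum(GY[dy + 1][dx + 1] * image[y + dy][x + dx]
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = math.hypot(gx, gy)
    return out

# A vertical step edge: the gradient magnitude peaks along the step.
f = [[0, 0, 255, 255]] * 4
print(sobel_magnitude(f)[1][1])  # 1020.0
```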
B. Thresholding Method
Image segmentation by thresholding is a simple but powerful approach for segmenting images having light objects on a dark background [1]. The thresholding technique is based on image-space regions, i.e. on characteristics of the image. The thresholding operation converts a multilevel image into a binary image: it chooses a proper threshold T to divide the image pixels into several regions and separate objects from the background. Any pixel (x, y) is considered part of an object if its intensity is greater than or equal to the threshold value, i.e. f(x, y) ≥ T; otherwise the pixel belongs to the background. Depending on how the threshold is selected, two types of thresholding exist: global and local. When T is constant, the approach is called global thresholding; otherwise it is called local thresholding. Global thresholding methods can fail when the background illumination is uneven; in local thresholding, multiple thresholds are used to compensate for uneven illumination [8]. Threshold selection is typically done interactively; however, it is possible to derive automatic threshold selection algorithms. Limitations of the thresholding method are that only two classes are generated and that it cannot be applied to multichannel images. In addition, thresholding does not take into account the spatial characteristics of an image, which makes it sensitive to noise, as such artifacts corrupt the histogram of the image and make separation more difficult.
Otsu's thresholding method iterates through all possible threshold values and calculates a measure of spread for the pixel levels on each side of the threshold, i.e., the pixels that fall in the foreground or the background. The aim is to find the threshold value for which the sum of the foreground and background spreads (the within-class variance) is at its minimum.
C. Region Based Segmentation Methods
Compared to edge detection methods, segmentation algorithms based on regions are relatively simple and more immune to noise. Edge-based methods partition an image based on rapid changes in intensity near edges, whereas region-based methods partition an image into regions that are similar according to a set of predefined criteria. Region-based segmentation mainly includes the following methods:
1. Region Growing
Region growing is a procedure that groups pixels of the whole image into sub-regions or larger regions based on a predefined criterion. Region growing can be processed in four steps:
(i) Select a group of seed pixels in the original image.
(ii) Select a set of similarity criteria, such as grey-level intensity or colour, and set up a stopping rule.
(iii) Grow regions by appending to each seed those neighbouring pixels that have predefined properties similar to the seed pixels.
(iv) Stop region growing when no more pixels meet the criterion for inclusion in that region (the criterion may involve region size, likeness between a candidate pixel and the pixels grown so far, or the shape of the region being grown).
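The four steps above can be sketched as a breadth-first search in stdlib Python; the seed, tolerance, and toy image below are illustrative choices, and the similarity criterion used is a simple intensity difference from the seed:

```python
from collections import deque

def region_grow(img, seed, tol=10):
    """Grow a region from `seed` (row, col), absorbing 4-connected neighbours
    whose intensity differs from the seed pixel by at most `tol`."""
    h, w = len(img), len(img[0])
    sr, sc = seed
    seed_val = img[sr][sc]
    region = {seed}
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and (nr, nc) not in region:
                if abs(img[nr][nc] - seed_val) <= tol:   # similarity criterion
                    region.add((nr, nc))
                    frontier.append((nr, nc))
    return region   # growing stops once no neighbour meets the criterion

# Toy image: a bright 2x2 blob on a dark background
img = [[10, 10, 10, 10],
       [10, 200, 205, 10],
       [10, 198, 202, 10],
       [10, 10, 10, 10]]
blob = region_grow(img, (1, 1), tol=10)
# blob contains exactly the four bright pixels
```

Comparing against the seed value keeps the sketch simple; comparing against the running region mean, as some variants do, changes only the condition inside the loop.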
2. Region Splitting and Merging
Rather than choosing seed points, one can divide an image into a set of arbitrary unconnected regions and then split and merge the regions [2, 4] in an attempt to satisfy the conditions of a reasonable segmentation. Region splitting and merging is usually implemented using a quadtree data structure. Let R represent the entire image region and select a predicate Q:
(i) Start with the entire image. If Q(R) = FALSE [1], divide the image into quadrants; if Q is false for any quadrant, i.e., Q(Ri) = FALSE, subdivide that quadrant into sub-quadrants, and so on, until no further splitting is possible.
(ii) If only splitting is used, the final partition may contain adjacent regions with identical properties. This drawback can be remedied by allowing merging as well as splitting, i.e., merging any adjacent regions Rj and Rk for which Q(Rj ∪ Rk) = TRUE.
(iii) Stop when no further merging is possible.
Region splitting and merging algorithm:
1. Splitting step:
We choose the criterion for splitting the image based on the quadtree, determining the number of splitting levels progressively as we go.
2. Merging step:
If the adjacent regions satisfy the similarity properties, we will merge them.
Repeat step 2 until no further changes occur.
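A simplified stdlib Python sketch of the two steps follows. The predicate Q is taken to be "intensity range within the block is small", and the merge step here only compares block means without checking adjacency, which a full implementation would do; the 4×4 image and both tolerances are made-up examples:

```python
def split(img, r0, c0, size, max_range=20):
    """Recursively quad-split the square block at (r0, c0) until each leaf
    satisfies the predicate Q: max intensity - min intensity <= max_range."""
    vals = [img[r][c] for r in range(r0, r0 + size) for c in range(c0, c0 + size)]
    if max(vals) - min(vals) <= max_range or size == 1:
        return [(r0, c0, size)]               # Q(R) is TRUE: keep the block whole
    half = size // 2
    leaves = []
    for dr in (0, half):                      # recurse into the four quadrants
        for dc in (0, half):
            leaves += split(img, r0 + dr, c0 + dc, half, max_range)
    return leaves

def merge(img, leaves, max_diff=20):
    """Greedily group leaves whose mean intensities differ by <= max_diff.
    (Simplification: true split-and-merge also requires the regions to be adjacent.)"""
    def mean(block):
        r0, c0, s = block
        return sum(img[r][c] for r in range(r0, r0 + s)
                   for c in range(c0, c0 + s)) / (s * s)
    groups = []
    for b in leaves:
        for g in groups:
            if abs(mean(b) - mean(g[0])) <= max_diff:
                g.append(b)
                break
        else:
            groups.append([b])
    return groups

# 4x4 toy image: bright top-left quadrant, dark elsewhere
img = [[200, 200, 10, 10],
       [200, 200, 10, 10],
       [10, 10, 10, 10],
       [10, 10, 10, 10]]
leaves = split(img, 0, 0, 4)   # whole image fails Q, so it splits into 4 quadrants
groups = merge(img, leaves)    # the three dark quadrants merge into one group
```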
D. Theory-based Segmentation
This type of image segmentation algorithm draws on derivations from different fields and is very important for the segmentation approach. Such algorithms include genetic algorithms, wavelet-based algorithms, fuzzy-based algorithms, neural-network-based algorithms, clustering-based algorithms, and so on [10].
1) Clustering Techniques
Clustering is an unsupervised learning task in which one needs to identify a finite set of categories, known as clusters, to classify pixels [11]. A similarity criterion is defined between pixels, and similar pixels are then grouped together to form clusters. Similarity criteria include attributes of an image such as size, colour, and texture. The quality of a cluster depends both on the quality of the similarity criterion used and on how it is implemented. Clustering methods are broadly classified into hard clustering and fuzzy clustering.
a) Hard Clustering
Hard clustering assumes that a pixel can belong to only a single cluster and that sharp boundaries exist between clusters. One of the most popular and widely used hard clustering algorithms is the K-means clustering algorithm [11]. K-means clustering groups the n pixels of an image into K clusters, where K < n and K is a positive integer. Initially, the centroids of the predefined clusters are initialized randomly. Clusters are formed on the basis of similarity features such as the grey-level intensity of pixels and the distance between pixel intensities. The process is as follows:
(i) Choose the number of clusters K.
(ii) Randomly choose K pixels of different intensities as centroids.
(iii) Centroids are computed as the mean of the pixel values in a region; place the centroids as far from each other as possible.
(iv) Compare each pixel to every centroid and assign the pixel to the closest centroid to form a cluster.
When all the pixels have been assigned, the initial clustering is complete.
(v) Recalculate the mean of each cluster and recompute the positions of the centroids of the K clusters.
(vi) Repeat steps (iv) and (v) until the centroids no longer move.
b) Fuzzy Clustering
Fuzzy clustering can be used in situations where there are no well-defined boundaries between the objects in an image. It divides the input pixels into clusters or groups on the basis of some similarity criterion, such as distance, connectivity, or intensity. Fuzzy clustering algorithms include FCM (Fuzzy C-Means), GK (Gustafson-Kessel), GMD (Gaussian Mixture Decomposition), FCV (Fuzzy C-Varieties), etc. The Fuzzy C-Means algorithm [12] is the most widely accepted, since it preserves much more information than other approaches. In this technique, a dataset is grouped into N clusters, with every data point in the dataset belonging to every cluster to a certain degree.
2) Neural Network-based Segmentation
In this algorithm, an image is first mapped onto a neural network in which every neuron represents a pixel [3] [7]. The neural network is trained with a training sample set in order to determine the connections and weights between nodes. New images are then segmented with the trained neural network. Neural network segmentation includes two important steps:
(i) Feature extraction: this step determines the input data of the neural network; important features that will help in image segmentation are extracted from the images.
(ii) Image segmentation: in this step, the image is segmented based on the features extracted from the images.
Neural-network-based segmentation has three basic characteristics:
(i) Fast, highly parallel computing ability makes it suitable for real-time applications.
(ii) It can improve segmentation results when the data deviate from the normal situation.
(iii) High robustness makes it immune to noise.
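Steps (i)-(vi) of K-means can be sketched for scalar pixel intensities in stdlib Python; the intensity list, cluster count, and fixed random seed below are illustrative:

```python
import random

def kmeans_intensity(pixels, k, iters=20, seed=0):
    """Cluster scalar pixel intensities into k groups with the classic
    assign-to-nearest-centroid / recompute-mean loop."""
    rng = random.Random(seed)
    centroids = rng.sample(sorted(set(pixels)), k)     # k distinct initial centres
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in pixels:                               # assignment step (iv)
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        new = [sum(c) / len(c) if c else centroids[i]  # update step (v)
               for i, c in enumerate(clusters)]
        if new == centroids:                           # step (vi): centroids stopped moving
            break
        centroids = new
    return sorted(centroids)

# Intensities drawn from two well-separated toy groups
pixels = [10, 12, 11, 9, 200, 202, 198, 201]
centres = kmeans_intensity(pixels, k=2)
# centres settle on the means of the dark and bright groups
```

For real images the same loop runs over feature vectors (e.g. intensity plus position) with a Euclidean distance instead of the scalar difference used here.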
E. Marker-Controlled Watershed Segmentation
Marker-based watershed segmentation uses markers. A marker is a connected component belonging to an image. The markers include internal markers, associated with the objects of interest, and external markers, associated with the background. Separating touching objects in an image is one of the more difficult image processing operations, and the watershed transform is often applied to this problem. Marker-based watershed segmentation can extract unique boundaries from an image; its strength is that it produces a unique solution for a particular image, and it also removes the over-segmentation problem of the plain watershed transform [10]. Generally, the watershed transform is computed on the gradient of the original image. It possesses a number of advantages: it is a simple, intuitive method; it is fast and can be parallelized; and it produces a complete division of the image into separated regions even if the contrast is poor. An important task was to identify which features of a DICOM image must be taken into consideration to successfully detect lung cancer [11].
Marker-controlled watershed algorithm:
1. Preprocess the original image with a smoothing filter; this minimizes the large number of small spatial details.
2. Define the criteria for the two kinds of markers: internal markers (objects of interest) and external markers (background).
7. Feature Extraction
This is an important stage in which algorithms transform the input data into a set of features; this process is called feature extraction. The input at this stage is a binary image, so the only colours present are black and white. Thus, only three features were considered for extraction: area, perimeter, and eccentricity. Figure 5 shows the output results. The features are defined as follows:
1) Area: A scalar value that gives the actual number of nodule pixels overall. It is obtained by summing the pixels registered as 1 in the binary image. Area is defined as:

Area = A = ∑i ∑j A(i, j),  X_ROI[A] = i,  Y_ROI[A] = j    (1)

where i, j are the pixels within the shape, ROI is the region of interest, X_ROI[] is a vector containing the ROI x-positions, and Y_ROI[] is a vector containing the ROI y-positions [1].
2) Perimeter: A scalar value that gives the actual number of pixels on the outline of the nodule. It is obtained by summing the interconnected outline pixels registered in the binary image. It is defined as:

Perimeter = P = ∑i ∑j P(i, j),  X_edge[P] = i,  Y_edge[P] = j    (2)

where X_edge[] and Y_edge[] are vectors representing the coordinates of the ith and jth pixels forming the curve, respectively [1].
3) Eccentricity: This metric, also called roundness, circularity, or the irregularity index (I), equals 1 only for a circle and is less than 1 for any other shape; the closer the value is to 1, the more circular the object [11]. It is defined as:

Eccentricity = (4 × π × Area) / Perimeter²    (3)
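The three features above can be sketched for a binary mask in stdlib Python. The toy 5×5 mask is made up, and counting an object pixel as part of the perimeter when any 4-neighbour is background is one simple discrete choice among several:

```python
import math

def nodule_features(mask):
    """Compute area (Eq. 1), perimeter (Eq. 2), and circularity (Eq. 3)
    for a binary mask given as a list of 0/1 rows."""
    h, w = len(mask), len(mask[0])
    area = sum(sum(row) for row in mask)          # Eq. (1): count of 1-pixels
    perimeter = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] == 1:
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    # boundary pixel: touches background or the image border
                    if not (0 <= nr < h and 0 <= nc < w) or mask[nr][nc] == 0:
                        perimeter += 1            # Eq. (2): count boundary pixels
                        break
    circ = 4 * math.pi * area / perimeter ** 2    # Eq. (3)
    return area, perimeter, circ

# A 3x3 square "nodule" inside a 5x5 binary image
mask = [[0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0]]
area, per, circ = nodule_features(mask)
# area = 9, perimeter = 8 boundary pixels
```

Note that for very small discrete masks this pixel-counting approximation of the perimeter can push the circularity value slightly above 1; the stated ≤ 1 behaviour holds as regions become larger and smoother.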
8. Software used
MATLAB (MATrix LABoratory) is a tool for performing numerical computations, displaying information graphically in 2D and 3D, and solving many other problems in engineering and science. Developed by MathWorks, MATLAB allows matrix manipulations, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs written in other languages, including C, C++, Java, and Fortran. MATLAB is an interpreted language for numerical computation: it allows one to perform numerical calculations and visualize the results without complicated and time-consuming programming, to solve problems accurately, to produce graphics easily, and to produce code efficiently. In this thesis, a program was developed in MATLAB to load the images, with links to all the algorithms via pushbuttons, pop-up menus, and sliders that change the values of the parameters relevant to the method concerned, such as the number of regions in region-based methods and the threshold value in threshold-based methods.