19-09-2012, 12:19 PM
An analysis on Beta Thalassemia major patients through the techniques of data clustering
Abstract:
In data mining, cluster analysis is a technique for grouping data into related components based on similarity metrics. The integration of fuzzy logic with data mining techniques has become one of the key constituents of soft computing. The k-means algorithm is a widely used method for clustering crisp data. In a traditional clustering algorithm, each object is assigned to exactly one cluster. This is valid as long as the clusters are disjoint and well separated. If the clusters touch or overlap, however, one object can belong to more than one cluster, and this is where fuzzy clustering comes in. In this paper, the grouping of beta thalassemia major patients is taken as a case study. Thalassemia can lead to severe transfusion-dependent anaemia, and it is among the most common genetic disorders worldwide, especially in countries of the Middle East. The fuzzy c-means algorithm is applied to cluster the database, and the results are discussed in this paper.
INTRODUCTION
Applications of data mining have become increasingly common in both the private and public sectors. The medical community sometimes uses data mining to help predict the effectiveness of a procedure or medicine [1]. Data mining can be performed on different types of databases and data repositories. Data mining functionalities such as classification, prediction and cluster analysis are used to find the different kinds of patterns that result from a data mining technique.
This paper is organized as follows. Section 2 gives a brief overview of clustering and compares the k-means and fuzzy c-means algorithms. Section 3 explains beta thalassemia major. Finally, Section 4 discusses the study of the beta thalassemia database after applying the fuzzy c-means clustering algorithm.
Clustering:
Clustering is the process of grouping data objects into clusters so that objects in the same cluster are similar to one another and objects belonging to different clusters are dissimilar [2]. A clustering algorithm aims to learn as much as possible about the data.
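To make the contrast between crisp and fuzzy clustering concrete, here is a minimal sketch of the standard fuzzy c-means iteration in Python with NumPy. It is not the MATLAB code used in this study: the function name fuzzy_c_means, the fuzzifier m = 2, the stopping tolerance and the toy values are all assumptions chosen for illustration.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Standard fuzzy c-means sketch; returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Random initial memberships; each row is normalised to sum to 1.
    U = rng.random((n, n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        # Cluster centers are means weighted by membership**m.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Distance from every point to every center (floored to avoid 0-division).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)
        # Membership update: closer centers receive a larger share.
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Hypothetical 1-D values (NOT patient data): two loose groups plus one
# point lying halfway between them.
X = np.array([[1.0], [1.2], [0.8], [5.0], [5.2], [4.8], [3.0]])
centers, U = fuzzy_c_means(X, n_clusters=2)

# A crisp, k-means-style labelling can be recovered by taking the cluster
# with the largest membership, which discards the overlap information.
crisp_labels = U.argmax(axis=1)
```

Each row of U holds one object's degrees of membership in the clusters. The halfway point ends up with a noticeable share in both clusters, which is exactly the information a crisp assignment cannot express.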
Previous work
According to a study of consanguineous marriages in Morocco [13], the overall prevalence of disease arising from consanguineous marriages reached 66.22 percent, and 47 percent among non-consanguineous ones (Danilo, 2009). The prevalence of b-thalassemia major is especially high in countries where marriages between close relatives are common (Ghosh et al., 2008). Comparison of the prevalence of complications with other reports shows that delay of puberty is more frequent in Morocco than in other countries. The prevalence of diabetes, heart complications and hypoparathyroidism varies among countries (FIT study, Cyprus and Iran), but it is higher in Italy. The percentage reported here for heart complications (6 percent) is much lower than in other reports, including the study done in Italy (69 percent) (Gamberini et al., 2004). Investigation of deaths shows that one patient died of diabetes and another of heart failure. Death by heart failure was noted in a thalassemia patient
CLINICAL MANIFESTATION OF BETA THALASSEMIA
Clinical sequelae of thalassemia include delayed growth and development, bone deformity due to ectopic marrow expansion, osteopenia and, most importantly, iron overload. It is iron overload in the tissues that is eventually fatal in patients, with or without transfusion dependency, if it is not adequately treated with iron-chelating therapy. In the absence of chelating therapy, iron accumulates in and damages the heart, liver, endocrine glands and reproductive organs. The onset of puberty is delayed and growth is stunted.
DISCUSSION AND RESULTS
In all experiments we use MATLAB as the tool for computing clusters. The fact that the number of patients with thalassemia decreases beyond 15 years could be explained by deaths, mostly among children older than 15 years. This can be explained by the fact that children who are not transfused die before the age of 6 years, while those who are transfused but not chelated die before the age of 20. Clustering by number of patients shows that the 8 to 10 year age group is the most affected by this disease. The mean age is 10 ± 5 years. Beyond 15 years, the number of cases decreases (Figure 1; x-axis: number of patients, y-axis: age
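A summary of this kind (a mean age with its spread, and patient counts per age group) could be computed along the following lines. This is only a sketch in Python with NumPy, not the MATLAB analysis used here. The ages are invented for illustration and are not the study's data, and the age-group boundaries are likewise an assumption.

```python
import numpy as np

# Hypothetical ages in years, invented for illustration; NOT the study's data.
ages = np.array([3, 5, 6, 8, 8, 9, 9, 10, 10, 10, 12, 13, 15, 17, 19])

mean_age = ages.mean()
sd_age = ages.std(ddof=1)  # sample standard deviation

# Assumed age groups: [0, 5), [5, 8), [8, 11), [11, 15), [15, 20].
edges = np.array([0, 5, 8, 11, 15, 20])
counts, _ = np.histogram(ages, bins=edges)

print(f"mean age = {mean_age:.1f} +/- {sd_age:.1f} years")
for lo, hi, c in zip(edges[:-1], edges[1:], counts):
    print(f"{lo:2d} to {hi:2d} years: {c} patients")
```

With whole-year ages, the bin [8, 11) corresponds to the 8 to 10 year group, so the largest count in that bin is what "most affected age group" would mean in such a table.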
CONCLUSIONS
The beta thalassemia database gives a clear idea of the percentage of linear growth through analysis of the different variables. This analysis is done using clustering techniques. After the different variables are clustered with the FCM algorithm, the database helps to identify the percentage of linear growth in children with b-thalassemia. Thalassemia is a reality in our country. It must be treated as a public health problem, and a long-term policy will allow competent volunteers to reach their objectives.