K-means Clustering PPT

**study tips** · 15-02-2013, 09:48 AM

K-means Clustering

.ppt

K-means.ppt (Size: 677 KB / Downloads: 61)

Introduction

Partitioning Clustering Approach
a typical clustering analysis approach via partitioning data set iteratively
construct a partition of a data set to produce several non-empty clusters (usually, the number of clusters given in advance)
in principle, partitions achieved via minimising the sum of squared distance to its “representative object” in each cluster
Given a K, find a partition of K clusters to optimise the chosen partitioning criterion
global optimal: exhaustively enumerate all partitions
heuristic method: K-means algorithm
K-means algorithm (MacQueen’67): each cluster is represented by the centre of the cluster and the algorithm converges to stable centres of clusters.

K-mean Algorithm

Initialisation: set seed points
Assign each object to the cluster with the nearest seed point
Compute seed points as the centroids of the clusters of the current partition (the centroid is the centre, i.e., mean point, of the cluster)
Go back to Step 1), stop when no more new assignment

Exercise

For the medicine data set, use K-means with the Manhattan distance
metric for clustering analysis by setting K=2 and initialising seeds as
C1 = A and C2 = C. Answer three questions as follows:
How many steps are needed for convergence?
What are memberships of two clusters?
What are centroids of two clusters after convergence?

How K-mean partitions?

When K centroids are set/fixed,
they partition the whole space
into K subspaces constituting a partitioning.
Changing positions of centroids leads to a new partitioning.
A partitioning amounts to a

Relevant Issues

Efficient in computation
O(tKn), where n is number of objects, K is number of clusters, and t is number of iterations. Normally, K, t << n.
Local optimum
sensitive to initial seed points
converge to a local optimum that may be unwanted solution
Other problems
Need to specify K, the number of clusters, in advance
Unable to handle noisy data and outliers (K-Medoids algorithm)
Not suitable for discovering clusters with non-convex shapes
Applicable only when mean is defined, then what about categorical data? (K-mode algorithm)

Conclusion

K-means algorithm is a simple yet popular method for clustering analysis
Its performance is determined by initialisation and appropriate distance measure
There are several variants of K-means to overcome its weaknesses
K-Medoids: resistance to noise and/or outliers
K-Modes: extension to categorical data clustering analysis
CLARA: dealing with large data sets
Mixture models (EM algorithm): handling uncertainty of clusters

29-04-2013, 12:41 PM

K - means algorithm can't deal with non convex data sets, how to solve it?

**study tips** · 30-04-2013, 09:43 AM

To get full information or details of K-means Clustering please have a look on the pages

https://seminarproject.net/Thread-k-mean...tering-ppt

https://seminarproject.net/Thread-image-...clustering

http://seminarprojectsshowthread.php?mode=linear&tid=49286

if you again feel trouble on K-means Clustering please reply in that page and ask specific fields in K-means Clustering

Possibly Related Threads…
Thread		Author	Replies	Views	Last Post
	Human Computer Interface : Seminar Report and PPT	seminar post	1	1,337	22-09-2017, 11:23 AM Last Post: jaseela123
	4G Broadband : Seminar Report and PPT	study tips	1	1,261	22-09-2017, 11:19 AM Last Post: jaseela123
	Software Life-Cycle Models ppt	seminar flower	1	3,852	22-09-2017, 10:54 AM Last Post: jaseela123
	PPT ON LINUX	project girl	1	1,829	21-09-2017, 03:56 PM Last Post: jaseela123
	Public Key Infrastructure (Digital Certificates and Digital Signatures) PPT	project girl	1	2,364	21-09-2017, 01:18 PM Last Post: jaseela123
	Itanium Processor : Seminar Report and PPT	seminar projects maker	1	1,052	21-09-2017, 12:46 PM Last Post: jaseela123
	Design and Analysis Of Algorithms : Seminar Report and PPT	seminar projects maker	1	1,315	21-09-2017, 12:04 PM Last Post: jaseela123
	Ranked, Efficient and Secure Keyword search over encrypted cloud data PPT	seminar post	1	814	21-09-2017, 11:55 AM Last Post: jaseela123
	Biometric Authentication PPT	project girl	1	1,109	19-09-2017, 02:32 PM Last Post: jaseela123
	Android Interface Definition Language PPT	project girl	1	1,681	19-09-2017, 10:58 AM Last Post: jaseela123

Quick Reply
Message Type your reply to this message here. Disable Smilies	You have selected one or more posts to quote. Quote these posts now or deselect them.