K-means Clustering Algorithm
K-means is one of the best-known clustering algorithms. Its steps are:
Algorithm
Step 1:
Choose the number of clusters wanted in the final result and call it N. Randomly select N patterns from the data set as the initial centroids of the N clusters.
Step 2:
Assign each pattern to the cluster whose centroid is closest. For image data, "closest" usually means most similar in pixel value, but other features can also be taken into account.
Step 3:
Recompute the centroid of each cluster as the mean of the patterns now assigned to it. This again gives N centroids for the N clusters, as after Step 1.
Step 4:
Repeat Steps 2 and 3 until a convergence criterion is met. Typical criteria are that no pattern is reassigned from one cluster to another, or that the squared error decreases by less than a small threshold.
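The four steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the names k_means, squared_distance and mean_point, the max_iter cap and the fixed seed are all introduced here for the example. Squared Euclidean distance is assumed as the closeness measure, and a cluster that becomes empty simply keeps its previous centroid.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Component-wise mean of a non-empty list of feature vectors."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(patterns, n_clusters, max_iter=100, seed=0):
    """Return (centroids, labels) for a list of numeric tuples."""
    rng = random.Random(seed)
    # Step 1: pick N patterns at random (without replacement) as centroids.
    centroids = rng.sample(patterns, n_clusters)
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each pattern to its closest centroid.
        new_labels = [
            min(range(n_clusters), key=lambda c: squared_distance(p, centroids[c]))
            for p in patterns
        ]
        # Step 4: stop when no pattern changed cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: recompute each centroid as the mean of its members.
        for c in range(n_clusters):
            members = [p for p, lab in zip(patterns, labels) if lab == c]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[c] = mean_point(members)
    return centroids, labels


# Two visibly separated groups of 2-D patterns (made-up illustration data).
patterns = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = k_means(patterns, n_clusters=2)
print(centroids, labels)
```

The stopping rule used here is the "no reassignment" criterion; a squared-error threshold could be substituted by comparing the total error between iterations. Which cluster receives label 0 and which receives label 1 depends on the random initial choice.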
Advantages
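- The K-means algorithm is easy to implement.
- Its running time is linear in n, the number of patterns: roughly O(nNt) for N clusters and t iterations, which is O(n) when N and t are treated as constants. This makes it faster than hierarchical clustering.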
Disadvantages
- The result is sensitive to the selection of the initial random centroids.
- It produces a single flat partition, so it cannot show how clusters nest within one another at different levels, as hierarchical clustering does.
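The sensitivity to initial centroids can be seen on a tiny hand-made example. The sketch below is only an illustration: run_k_means and squared_distance are hypothetical helper names, the starting centroids are passed in explicitly instead of being drawn at random, and the four patterns are invented so that two different starts converge to two different partitions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def run_k_means(patterns, initial_centroids, max_iter=100):
    """Iterate assignment and centroid update from the given starting centroids."""
    centroids = list(initial_centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
            for p in patterns
        ]
        if new_labels == labels:  # no reassignment: converged
            break
        labels = new_labels
        for c in range(len(centroids)):
            members = [p for p, lab in zip(patterns, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    # Total squared error of every pattern to its own cluster centroid.
    error = sum(squared_distance(p, centroids[lab]) for p, lab in zip(patterns, labels))
    return labels, error


# Four corners of a wide rectangle: the natural split is left vs. right.
patterns = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

good_labels, good_error = run_k_means(patterns, [(0.0, 0.0), (4.0, 0.0)])
bad_labels, bad_error = run_k_means(patterns, [(0.0, 0.0), (0.0, 1.0)])

print(good_labels, good_error)  # [0, 0, 1, 1] 1.0
print(bad_labels, bad_error)    # [0, 1, 0, 1] 16.0
```

Both runs satisfy the "no reassignment" criterion, yet the second stops at a partition with sixteen times the squared error. A common way to reduce this effect is to run the algorithm from several random starts and keep the result with the lowest squared error.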