written 2.8 years ago by | modified 2.8 years ago by |
Perform K-means clustering on the given data where K = 2: {(1,2), (2,3), (3,4), (4,5), (5,6)}
written 2.8 years ago by | • modified 2.8 years ago |
The Given Dataset = {(1,2), (2,3), (3,4), (4,5), (5,6)}
Number of Clusters = K = 2
Step 1 -
Here we choose 2 random initial cluster centers: C1 = (1, 2) and C2 = (4, 5)
Step 2 -
Here, we calculate the distance between two points a = (x1, y1) and b = (x2, y2) using the Manhattan (city-block) distance function:
$$Ρ(a, b) = |x_2 - x_1| + |y_2 - y_1|$$
Now, calculate the distance of each point from each of the centers of the 2 clusters. The distance is calculated by using the above-given distance function formula.
The following shows the distance calculation between the first data point of the dataset, (1, 2), and each of the two cluster centers:
1] Calculating Distance Between a = (1, 2) and C1 = (1, 2)
Ρ(a, C1) = |x2 – x1| + |y2 – y1| = |1 – 1| + |2 – 2| = 0
2] Calculating Distance Between a = (1, 2) and C2 = (4, 5)
Ρ(a, C2) = |x2 – x1| + |y2 – y1| = |4 – 1| + |5 – 2| = 3 + 3 = 6
Similarly, calculate the distance of every other data point from both cluster centers.
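The distance function from Step 2 can be written as a small helper. This is a minimal sketch; the name `manhattan` is an illustrative choice and not part of the original answer.

```python
def manhattan(a, b):
    """Return |x2 - x1| + |y2 - y1| for two 2-D points a = (x1, y1), b = (x2, y2)."""
    return abs(b[0] - a[0]) + abs(b[1] - a[1])


# Initial centers chosen in Step 1.
c1, c2 = (1, 2), (4, 5)

print(manhattan((1, 2), c1))  # 0
print(manhattan((1, 2), c2))  # 6
```

The two printed values match the hand calculations for the first data point.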
Step 3 -
Given Points | Distance from the center (1, 2) of Cluster - 1 | Distance from the center (4, 5) of Cluster - 2 | Point belongs to Cluster |
---|---|---|---|
(1,2) | = \|1 – 1\| + \|2 – 2\| = 0 + 0 = 0 | = \|4 – 1\| + \|5 – 2\| = 3 + 3 = 6 | C1 |
(2,3) | = \|2 – 1\| + \|3 – 2\| = 1 + 1 = 2 | = \|4 – 2\| + \|5 – 3\| = 2 + 2 = 4 | C1 |
(3,4) | = \|3 – 1\| + \|4 – 2\| = 2 + 2 = 4 | = \|4 – 3\| + \|5 – 4\| = 1 + 1 = 2 | C2 |
(4,5) | = \|4 – 1\| + \|5 – 2\| = 3 + 3 = 6 | = \|4 – 4\| + \|5 – 5\| = 0 + 0 = 0 | C2 |
(5,6) | = \|5 – 1\| + \|6 – 2\| = 4 + 4 = 8 | = \|5 – 4\| + \|6 – 5\| = 1 + 1 = 2 | C2 |
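The assignment column of the table can be reproduced with a short sketch. The helper names `manhattan` and `assign` are hypothetical, and breaking ties in favour of the lower-numbered cluster is an assumption of this sketch (no ties occur in this dataset).

```python
def manhattan(a, b):
    """Manhattan distance between two 2-D points."""
    return abs(b[0] - a[0]) + abs(b[1] - a[1])


def assign(points, centers):
    """Return, for each point, the index of its nearest center.

    min() keeps the first minimum it sees, so ties go to the lower index
    (an assumption of this sketch).
    """
    return [
        min(range(len(centers)), key=lambda i: manhattan(p, centers[i]))
        for p in points
    ]


points = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
centers = [(1, 2), (4, 5)]

labels = assign(points, centers)
print(labels)  # [0, 0, 1, 1, 1]  (0 = C1, 1 = C2)
```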
Step 4 -
From the above table, we can form the 2 clusters as follows:
Cluster - 1:
The First cluster contains the following 2 data points - {(1, 2), (2, 3)}
Cluster - 2:
The Second cluster contains the following 3 data points - {(3,4), (4,5), (5,6)}
Step 5 -
Now, recompute the center of each cluster as the mean of the points assigned to it.
For Center of Cluster - 1
X = (1 + 2) / 2 = 3 / 2 = 1.5
Y = (2 + 3) / 2 = 5 / 2 = 2.5
Therefore, C1 = (1.5, 2.5)
For Center of Cluster - 2
X = (3 + 4 + 5) / 3 = 12 / 3 = 4
Y = (4 + 5 + 6) / 3 = 15 / 3 = 5
Therefore, C2 = (4, 5)
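The center update in Step 5 is a coordinate-wise mean, which can be sketched as follows. The function name `centroid` is illustrative only, and the sketch assumes every cluster is non-empty.

```python
def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of 2-D points."""
    n = len(cluster)
    mean_x = sum(x for x, _ in cluster) / n
    mean_y = sum(y for _, y in cluster) / n
    return (mean_x, mean_y)


print(centroid([(1, 2), (2, 3)]))          # (1.5, 2.5)
print(centroid([(3, 4), (4, 5), (5, 6)]))  # (4.0, 5.0)
```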
This is the completion of Iteration 1.
Now repeat steps 2 to 5 with the updated centers, in the same way as in Iteration 1.
Given Points | Distance from the center (1.5, 2.5) of Cluster - 1 | Distance from the center (4, 5) of Cluster - 2 | Point belongs to Cluster |
---|---|---|---|
(1,2) | = \|1 – 1.5\| + \|2 – 2.5\| = 0.5 + 0.5 = 1 | = \|4 – 1\| + \|5 – 2\| = 3 + 3 = 6 | C1 |
(2,3) | = \|2 – 1.5\| + \|3 – 2.5\| = 0.5 + 0.5 = 1 | = \|4 – 2\| + \|5 – 3\| = 2 + 2 = 4 | C1 |
(3,4) | = \|3 – 1.5\| + \|4 – 2.5\| = 1.5 + 1.5 = 3 | = \|4 – 3\| + \|5 – 4\| = 1 + 1 = 2 | C2 |
(4,5) | = \|4 – 1.5\| + \|5 – 2.5\| = 2.5 + 2.5 = 5 | = \|4 – 4\| + \|5 – 5\| = 0 + 0 = 0 | C2 |
(5,6) | = \|5 – 1.5\| + \|6 – 2.5\| = 3.5 + 3.5 = 7 | = \|5 – 4\| + \|6 – 5\| = 1 + 1 = 2 | C2 |
From the above table, we again get the same 2 clusters as follows:
Cluster - 1:
The First cluster contains the following 2 data points - {(1, 2), (2, 3)}
Cluster - 2:
The Second cluster contains the following 3 data points - {(3,4), (4,5), (5,6)}
Now again, recompute the centers:
For Center of Cluster - 1
X = (1 + 2) / 2 = 3 / 2 = 1.5
Y = (2 + 3) / 2 = 5 / 2 = 2.5
Therefore, C1 = (1.5, 2.5)
For Center of Cluster - 2
X = (3 + 4 + 5) / 3 = 12 / 3 = 4
Y = (4 + 5 + 6) / 3 = 15 / 3 = 5
Therefore, C2 = (4, 5)
This is the completion of Iteration 2.
Iteration stops when either of the following conditions is fulfilled: the centers of the newly formed clusters do not change, or the data points remain in the same clusters.
Here we stop after 2 iterations because both conditions hold: the centers are unchanged and every data point stays in the same cluster.
After 2 iterations, we get the following 2 clusters and their center points:
k1 = {(1,2), (2,3)} and C1 = (1.5, 2.5)
k2 = {(3,4), (4,5), (5,6)} and C2 = (4, 5)
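The whole procedure can be put together in a short sketch that repeats the assignment and update steps until the centers stop changing. The function names and the `max_iter` safeguard are assumptions for illustration. Like the worked example, the sketch uses Manhattan distance for assignment and the mean for the update.

```python
def manhattan(a, b):
    """Manhattan distance between two 2-D points."""
    return abs(b[0] - a[0]) + abs(b[1] - a[1])


def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of 2-D points."""
    n = len(cluster)
    return (sum(x for x, _ in cluster) / n, sum(y for _, y in cluster) / n)


def k_means(points, centers, max_iter=100):
    """Return (clusters, centers, iterations used); max_iter must be at least 1."""
    centers = [tuple(map(float, c)) for c in centers]
    for iteration in range(1, max_iter + 1):
        # Assignment step: put each point in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: manhattan(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: an empty cluster keeps its old center (assumption of this sketch).
        new_centers = [centroid(c) if c else old for c, old in zip(clusters, centers)]
        # Stop once the centers no longer change.
        if new_centers == centers:
            return clusters, centers, iteration
        centers = new_centers
    return clusters, centers, max_iter


points = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
clusters, centers, iterations = k_means(points, [(1, 2), (4, 5)])

print(clusters)    # [[(1, 2), (2, 3)], [(3, 4), (4, 5), (5, 6)]]
print(centers)     # [(1.5, 2.5), (4.0, 5.0)]
print(iterations)  # 2
```

With the initial centers from Step 1, the sketch stops after 2 iterations and reproduces the clusters k1 and k2 found by hand.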