written 6.9 years ago by | modified 2.6 years ago by |
Subject: Data Mining And Business Intelligence
Topic: Clustering
Difficulty: High
written 2.6 years ago by | • modified 2.6 years ago |
The Given Dataset = {(20, 9), (21, 9), (15, 7), (22, 17), (20, 8), (25, 12), (26, 14), (18, 9)}
Number of Clusters = K = 3
Step 1 - Take the initial cluster centers:
$$C1 = (15, 7), C2 = (22, 17), C3 = (25, 12)$$
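Step 2 - Measure the distance between two points a = (x1, y1) and b = (x2, y2) with the Manhattan distance:
$$P(a,b)=|x_2-x_1|+|y_2-y_1|$$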
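As a minimal sketch, the distance formula above could be written in Python as follows. The function name `manhattan` and the tuple representation of points are assumptions made for illustration, not part of the original answer.

```python
def manhattan(a, b):
    """Manhattan (city-block) distance between two 2-D points given as (x, y) tuples."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

# Distance from the point (20, 9) to the initial center C1 = (15, 7)
d = manhattan((20, 9), (15, 7))
print(d)  # 7
```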
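Step 3 - Compute the distance from every point to each center and assign the point to the cluster with the nearest center.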
Given Points | Distance from center (15, 7) of Cluster - 1 | Distance from center (22, 17) of Cluster - 2 | Distance from center (25, 12) of Cluster - 3 | Point belongs to Cluster |
---|---|---|---|---|
(20, 9) | |20 - 15| + |9 - 7| = 7 | |22 - 20| + |17 - 9| = 10 | |25 - 20| + |12 - 9| = 8 | C1 |
(21, 9) | |21 - 15| + |9 - 7| = 8 | |22 - 21| + |17 - 9| = 9 | |25 - 21| + |12 - 9| = 7 | C3 |
(15, 7) | |15 - 15| + |7 - 7| = 0 | |22 - 15| + |17 - 7| = 17 | |25 - 15| + |12 - 7| = 15 | C1 |
(22, 17) | |22 - 15| + |17 - 7| = 17 | |22 - 22| + |17 - 17| = 0 | |25 - 22| + |17 - 12| = 8 | C2 |
(20, 8) | |20 - 15| + |8 - 7| = 6 | |22 - 20| + |17 - 8| = 11 | |25 - 20| + |12 - 8| = 9 | C1 |
(25, 12) | |25 - 15| + |12 - 7| = 15 | |25 - 22| + |17 - 12| = 8 | |25 - 25| + |12 - 12| = 0 | C3 |
(26, 14) | |26 - 15| + |14 - 7| = 18 | |26 - 22| + |17 - 14| = 7 | |26 - 25| + |14 - 12| = 3 | C3 |
(18, 9) | |18 - 15| + |9 - 7| = 5 | |22 - 18| + |17 - 9| = 12 | |25 - 18| + |12 - 9| = 10 | C1 |
$$ K1 = \{(20, 9), (15, 7), (20, 8), (18, 9)\} $$ $$ K2 = \{(22, 17)\} $$ $$ K3 = \{(21,9), (25, 12), (26, 14)\} $$
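The assignment in the table above can be reproduced with a short Python sketch. The helper names `manhattan` and `assign` are hypothetical, and sending ties to the lower-numbered cluster is an assumption (no tie occurs in this dataset).

```python
points = [(20, 9), (21, 9), (15, 7), (22, 17), (20, 8), (25, 12), (26, 14), (18, 9)]
centers = [(15, 7), (22, 17), (25, 12)]  # initial C1, C2, C3

def manhattan(a, b):
    """Manhattan distance between two 2-D points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def assign(points, centers):
    """Group points by their nearest center (assumption: ties go to the lower index)."""
    clusters = [[] for _ in centers]
    for p in points:
        distances = [manhattan(p, c) for c in centers]
        clusters[distances.index(min(distances))].append(p)
    return clusters

k1, k2, k3 = assign(points, centers)
print(k1)  # points assigned to C1
print(k2)  # points assigned to C2
print(k3)  # points assigned to C3
```

The three lists should match K1, K2 and K3 above, with the points listed in dataset order.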
Step 4 - Recompute each cluster center as the mean of the points assigned to it.
1] Center for Cluster-1 (K1) -
$$ X = \frac {(20 + 15 + 20 + 18)}{4} = 18.25 $$
$$ Y = \frac {(9 + 7 + 8 + 9)}{4} = 8.25 $$
Therefore, new Cluster Center C1 = (18.25, 8.25)
2] Center for Cluster-2 (K2) -
Cluster-2 (K2) has only 1 data point.
Therefore, new Cluster Center C2 = (22, 17)
3] Center for Cluster-3 (K3) -
$$ X = \frac {(21 + 25 + 26)}{3} = 24 $$
$$ Y = \frac {(9 + 12 + 14)}{3} = 11.67 $$
Therefore, new Cluster Center C3 = (24, 11.67)
This is the completion of Iteration 1.
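The center update of Step 4 is a coordinate-wise mean, which might be sketched in Python as below. The function name `centroid` is an assumption for illustration. The code keeps full floating-point precision, whereas the worked example rounds 11.666... to 11.67.

```python
def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of (x, y) points."""
    xs = [x for x, _ in cluster]
    ys = [y for _, y in cluster]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

# Clusters obtained at the end of Step 3 in Iteration 1
k1 = [(20, 9), (15, 7), (20, 8), (18, 9)]
k2 = [(22, 17)]
k3 = [(21, 9), (25, 12), (26, 14)]

new_centers = [centroid(k) for k in (k1, k2, k3)]
print(new_centers[0])  # (18.25, 8.25)
```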
Repeat Steps 3 and 4 with the new centers, as in Iteration 1:
Given Points | Distance from center (18.25, 8.25) of Cluster - 1 | Distance from center (22, 17) of Cluster - 2 | Distance from center (24, 11.67) of Cluster - 3 | Point belongs to Cluster |
---|---|---|---|---|
(20, 9) | |20 - 18.25| + |9 - 8.25| = 2.5 | |22 - 20| + |17 - 9| = 10 | |24 - 20| + |11.67 - 9| = 6.67 | C1 |
(21, 9) | |21 - 18.25| + |9 - 8.25| = 3.5 | |22 - 21| + |17 - 9| = 9 | |24 - 21| + |11.67 - 9| = 5.67 | C1 |
(15, 7) | |18.25 - 15| + |8.25 - 7| = 4.5 | |22 - 15| + |17 - 7| = 17 | |24 - 15| + |11.67 - 7| = 13.67 | C1 |
(22, 17) | |22 - 18.25| + |17 - 8.25| = 12.5 | |22 - 22| + |17 - 17| = 0 | |24 - 22| + |17 - 11.67| = 7.33 | C2 |
(20, 8) | |20 - 18.25| + |8.25 - 8| = 2 | |22 - 20| + |17 - 8| = 11 | |24 - 20| + |11.67 - 8| = 7.67 | C1 |
(25, 12) | |25 - 18.25| + |12 - 8.25| = 10.5 | |25 - 22| + |17 - 12| = 8 | |25 - 24| + |12 - 11.67| = 1.33 | C3 |
(26, 14) | |26 - 18.25| + |14 - 8.25| = 13.5 | |26 - 22| + |17 - 14| = 7 | |26 - 24| + |14 - 11.67| = 4.33 | C3 |
(18, 9) | |18.25 - 18| + |9 - 8.25| = 1 | |22 - 18| + |17 - 9| = 12 | |24 - 18| + |11.67 - 9| = 8.67 | C1 |
$$ K1 = \{(20, 9), (15, 7), (20, 8), (18, 9), (21,9)\} $$ $$ K2 = \{(22, 17)\} $$ $$ K3 = \{(25, 12), (26, 14)\} $$
1] Center for Cluster-1 (K1) -
$$ X = \frac {(20 + 15 + 20 + 18 + 21)}{5} = 18.8 $$
$$ Y = \frac {(9 + 7 + 8 + 9 + 9)}{5} = 8.4 $$
Therefore, new Cluster Center C1 = (18.8, 8.4)
2] Center for Cluster-2 (K2) -
Cluster-2 (K2) has only 1 data point.
Therefore, new Cluster Center C2 = (22, 17)
3] Center for Cluster-3 (K3) -
$$ X = \frac {(25 + 26)}{2} = 25.5 $$
$$ Y = \frac {(12 + 14)}{2} = 13 $$
Therefore, new Cluster Center C3 = (25.5, 13)
This is the completion of Iteration 2.
Repeat Steps 3 and 4 again with the updated centers, as in Iteration 1:
Given Points | Distance from center (18.8, 8.4) of Cluster - 1 | Distance from center (22, 17) of Cluster - 2 | Distance from center (25.5, 13) of Cluster - 3 | Point belongs to Cluster |
---|---|---|---|---|
(20, 9) | |20 - 18.8| + |9 - 8.4| = 1.8 | |22 - 20| + |17 - 9| = 10 | |25.5 - 20| + |13 - 9| = 9.5 | C1 |
(21, 9) | |21 - 18.8| + |9 - 8.4| = 2.8 | |22 - 21| + |17 - 9| = 9 | |25.5 - 21| + |13 - 9| = 8.5 | C1 |
(15, 7) | |18.8 - 15| + |8.4 - 7| = 5.2 | |22 - 15| + |17 - 7| = 17 | |25.5 - 15| + |13 - 7| = 16.5 | C1 |
(22, 17) | |22 - 18.8| + |17 - 8.4| = 11.8 | |22 - 22| + |17 - 17| = 0 | |25.5 - 22| + |17 - 13| = 7.5 | C2 |
(20, 8) | |20 - 18.8| + |8.4 - 8| = 1.6 | |22 - 20| + |17 - 8| = 11 | |25.5 - 20| + |13 - 8| = 10.5 | C1 |
(25, 12) | |25 - 18.8| + |12 - 8.4| = 9.8 | |25 - 22| + |17 - 12| = 8 | |25.5 - 25| + |13 - 12| = 1.5 | C3 |
(26, 14) | |26 - 18.8| + |14 - 8.4| = 12.8 | |26 - 22| + |17 - 14| = 7 | |26 - 25.5| + |14 - 13| = 1.5 | C3 |
(18, 9) | |18.8 - 18| + |9 - 8.4| = 1.4 | |22 - 18| + |17 - 9| = 12 | |25.5 - 18| + |13 - 9| = 11.5 | C1 |
$$ K1 = \{(20, 9), (15, 7), (20, 8), (18, 9), (21,9)\} $$ $$ K2 = \{(22, 17)\} $$ $$ K3 = \{(25, 12), (26, 14)\} $$
1] Center for Cluster-1 (K1) -
$$ X = \frac {(20 + 15 + 20 + 18 + 21)}{5} = 18.8 $$
$$ Y = \frac {(9 + 7 + 8 + 9 + 9)}{5} = 8.4 $$
Therefore, new Cluster Center C1 = (18.8, 8.4)
2] Center for Cluster-2 (K2) -
Cluster-2 (K2) has only 1 data point.
Therefore, new Cluster Center C2 = (22, 17)
3] Center for Cluster-3 (K3) -
$$ X = \frac {(25 + 26)}{2} = 25.5 $$
$$ Y = \frac {(12 + 14)}{2} = 13 $$
Therefore, new Cluster Center C3 = (25.5, 13)
This is the completion of Iteration 3.
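We stop after 3 iterations because the cluster assignments and centers in Iteration 3 are identical to those in Iteration 2, so further iterations would change nothing.
After 3 iterations, the 3 clusters and their center points are as follows: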
$$ K1 = \{(20, 9), (15, 7), (20, 8), (18, 9), (21, 9)\} \text{ with center } C1 = (18.8, 8.4) $$ $$ K2 = \{(22, 17)\} \text{ with center } C2 = (22, 17) $$ $$ K3 = \{(25, 12), (26, 14)\} \text{ with center } C3 = (25.5, 13) $$
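The whole procedure (assign, update, repeat until the centers stop changing) can be sketched end to end in Python. This is an illustrative sketch under stated assumptions, not a reference implementation: the names `manhattan`, `assign`, `centroid` and `kmeans` are hypothetical, ties go to the lower-numbered cluster, an empty cluster keeps its previous center, and `max_iter` is an arbitrary safety limit. Centers are kept at full precision rather than rounded to two decimals.

```python
def manhattan(a, b):
    """Manhattan distance between two 2-D points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def assign(points, centers):
    """Group points by nearest center (assumption: ties go to the lower index)."""
    clusters = [[] for _ in centers]
    for p in points:
        distances = [manhattan(p, c) for c in centers]
        clusters[distances.index(min(distances))].append(p)
    return clusters

def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return (sum(x for x, _ in cluster) / len(cluster),
            sum(y for _, y in cluster) / len(cluster))

def kmeans(points, centers, max_iter=100):
    """Repeat Steps 3 and 4 until the centers no longer change."""
    for iteration in range(1, max_iter + 1):
        clusters = assign(points, centers)
        # Assumption: an empty cluster keeps its old center.
        new_centers = [centroid(k) if k else c for k, c in zip(clusters, centers)]
        if new_centers == centers:
            return clusters, centers, iteration
        centers = new_centers
    return clusters, centers, max_iter

points = [(20, 9), (21, 9), (15, 7), (22, 17), (20, 8), (25, 12), (26, 14), (18, 9)]
clusters, centers, iterations = kmeans(points, [(15, 7), (22, 17), (25, 12)])
print(iterations)  # 3
print(centers)
print(clusters)
```

The stopping test compares the recomputed centers with the previous ones, which corresponds to the observation above that Iteration 3 reproduces the result of Iteration 2.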
Important Point -