Clustering:
Clustering is the process of grouping a set of data objects into multiple groups or clusters so that objects within a cluster have high similarity, but are very dissimilar to objects in other clusters.
Dissimilarities and similarities are assessed based on the attribute values describing the objects and often involve distance measures.
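As a small illustration of the distance measures mentioned above, the sketch below computes two common ones, Euclidean and Manhattan distance, between objects described by numeric attributes. The function names and the sample points are assumptions chosen for this example; other measures may suit other kinds of attributes.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two objects with numeric attributes."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute attribute differences (city-block distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Two hypothetical objects, each described by two attribute values.
p = (0.0, 0.0)
q = (3.0, 4.0)

print(euclidean(p, q))  # 5.0
print(manhattan(p, q))  # 7.0
```

A smaller distance means the two objects are more similar under that measure.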
The "quality" of a cluster may be represented by its diameter, the maximum distance between any two objects in the cluster.
Centroid distance is an alternative measure of cluster quality and is defined as the average distance of each cluster object from the cluster centroid.
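The two quality measures defined above can be sketched in a few lines. This is a minimal example that assumes numeric attributes and Euclidean distance; the function names and the sample cluster are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def centroid(cluster):
    """Attribute-wise mean of all objects in the cluster."""
    n = len(cluster)
    return tuple(sum(values) / n for values in zip(*cluster))


def diameter(cluster):
    """Maximum distance between any two objects in the cluster."""
    return max(
        (math.dist(a, b) for a, b in combinations(cluster, 2)),
        default=0.0,  # a cluster with a single object has diameter 0
    )


def centroid_distance(cluster):
    """Average distance of each cluster object from the cluster centroid."""
    c = centroid(cluster)
    return sum(math.dist(p, c) for p in cluster) / len(cluster)


# Hypothetical cluster: the four corners of a 2 x 2 square.
cluster = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]

print(centroid(cluster))           # (1.0, 1.0)
print(diameter(cluster))           # length of the diagonal, sqrt(8)
print(centroid_distance(cluster))  # every corner is sqrt(2) from the centroid
```

For both measures, a smaller value indicates a more compact cluster.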
Cluster analysis, or simply clustering, is the process of partitioning a set of data objects (or observations) into subsets, where each subset is a cluster.
The set of clusters resulting from a cluster analysis can be referred to as a clustering.
Clustering can lead to the discovery of previously unknown groups within the data.
Cluster analysis has been widely used in many applications such as business intelligence, image pattern recognition, Web search, biology, and security.