There are two main issues in classification:
DATA PREPARATION
The following preprocessing steps may be applied to the data for classification and prediction: data cleaning, relevance analysis (feature selection), and data transformation.
Data cleaning: Preprocesses the data to reduce noise and handle missing values.
Relevance analysis: Removes irrelevant or redundant attributes.
Data transformation: Generalizes the data to higher-level concepts or normalizes it, for example by scaling attribute values to a fixed range.
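The three preparation steps can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not a standard recipe: missing values are marked with None and filled with the column mean, "irrelevant" is taken to mean an attribute that never varies, and normalization is min-max scaling to [0, 1]. The function names and the sample data are hypothetical.

```python
from statistics import mean

def fill_missing(rows):
    """Data cleaning: replace None in each column with that column's mean."""
    cols = list(zip(*rows))
    # Assumes every column has at least one known value.
    means = [mean(v for v in col if v is not None) for col in cols]
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in rows]

def drop_constant_columns(rows):
    """Relevance analysis (very crude): drop attributes that never vary."""
    cols = list(zip(*rows))
    keep = [j for j, col in enumerate(cols) if len(set(col)) > 1]
    return [[row[j] for j in keep] for row in rows]

def min_max_normalize(rows):
    """Data transformation: rescale each column to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(col) for col in cols]
    highs = [max(col) for col in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def prepare(rows):
    """Apply cleaning, relevance analysis, then transformation."""
    return min_max_normalize(drop_constant_columns(fill_missing(rows)))

# Hypothetical data: one missing value and one constant attribute.
data = [
    [1.0, 10.0, 5.0],
    [None, 20.0, 5.0],
    [3.0, 30.0, 5.0],
]
prepared = prepare(data)
print(prepared)  # [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
```

Real relevance analysis usually relies on statistical measures such as correlation or information gain; dropping constant columns is just the simplest case that fits in a short example.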
EVALUATING CLASSIFICATION METHODS
The learned hypothesis (the model) is used to infer the classification of examples in the test set. Accuracy is the percentage of test-set examples that are classified correctly.
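As a small worked example of the accuracy measure, the sketch below compares predicted labels with the true labels of a test set. The function name and the sample labels are made up for illustration.

```python
def accuracy(predicted, actual):
    """Return the percentage of test examples classified correctly."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("the test set must not be empty")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)

# Hypothetical test set of five examples; the third is misclassified.
y_true = ["yes", "no", "yes", "yes", "no"]
y_pred = ["yes", "no", "no", "yes", "no"]

acc = accuracy(y_pred, y_true)
print(acc)  # 80.0
```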
Other criteria used to evaluate classification methods:
Speed: The time needed to construct the model and the time needed to use it.
Robustness: The ability of the classifier to make correct predictions given noisy data or data with missing values.
Scalability: This refers to the ability to construct the classifier efficiently given large amounts of data.
Interpretability: This refers to the level of understanding and insight that is provided by the classifier.
Goodness of rules: For example, the size of a decision tree or the compactness of the classification rules.
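The speed criterion above can be measured directly by timing model construction and model use separately. The sketch below assumes a hypothetical baseline classifier (MajorityClassifier, which always predicts the most common training label) and a hypothetical helper (timed); any real classifier with fit and predict steps could be timed the same way. Repeating the measurement on growing data sizes gives a rough view of scalability.

```python
import time
from collections import Counter

class MajorityClassifier:
    """Hypothetical baseline: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]

def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

# Hypothetical data: 10,000 examples, about two thirds labelled "a".
X = [[i] for i in range(10000)]
y = ["a" if i % 3 else "b" for i in range(10000)]

model, build_seconds = timed(MajorityClassifier().fit, X, y)
predictions, use_seconds = timed(model.predict, X)

print(f"time to construct the model: {build_seconds:.6f} s")
print(f"time to use the model:       {use_seconds:.6f} s")
```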