Data Quality:
Data quality is an increasingly serious issue for organizations large and small. It is central to all data integration initiatives. Before data can be used effectively in a data warehouse, or in customer relationship management, enterprise resource planning or business analytics applications, it needs to be analyzed and cleansed.
Understanding the key data quality dimensions is the first step to data quality improvement. To be processable and interpretable in an effective and efficient manner, data has to satisfy a set of quality criteria. Data satisfying those quality criteria is said to be of high quality.
Many attempts have been made to define data quality and to identify its dimensions. Dimensions of data quality typically include accuracy, reliability, importance, consistency, precision, timeliness, fineness, understandability, conciseness and usefulness.
The quality criteria can be grouped into 6 key dimensions:
- Completeness
- Consistency
- Validity
- Conformity
- Accuracy
- Integrity
1: Completeness:
Is all the requisite information available? Are some data values missing, or in an unusable state?
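
A completeness check can be sketched as the share of records that carry a usable value for each required field. This is only an illustration: the function name `completeness_report`, the sample `customers` records and the rule that `None` or a blank string counts as missing are all assumptions, not part of the original notes.

```python
def completeness_report(records, required_fields):
    """Return, for each required field, the fraction of records with a usable value."""
    report = {}
    for field in required_fields:
        # Assumption: a value is unusable if it is absent, None, or blank.
        filled = sum(
            1
            for rec in records
            if rec.get(field) is not None and str(rec.get(field)).strip() != ""
        )
        report[field] = filled / len(records) if records else 0.0
    return report


# Hypothetical sample data
customers = [
    {"id": 1, "name": "Asha", "email": "asha@example.com"},
    {"id": 2, "name": "", "email": None},
    {"id": 3, "name": "Ravi", "email": "ravi@example.com"},
    {"id": 4, "name": "Meena"},
]

report = completeness_report(customers, ["id", "name", "email"])
print(report)
```

A field whose ratio falls below an agreed threshold would then be flagged for cleansing.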
2: Consistency:
Do distinct occurrences of the same data instances agree with each other, or do they provide conflicting information? Are values consistent across data sets?
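
One way to look for conflicting occurrences is to compare records that share a key in two data sets. The sketch below assumes two hypothetical sources, `crm` and `billing`, joined on an `id` field; the helper `find_conflicts` is illustrative only.

```python
def find_conflicts(dataset_a, dataset_b, key, fields):
    """List (key, field, value_in_a, value_in_b) where the two data sets disagree."""
    b_by_key = {rec[key]: rec for rec in dataset_b}
    conflicts = []
    for rec in dataset_a:
        other = b_by_key.get(rec[key])
        if other is None:
            continue  # record exists only in dataset_a; nothing to compare
        for field in fields:
            if rec.get(field) != other.get(field):
                conflicts.append((rec[key], field, rec.get(field), other.get(field)))
    return conflicts


# Hypothetical sample data
crm = [{"id": 1, "city": "Mumbai"}, {"id": 2, "city": "Pune"}]
billing = [
    {"id": 1, "city": "Mumbai"},
    {"id": 2, "city": "Delhi"},
    {"id": 3, "city": "Goa"},
]

conflicts = find_conflicts(crm, billing, "id", ["city"])
print(conflicts)
```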
3: Validity:
Refers to the correctness and reasonableness of data. Are the values within the ranges that make sense for the domain?
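
Reasonableness is often expressed as simple domain rules. The following is a minimal sketch under assumed rules (positive quantity, discount between 0 and 100, no future order dates); the field names and the `validate_order` helper are hypothetical.

```python
from datetime import date


def validate_order(order, today):
    """Return a list of rule violations for one order record (empty if valid)."""
    errors = []
    if order["quantity"] <= 0:
        errors.append("quantity must be positive")
    if not (0 <= order["discount_pct"] <= 100):
        errors.append("discount_pct must be between 0 and 100")
    if order["order_date"] > today:
        errors.append("order_date cannot be in the future")
    return errors


# Hypothetical sample data; `today` is passed in so the check is repeatable
today = date(2024, 1, 1)
good_order = {"quantity": 3, "discount_pct": 10, "order_date": date(2023, 12, 31)}
bad_order = {"quantity": 0, "discount_pct": 120, "order_date": date(2030, 1, 1)}

good_errors = validate_order(good_order, today)
bad_errors = validate_order(bad_order, today)
print(good_errors, bad_errors)
```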
4: Conformity:
Are there expectations that data values conform to specified formats? If so, do all the values conform to those formats?
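
Format conformity can be checked with regular expressions. The formats below (an ISO-style `YYYY-MM-DD` date and a six-digit PIN code) and the field names are assumptions chosen for illustration. Note that this checks only the shape of a value, not whether the date actually exists.

```python
import re

# Assumed formats for two hypothetical fields
FORMATS = {
    "order_date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "pin_code": re.compile(r"\d{6}"),
}


def nonconforming_fields(record, formats=FORMATS):
    """Return the names of fields whose values do not fully match the expected format."""
    return [
        field
        for field, pattern in formats.items()
        if not pattern.fullmatch(str(record.get(field, "")))
    ]


ok_record = {"order_date": "2024-01-15", "pin_code": "400001"}
bad_record = {"order_date": "15/01/2024", "pin_code": "4000"}

ok_result = nonconforming_fields(ok_record)
bad_result = nonconforming_fields(bad_record)
print(ok_result, bad_result)
```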
5: Accuracy:
Do data objects accurately represent the “real world” values they are expected to model? Incorrect spellings of product or person names, wrong addresses, and even untimely or out-of-date data can impact operational and analytical applications.
6: Integrity:
Is any data missing important relationship linkages? The inability to link related records together may actually introduce duplication.
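
Missing relationship linkages can be detected by looking for child records whose reference does not resolve to any parent. This is a sketch with hypothetical `orders` and `customers` data and an illustrative `find_orphans` helper; in a relational database the same idea is usually enforced with foreign key constraints.

```python
def find_orphans(child_records, parent_records, foreign_key, parent_key):
    """Return child records whose foreign key matches no parent record."""
    parent_ids = {parent[parent_key] for parent in parent_records}
    return [child for child in child_records if child.get(foreign_key) not in parent_ids]


# Hypothetical sample data
customers = [{"id": 1}, {"id": 2}]
orders = [
    {"order_id": 100, "customer_id": 1},
    {"order_id": 101, "customer_id": 5},  # no customer with id 5
    {"order_id": 102},                    # link missing altogether
]

orphans = find_orphans(orders, customers, "customer_id", "id")
print([order["order_id"] for order in orphans])
```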