i) Architecture of a typical data mining system
1. Database, data warehouse, or other information repository
These are information repositories. Data cleaning and data integration techniques may be performed on the data.
2. Database or data warehouse server
It fetches the data relevant to the user's request, which is needed for the data mining task.
3. Knowledge base
This is the domain knowledge used to guide the search and to judge how interesting the resulting patterns are.
4. Data mining engine
It performs data mining tasks such as characterization, association, classification, and cluster analysis.
5. Pattern evaluation module
It is integrated with the mining module and helps focus the search on interesting patterns only.
6. Graphical user interface
This module handles communication between the user and the data mining system and allows users to browse database or data warehouse schemas.
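The way these six components fit together can be sketched in a few lines of Python. This is only a toy illustration, not a real system: the data, the names REPOSITORY, KNOWLEDGE_BASE, fetch, mine_pairs, evaluate and run_query, and the choice of item-pair support as the mining task are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# 1. Information repository: a toy in-memory "database" (hypothetical data).
REPOSITORY = [
    {"id": 1, "region": "north", "items": ["bread", "milk"]},
    {"id": 2, "region": "north", "items": ["bread", "butter", "milk"]},
    {"id": 3, "region": "south", "items": ["bread", "butter"]},
    {"id": 4, "region": "north", "items": ["milk", "butter"]},
    {"id": 5, "region": "north", "items": ["bread", "milk"]},
]

# 3. Knowledge base: domain knowledge, here just an interestingness threshold.
KNOWLEDGE_BASE = {"min_support": 0.5}


def fetch(region):
    """2. Database server: fetch only the data relevant to the request."""
    return [row["items"] for row in REPOSITORY if row["region"] == region]


def mine_pairs(transactions):
    """4. Mining engine: compute the support of every co-occurring item pair."""
    counts = Counter()
    for items in transactions:
        counts.update(combinations(sorted(set(items)), 2))
    return {pair: n / len(transactions) for pair, n in counts.items()}


def evaluate(patterns, knowledge_base):
    """5. Pattern evaluation: keep only patterns that meet the threshold."""
    threshold = knowledge_base["min_support"]
    return {pair: s for pair, s in patterns.items() if s >= threshold}


def run_query(region):
    """6. Stand-in for the user interface: one request, end to end."""
    return evaluate(mine_pairs(fetch(region)), KNOWLEDGE_BASE)


if __name__ == "__main__":
    for pair, support in run_query("north").items():
        print(pair, support)
```

In a real system each function would be a separate module, and the evaluation step is often pushed into the mining engine so that uninteresting patterns are never generated at all.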
ii) Applications and major issues in Data Mining
- Data mining has been used in numerous areas, in both the private and public sectors.
- The use of data mining in major industry areas like banking, retail, medicine and insurance can help reduce costs, increase sales and enhance research and development.
- For example, in the banking sector data mining can be used for customer retention, fraud prevention during credit card approval, and fraud detection.
- Prediction models can be developed to help analyse data collected over years. For e.g. customer data can be used to find out whether a customer is eligible for a loan from the bank, or whether an accident claim is fraudulent and needs further investigation.
- In the medical domain, data mining may be used to predict the effectiveness of a medicine or a certain procedure.
- Pharmaceutical firms can use data mining as a guide to research on new treatments for diseases, by analysing chemical compounds and genetic material.
- In the retail industry, a large amount of data, such as purchasing history and transportation services, may be collected for analysis. This data supports multidimensional analysis, measuring sales campaign effectiveness, customer retention, product recommendation and much more.
- The telecommunication industry also uses data mining, for e.g. to analyse customer data and find out which customers are likely to remain subscribers and which are likely to shift to competitors.
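As a rough illustration of the prediction models mentioned above, the sketch below learns a majority-vote rule from past customer records and uses it to predict whether a subscriber is likely to leave. The records, the single "contract type" attribute and the function names train and predict are made up for the example; a real model would use many attributes and a proper classification algorithm.

```python
from collections import Counter, defaultdict

# Hypothetical historical records: (contract type, did the customer leave?)
HISTORY = [
    ("monthly", True), ("monthly", True), ("monthly", False),
    ("yearly", False), ("yearly", False), ("yearly", True),
    ("yearly", False),
]


def train(records):
    """Learn, for each contract type, the most common outcome seen so far."""
    outcomes = defaultdict(Counter)
    for contract, left in records:
        outcomes[contract][left] += 1
    return {c: counts.most_common(1)[0][0] for c, counts in outcomes.items()}


def predict(model, contract, default=False):
    """Predict whether a customer leaves; use `default` for unseen types."""
    return model.get(contract, default)


model = train(HISTORY)

if __name__ == "__main__":
    for contract in ("monthly", "yearly", "prepaid"):
        print(contract, "-> likely to leave:", predict(model, contract))
```

The same pattern (learn from labelled history, then score new cases) applies to the loan and fraudulent claim examples, with different attributes and labels.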
Major issues in Data Mining:
(1) Mining methodology and user interaction issues:
- Mining different kinds of knowledge in databases.
- Interactive mining of knowledge at multiple levels of abstraction.
- Incorporation of background knowledge.
- Data mining query language and ad hoc data mining.
- Presentation and visualization of data mining results.
- Handling noisy or incomplete data.
- Pattern Evaluation.
(2) Performance issues:
- Efficiency and scalability of data mining algorithms.
- Parallel, distributed and incremental mining algorithms.
(3) Issues relating to the diversity of database types:
- Handling of relational and complex types of data.
- Mining information from heterogeneous databases and global information