Data Warehousing, Mining and Business Intelligence - Dec 2014
Information Technology (Semester 7)
TOTAL MARKS: 100
TOTAL TIME: 3 HOURS
(1) Question 1 is compulsory.
(2) Attempt any four from the remaining questions.
(3) Assume data wherever required.
(4) Figures to the right indicate full marks.
1 (a) What is Web Structure Mining? What are the techniques used for it?(5 marks)
1 (b) Compare OLTP & OLAP.(5 marks)
1 (c) Consider the transaction database in Appendix B. Using the Apriori algorithm with a minimum support of 30%, find all frequent item-sets.(5 marks)
1 (d) What is a multidimensional association rule?(5 marks)
2 A manufacturing company has a huge sales network. To control sales, the network is divided into regions. Each region has multiple zones, and each zone has different cities. Each salesperson is allocated different cities. The objective is to track sales figures at different granularity levels of region, and also to count the number of products sold. Products are categorized as high-end and low-end products. Develop a BI application, taking into consideration the above granularity levels for region, salesperson and product, and the quarterly, yearly and monthly sales. It should also predict zone-wise, product-wise sales for subsequent quarters.
i) Identify facts and dimensions and hence draw an information package diagram.
ii) Design a suitable DWH schema.
iii) Identify a suitable DM algorithm for predicting the sales.
iv) Give justification for all the decisions you have taken for the design.(20 marks)
3 (a) Use the data set in Appendix A to create an adjacency matrix. Use the Single Link OR Complete Link algorithm to cluster the given data set, drawing a dendrogram.(10 marks)
3 (b) With a neat diagram, explain the process of KDD, giving emphasis to the selection and pre-processing phases.(10 marks)
4 (a) Use the data set in Appendix A and the K-means algorithm to create two clusters.(10 marks)
4 (b) What features are required in a mining algorithm to cluster stream data? Explain any clustering algorithm used for stream data.(10 marks)
5 (a) Explain Data Integration and Transformation w.r.t Data Warehouse.(10 marks)
5 (b) Explain BIRCH algorithm with example.(10 marks)
6 (a) What is a Concept Hierarchy? How is a Concept Hierarchy generated for numerical and categorical data?(10 marks)
6 (b) Using the table in Appendix C, create a classification model using a decision tree and hence classify the following tuple: Very High, Old?(10 marks)
Answer any two questions:
7 (a) What is text mining? Explain different approaches of text mining.(10 marks)
7 (b) What is CLICK-STREAM Mining?(10 marks)
7 (c) What are the applications of Web usage mining? What is a Web Log? Give the typical structure of a web log.
Appendix A.
Appendix B.
TID | Items |
01 | A, B, C |
02 | B, C, D |
03 | A, B |
04 | B, D |
05 | A, B |
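For readers checking a hand-worked answer to Question 1 (c), the following is a minimal Python sketch of level-wise Apriori run on the Appendix B transactions with a 30% minimum support. The function name `apriori` and the variable names are illustrative assumptions, not part of the question paper, and the candidate generation is a simple union-and-prune variant rather than an optimized implementation.

```python
from itertools import combinations

# Transactions copied from Appendix B (TID 01 to 05).
transactions = [
    {"A", "B", "C"},
    {"B", "C", "D"},
    {"A", "B"},
    {"B", "D"},
    {"A", "B"},
]


def apriori(transactions, min_support):
    """Return {frozenset: support_count} for every frequent item-set."""
    n = len(transactions)

    def count(itemset):
        # Number of transactions that contain every item of the item-set.
        return sum(1 for t in transactions if itemset <= t)

    # Level 1: frequent single items.
    current = {}
    for item in sorted(set().union(*transactions)):
        single = frozenset([item])
        c = count(single)
        if c / n >= min_support:
            current[single] = c

    frequent = dict(current)
    k = 2
    while current:
        # Join step: unions of two frequent (k-1)-item-sets that have size k.
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Count step: keep candidates that meet the minimum support.
        current = {}
        for cand in candidates:
            c = count(cand)
            if c / n >= min_support:
                current[cand] = c
        frequent.update(current)
        k += 1
    return frequent


frequent_itemsets = apriori(transactions, 0.30)
for itemset, support in sorted(
    frequent_itemsets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))
):
    print(sorted(itemset), support)
```

With five transactions, a 30% threshold means an item-set must appear in at least two of them, so every candidate containing both A and C, A and D, or C and D is dropped.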
Appendix C.
No. | Income | Age | Own House |
1. | Very High | Young | Yes |
2. | High | Medium | Yes |
3. | Low | Young | Rented |
4. | High | Medium | Yes |
5. | Very High | Medium | Yes |
6. | Medium | Young | Yes |
7. | High | Old | Yes |
8. | Medium | Medium | Rented |
9. | Low | Medium | Rented |
10. | Low | Old | Rented |
11. | High | Young | Yes |
(10 marks)
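As a rough aid for Question 6 (b), the sketch below computes the information gain of each attribute in Appendix C to pick the root of a decision tree and then follows the matching branch for the tuple (Very High, Old). Information gain (ID3-style) is an assumption here, since the question does not name a splitting criterion; the helper names `entropy` and `information_gain` are likewise illustrative.

```python
from collections import Counter
from math import log2

# Rows copied from Appendix C as (Income, Age, Own House).
# The label of row 7 is normalised to "Yes" so both spellings count as one class.
rows = [
    ("Very High", "Young", "Yes"),
    ("High", "Medium", "Yes"),
    ("Low", "Young", "Rented"),
    ("High", "Medium", "Yes"),
    ("Very High", "Medium", "Yes"),
    ("Medium", "Young", "Yes"),
    ("High", "Old", "Yes"),
    ("Medium", "Medium", "Rented"),
    ("Low", "Medium", "Rented"),
    ("Low", "Old", "Rented"),
    ("High", "Young", "Yes"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attr_index):
    """Entropy of the class label minus the weighted entropy after splitting."""
    base = entropy([r[-1] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attr_index], []).append(r[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


gains = {
    "Income": information_gain(rows, 0),
    "Age": information_gain(rows, 1),
}
root = max(gains, key=gains.get)

# Follow the Income = "Very High" branch for the tuple (Very High, Old).
branch_labels = {label for income, _age, label in rows if income == "Very High"}
prediction = branch_labels.pop() if len(branch_labels) == 1 else None

print(gains)
print("root attribute:", root)
print("prediction for (Very High, Old):", prediction)
```

Under this criterion the root split is on Income, and the Very High branch contains only "Yes" rows, so the tuple is classified without needing a further split on Age.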