Information Technology (Semester 6)
TOTAL MARKS: 80
TOTAL TIME: 3 HOURS
(1) Question 1 is compulsory.
(2) Attempt any three from the remaining questions.
(3) Assume data if required.
(4) Figures to the right indicate full marks.
1(a) What is Web Structure mining? What are the techniques used for it?(5 marks)
1(b) Compare OLTP and OLAP(5 marks)
1(c) Consider the following transaction database.
TID | Items
1 | A, B, D
2 | B, C, D
3 | A, B
4 | B, D
5 | A, B, C, D
Use the Apriori algorithm with a minimum support of 30% to find all frequent itemsets.(5 marks)
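For checking a hand-worked answer to 1(c), the level-wise join, prune, and count steps of Apriori can be sketched as below. This is a minimal illustration rather than a model answer: the names `TRANSACTIONS`, `MIN_SUPPORT`, and `apriori` are made up for this sketch, and support is assumed to mean the fraction of transactions that contain the itemset.

```python
from itertools import combinations

# Transactions copied from the table in 1(c).
TRANSACTIONS = [
    {"A", "B", "D"},
    {"B", "C", "D"},
    {"A", "B"},
    {"B", "D"},
    {"A", "B", "C", "D"},
]
MIN_SUPPORT = 0.30  # 30% of 5 transactions, i.e. at least 2 transactions


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for every frequent itemset."""
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support]
    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        k += 1
        previous = set(current)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in previous
                             for s in combinations(c, k - 1))}
        current = [c for c in candidates if support(c) >= min_support]
    return frequent


result = apriori(TRANSACTIONS, MIN_SUPPORT)
for itemset, sup in sorted(result.items(),
                           key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), f"{sup:.0%}")
```

With five transactions, a 30% threshold means an itemset must appear in at least two of them, so the pair {A, C} (one transaction) is dropped and no candidate containing it reaches the next level.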
1(d) What is a multidimensional association rule?(5 marks)
2 A manufacturing company has a huge sales network. To control sales, the network is divided into regions. Each region has multiple zones, and each zone has different cities. Each salesperson is allocated different cities. The objective is to track sales figures at different granularity levels of region, and also to count the number of products sold. Products are categorized as high-end and low-end products. Develop a BI application, taking into consideration the above granularity levels of region, salesperson, and product, and the quarterly, yearly, and monthly sales. It should also predict zone-wise and product-wise sales for subsequent quarters.
- Identify the facts and dimensions and hence draw the information package diagram.
- Design a suitable DWH schema.
- Identify a suitable DM algorithm for predicting the sales.
- Justify all the decisions you have taken for the design.
(20 marks)
3(a) Use the data given below. Create the adjacency matrix. Use the single-link or complete-link algorithm to cluster the given data set, and draw the dendrogram.
(10 marks)
3(b) With a neat diagram, explain the process of KDD, giving emphasis to the selection and pre-processing phases.(10 marks)
4(a) Use the K-means algorithm to create 2 clusters.
(10 marks)
4(b) What features are required in a mining algorithm to cluster stream data? Explain any one clustering algorithm used for stream data.(10 marks)
5(a) Explain Data Integration and Transformation w.r.t. Data Warehouse.(10 marks)
5(b) Explain the BIRCH algorithm with an example.(10 marks)
6(a) What is a concept hierarchy? How is a concept hierarchy generated for numerical and categorical data?(10 marks)
6(b) Using the table given below, create a classification model using a decision tree and hence classify the following tuple: (Very high, Old).
No. | Income | Age | Own house
1 | Very high | Young | Yes
2 | High | Medium | Yes
3 | Low | Young | Rented
4 | High | Medium | Yes
5 | Very high | Medium | Yes
6 | Medium | Young | Yes
7 | High | Old | Yes
8 | Medium | Medium | Rented
9 | Low | Medium | Rented
10 | Low | Old | Rented
11 | High | Young | Yes
12 | Medium | Old | Rented
(10 marks)
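As a way to check a hand-built tree for 6(b), the sketch below grows a small ID3-style tree using information gain and then classifies the tuple (Very high, Old). It rests on two assumptions not stated in the question: that "Own house" is the class label, and that information gain is the splitting criterion (gain ratio or Gini index would be equally valid choices). All names (`ROWS`, `ATTRS`, `build_tree`, `classify`) are illustrative only.

```python
from collections import Counter
from math import log2

# Rows copied from the table in 6(b): (Income, Age, Own house).
ROWS = [
    ("Very high", "Young", "Yes"),
    ("High", "Medium", "Yes"),
    ("Low", "Young", "Rented"),
    ("High", "Medium", "Yes"),
    ("Very high", "Medium", "Yes"),
    ("Medium", "Young", "Yes"),
    ("High", "Old", "Yes"),
    ("Medium", "Medium", "Rented"),
    ("Low", "Medium", "Rented"),
    ("Low", "Old", "Rented"),
    ("High", "Young", "Yes"),
    ("Medium", "Old", "Rented"),
]
ATTRS = {"Income": 0, "Age": 1}  # attribute name -> column index
LABEL = 2                        # assumed class column: "Own house"


def entropy(rows):
    """Shannon entropy of the class labels in rows."""
    counts = Counter(r[LABEL] for r in rows)
    total = len(rows)
    return -sum(c / total * log2(c / total) for c in counts.values())


def info_gain(rows, col):
    """Entropy reduction obtained by splitting rows on column col."""
    total = len(rows)
    remainder = 0.0
    for value in {r[col] for r in rows}:
        subset = [r for r in rows if r[col] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(rows) - remainder


def build_tree(rows, attrs):
    """Return a class label (leaf) or a (attribute, branches) tuple."""
    labels = {r[LABEL] for r in rows}
    if len(labels) == 1:
        return labels.pop()
    if not attrs:
        # No attributes left: fall back to the majority class.
        return Counter(r[LABEL] for r in rows).most_common(1)[0][0]
    best = max(attrs, key=lambda a: info_gain(rows, attrs[a]))
    rest = {a: c for a, c in attrs.items() if a != best}
    branches = {}
    for value in {r[attrs[best]] for r in rows}:
        subset = [r for r in rows if r[attrs[best]] == value]
        branches[value] = build_tree(subset, rest)
    return (best, branches)


def classify(tree, sample):
    """Follow branches until a leaf; unseen attribute values raise KeyError."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[sample[attr]]
    return tree


tree = build_tree(ROWS, ATTRS)
prediction = classify(tree, {"Income": "Very high", "Age": "Old"})
print(tree)
print("Predicted class:", prediction)
```

Under these assumptions Income gives the larger information gain and becomes the root. Both "Very high" rows in the table own their house, so the tuple (Very high, Old) is classified as "Yes" without consulting Age.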
Answer any two questions.
7(a) What is text mining? Explain the different approaches to text mining.(10 marks)
7(b) What is click-stream mining?(10 marks)
7(c) What are the applications of Web usage mining? What is a Web log? Give the typical structure of a Web log.(10 marks)