Computer Engineering (Semester 7)
Total marks: 80
Total time: 3 Hours
INSTRUCTIONS
(1) Question 1 is compulsory.
(2) Attempt any three from the remaining questions.
(3) Draw neat diagrams wherever necessary.
1.a.
For the given values of the attribute AGE: 16, 16, 180, 4, 12, 24, 26, 28, apply the following binning techniques for smoothing the noise.
i) Bin Medians
ii) Bin Boundaries
iii) Bin Means
(6 marks)
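To cross-check the hand computation, the following is a minimal Python sketch of equal-frequency binning with smoothing by means, medians, and boundaries. The bin depth of 4 is an assumption; the question does not fix the number of bins.

```python
# Minimal sketch: equal-frequency binning with smoothing by bin means,
# bin medians, and bin boundaries. A bin depth of 4 is assumed.
from statistics import mean, median

ages = [16, 16, 180, 4, 12, 24, 26, 28]
depth = 4  # assumed bin depth (not given in the question)

data = sorted(ages)
bins = [data[i:i + depth] for i in range(0, len(data), depth)]

by_means = [[round(mean(b), 2)] * len(b) for b in bins]
by_medians = [[median(b)] * len(b) for b in bins]
# Boundaries: replace each value by the closer of the bin's min and max.
by_boundaries = [
    [min(b) if v - min(b) <= max(b) - v else max(b) for v in b]
    for b in bins
]

print("Bins:          ", bins)
print("Bin means:     ", by_means)
print("Bin medians:   ", by_medians)
print("Bin boundaries:", by_boundaries)
```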
1.b.
Differentiate between Star schema and Snowflake schema.
(6 marks)
1.c.
Calculate the Jaccard coefficient between Ram and Hari, assuming that all binary attributes are asymmetric and that, for each pair of values of an attribute, the first one is more frequent than the second.
| Object | Gender | Food | Caste | Education | Hobby | Job  |
| :----- | :----: | :--: | :---: | :-------: | :---: | :--: |
| Hari   | M(1)   | V(1) | M(0)  | L(1)      | C(0)  | N(0) |
| Ram    | M(1)   | N(0) | M(0)  | I(0)      | T(1)  | N(0) |
| Tomi   | F(0)   | N(0) | H(1)  | L(1)      | C(0)  | Y(1) |
(8 marks)
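A minimal Python sketch of the Jaccard coefficient for asymmetric binary attributes follows; the 0/1 encodings are taken from the table above, and 0-0 matches are ignored as usual for asymmetric data.

```python
# Minimal sketch: Jaccard coefficient for asymmetric binary attributes.
# q = attributes where both objects are 1, r = first 1 / second 0,
# s = first 0 / second 1; 0-0 matches are not counted.
def jaccard(a, b):
    q = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    r = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    s = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    return q / (q + r + s)

# Encodings from the table (gender, food, caste, education, hobby, job).
hari = [1, 1, 0, 1, 0, 0]
ram  = [1, 0, 0, 0, 1, 0]
print(jaccard(hari, ram))  # q=1, r=2, s=1 -> 0.25
```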
OR
2.a.
Explain the following attribute types with examples.
i) Ordinal
ii) Binary
iii) Nominal
(6 marks)
2.b.
Differentiate between OLTP and OLAP with examples.
(6 marks)
2.c.
Calculate the Euclidean distance matrix for the given data points.
| Point | x | y |
| :---: | :-: | :-: |
| p1    | 0 | 2 |
| p2    | 2 | 0 |
| p3    | 3 | 1 |
| p4    | 5 | 1 |
(8 marks)
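One way to verify the matrix is the short Python sketch below, which prints the pairwise Euclidean distances for the four points.

```python
# Minimal sketch: pairwise Euclidean distance matrix for the four points.
from math import dist  # Python 3.8+

points = {"p1": (0, 2), "p2": (2, 0), "p3": (3, 1), "p4": (5, 1)}
names = list(points)

for a in names:
    row = [f"{dist(points[a], points[b]):.2f}" for b in names]
    print(a, row)
```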
3.a.
A database has 6 transactions. Let minimum support = 60% and minimum confidence = 70%.
| Transaction ID | Items Bought   |
| :------------- | :------------- |
| T1             | {A, B, C, E}   |
| T2             | {A, C, D, E}   |
| T3             | {B, C, E}      |
| T4             | {A, C, D, E}   |
| T5             | {C, D, E}      |
| T6             | {A, D, E}      |
i) Find the closed frequent itemsets
ii) Find the maximal frequent itemsets
iii) Design the FP-tree using the FP-growth algorithm
(8 marks)
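For parts (i) and (ii), the brute-force Python sketch below counts itemset supports over the six transactions and flags the closed and maximal frequent itemsets (minimum support 60% of 6 = 4 transactions); the FP-tree of part (iii) is still drawn by hand.

```python
# Minimal sketch: brute-force support counting to identify frequent,
# closed, and maximal itemsets (min support count assumed to be 4).
from itertools import combinations

transactions = [
    {"A", "B", "C", "E"}, {"A", "C", "D", "E"}, {"B", "C", "E"},
    {"A", "C", "D", "E"}, {"C", "D", "E"}, {"A", "D", "E"},
]
min_count = 4  # 60% of 6 transactions

items = sorted(set().union(*transactions))
support = {}
for k in range(1, len(items) + 1):
    for cand in combinations(items, k):
        cnt = sum(1 for t in transactions if set(cand) <= t)
        if cnt >= min_count:
            support[frozenset(cand)] = cnt

# Closed: no frequent superset has the same support.
closed = [s for s in support
          if not any(s < t and support[t] == support[s] for t in support)]
# Maximal: no frequent superset at all.
maximal = [s for s in support if not any(s < t for t in support)]

print("Frequent:", {tuple(sorted(s)): c for s, c in support.items()})
print("Closed:  ", [tuple(sorted(s)) for s in closed])
print("Maximal: ", [tuple(sorted(s)) for s in maximal])
```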
3.b.
Explain multilevel and constraint-based association rule mining with examples.
(5 marks)
3.c.
How can we improve the efficiency of the Apriori algorithm?
(4 marks)
OR
4.a.
Consider the market basket transactions shown below. Assume minimum support = 50% and minimum confidence = 80%.
i) Find all frequent itemsets using the Apriori algorithm
ii) Find all association rules using the Apriori algorithm
| Transaction ID | Items Bought                          |
| :------------- | :------------------------------------ |
| T1             | {Mango, Apple, Banana, Dates}         |
| T2             | {Apple, Dates, Coconut, Banana, Fig}  |
| T3             | {Apple, Coconut, Banana, Fig}         |
| T4             | {Apple, Banana, Dates}                |
(8 marks)
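The hand-derived answer can be checked with the Python sketch below, which enumerates the frequent itemsets (support count of at least 2 out of 4 transactions) and then emits every rule meeting the 80% confidence threshold.

```python
# Minimal sketch: frequent itemsets and association rules for the four
# fruit transactions (min support 50% = 2 transactions, min conf 80%).
from itertools import combinations

transactions = [
    {"Mango", "Apple", "Banana", "Dates"},
    {"Apple", "Dates", "Coconut", "Banana", "Fig"},
    {"Apple", "Coconut", "Banana", "Fig"},
    {"Apple", "Banana", "Dates"},
]

def count(itemset):
    return sum(1 for t in transactions if itemset <= t)

items = sorted(set().union(*transactions))
frequent = [frozenset(c) for k in range(1, len(items) + 1)
            for c in combinations(items, k) if count(set(c)) >= 2]

for itemset in (s for s in frequent if len(s) > 1):
    for r in range(1, len(itemset)):
        for lhs in map(frozenset, combinations(itemset, r)):
            conf = count(itemset) / count(lhs)
            if conf >= 0.8:
                print(f"{set(lhs)} -> {set(itemset - lhs)} (conf={conf:.2f})")
```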
4.b.
Explain FP growth algorithm with example.
(5 marks)
4.c.
Explain the following measures used in association rule mining.
i) Minimum Support
ii) Minimum Confidence
iii) Support
iv) Confidence
(4 marks)
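For reference, the standard definitions of support and confidence over a transaction database D are given below; minimum support and minimum confidence are the user-specified thresholds that a rule must satisfy to be considered strong.

```latex
\mathrm{support}(A \Rightarrow B) = P(A \cup B)
  = \frac{|\{\,T \in D : A \cup B \subseteq T\,\}|}{|D|}
\qquad
\mathrm{confidence}(A \Rightarrow B) = P(B \mid A)
  = \frac{\mathrm{support}(A \cup B)}{\mathrm{support}(A)}
```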
5.a.
Explain the training and testing phases of a Decision Tree classifier in detail. Support your answer with a relevant example.
(8 marks)
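The outline of the two phases can be illustrated with a short scikit-learn sketch (assuming scikit-learn is available; the Iris dataset merely stands in for any labelled data): the tree is induced on a training split and then evaluated on a held-out test split.

```python
# Minimal sketch: decision tree training and testing on a labelled dataset.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Training phase: induce the tree (entropy / information gain splits).
clf = DecisionTreeClassifier(criterion="entropy", max_depth=3)
clf.fit(X_train, y_train)

# Testing phase: classify unseen tuples and measure accuracy.
y_pred = clf.predict(X_test)
print("Test accuracy:", round(accuracy_score(y_test, y_pred), 3))
```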
5.b.
Apply the KNN algorithm to find the class of a new tissue paper sample (X1 = 3, X2 = 7). Assume K = 3.
| X1 = Acid Durability (secs) | X2 = Strength (kg/sq. metre) | Y = Classification |
| :-------------------------: | :--------------------------: | :----------------: |
| 7 | 7 | Bad  |
| 7 | 4 | Bad  |
| 3 | 4 | Good |
| 1 | 4 | Good |
(5 marks)
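The hand calculation can be verified with this minimal Python sketch, which ranks the training samples by Euclidean distance from the query and takes a majority vote over the 3 nearest neighbours.

```python
# Minimal sketch: 3-nearest-neighbour classification of the query
# (X1 = 3, X2 = 7) using Euclidean distance and majority vote.
from collections import Counter
from math import dist

train = [((7, 7), "Bad"), ((7, 4), "Bad"), ((3, 4), "Good"), ((1, 4), "Good")]
query, k = (3, 7), 3

nearest = sorted(train, key=lambda row: dist(row[0], query))[:k]
print([(p, label, round(dist(p, query), 2)) for p, label in nearest])
print("Predicted class:", Counter(label for _, label in nearest).most_common(1)[0][0])
```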
5.c.
Explain the use of a regression model in the prediction of real estate prices.
(4 marks)
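As an illustration of the idea, the sketch below fits an ordinary least-squares line predicting price from floor area; the data values are purely hypothetical.

```python
# Minimal sketch: ordinary least squares on a single feature
# (area in sq. ft. vs. price); the data points are purely illustrative.
areas  = [600, 800, 1000, 1200, 1500]
prices = [30, 42, 50, 65, 78]  # hypothetical prices

n = len(areas)
mean_x, mean_y = sum(areas) / n, sum(prices) / n
b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(areas, prices)) / \
     sum((x - mean_x) ** 2 for x in areas)
b0 = mean_y - b1 * mean_x

print(f"price = {b0:.2f} + {b1:.4f} * area")
print("Predicted price for 1100 sq. ft.:", round(b0 + b1 * 1100, 2))
```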
OR
6.a.
What is a Bayesian Belief Network? Elaborate on the training process of a Bayesian Belief Network with a suitable example.
(8 marks)
6.b.
Explain the K-nearest neighbour classifier algorithm with a suitable application.
(5 marks)
6.c.
Elaborate on Associative Classification with appropriate applications.
(4 marks)
7.a.
Discuss the Sequential Covering algorithm in detail.
(8 marks)
7.b.
Explain the following measures for evaluating classifier accuracy.
i) Specificity
ii) Sensitivity
(4 marks)
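A minimal Python sketch of the two measures, computed from hypothetical confusion-matrix counts, is given below.

```python
# Minimal sketch: sensitivity and specificity from a confusion matrix
# (the counts below are hypothetical).
tp, fn, tn, fp = 40, 10, 45, 5

sensitivity = tp / (tp + fn)  # true positive rate (recall)
specificity = tn / (tn + fp)  # true negative rate
print(f"Sensitivity = {sensitivity:.2f}, Specificity = {specificity:.2f}")
```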
7.c.
Differentiate between wholistic learning and multi-perspective learning.
(4 marks)
OR
8.a.
How is the performance of classification algorithms evaluated? Discuss in detail.
(8 marks)
8.b.
Discuss the relevance of reinforcement learning and its applications in real-time environments.
(4 marks)
8.c.
Explain the following measures for evaluating classifier accuracy.
i) Recall
ii) Precision
(4 marks)
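These two measures can likewise be sketched from hypothetical prediction counts, as below.

```python
# Minimal sketch: precision and recall from hypothetical prediction counts.
tp, fp, fn = 40, 5, 10

precision = tp / (tp + fp)  # fraction of predicted positives that are correct
recall = tp / (tp + fn)     # fraction of actual positives that are found
print(f"Precision = {precision:.2f}, Recall = {recall:.2f}")
```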