0
7.5kviews
Explain sort-merge join algorithm in query processing.

Mumbai University > Computer Engineering > Sem 4 > Database Management System

Marks: 10M

Year: Dec 2014, May 2014

1 Answer
0
153views
  • If the records of R and S are physically sorted by value of the join attributes AandB, respectively, the join can be implemented in the most efficient way possible.
  • Both files are scanned concurrently in order of the join attributes, matching the records that have the same values for A and B.
  • One cursor is maintained per table and the cursor moves forward.
  • If the files are not sorted, they may be sorted first by using external sorting. In this method, pairs of file blocks are copied into memory buffers in order and the records of each file are scanned only once each for matching with the other file unless both A and B are non-key attributes, in which case the method needs to be modified slightly

Sort-Merge Join Algorithm:

R(i) refers to the ith record in file R.

S(j) refers to jth record in file S

sort the tuples in R on attribute A;  //assume R has n tuples (records) 

sort the tuples in S on attribute B;  //assume S has m tuples (records) 

seti ← 1, j ← 1; 
while (i ≤ n) and (j ≤ m) 
do { 
     if R(i)[A] > S(j)[B] 
     then set j ← j + 1 
     elseif R(i)[A] < S(j)[B] 
     then set i ← i + 1 
     else  {
      output the combined tuple <R(i), S(j)> to T;
         //R(i)[A] = S(j)[B], so we output a matched tuple *) 

(* output other tuples that match R(i), if any *) 
set I ← j + 1; 
while (l ≤ m) and (R(i)[A] = S(l)[B]) 
do
{ output the combined tuple <R(i), S(l)> to T; 
set l ← l + 1 
}
(* output other tuples that match S(j), if any *) 
set k ← i + 1; 
while (k ≤ n) and (R(k)[A] = S(j)[B]) 
do
{ output the combined tuple <R(k), S(j)> to T; 
set k ← k + 1 
} 
seti ← k, j ← l
}
}

Best Case Cost: (M+N) when already sorted.

Worst Case Cost: M log M + N log N + (M+N)

Please log in to add an answer.