written 6.0 years ago by | modified 5.7 years ago by |
Mumbai University > Computer Engineering > Sem 8 > Parallel & Distributed System
written 6.0 years ago by | modified 5.7 years ago by |
Mumbai University > Computer Engineering > Sem 8 > Parallel & Distributed System
written 5.7 years ago by | • modified 5.7 years ago |
Parallel algorithm for matrix multiplication-
A = $\begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{bmatrix}$
B = $\begin{bmatrix} b_{11} & b_{12} & b_{13} & b_{14} \\ b_{21} & b_{22} & b_{23} & b_{24} \\ b_{31} & b_{32} & b_{33} & b_{34} \\ b_{41} & b_{42} & b_{43} & b_{44} \end{bmatrix}$
A*B = C
C = $\begin{bmatrix} c_{11} & c_{12} & c_{13} & c_{14} \\ c_{21} & c_{22} & c_{23} & c_{24} \\ c_{31} & c_{32} & c_{33} & c_{34} \\ c_{41} & c_{42} & c_{43} & c_{44} \end{bmatrix}$
Step 1: Initialization:
Distribute $i^{th}$ row and $i^{th}$ column of matrix A & B to $i^{th}$ PE.
Step 2: Do N times
a. Take inner product of two vector
b. Rotate vectors of B columns by one PE position
Each inner product generated represents an element of each row of resultant matrix.
Each PE computes inner product of two vectors available to it.
Example: Step 2(1) a: inner product
$PE_{1} $ | $PE_{2} $ | $PE_{3} $ | $PE_{4} $ |
---|---|---|---|
$a_{11}b_{11} \\ a_{12}b_{21} \\ a_{13}b_{31} \\ a_{14}b_{41} \\ c_{11} $ | $a_{21}b_{12} \\ a_{22}b_{22} \\ a_{23}b_{32} \\ a_{24}b_{42} \\ c_{22} $ | $a_{31}b_{13} \\ a_{32}b_{23} \\ a_{33}b_{33} \\ a_{34}b_{43} \\ c_{33} $ | $a_{41}b_{14} \\ a_{42}b_{24} \\ a_{43}b_{34} \\ a_{44}b_{44} \\ c_{44} $ |
Step 2(1)b : Rotate column of B left by 1 position.
Step 2(2)a : Inner product
$PE_{1} $ | $PE_{2} $ | $PE_{3} $ | $PE_{4} $ |
---|---|---|---|
$a_{11}b_{12} \\ a_{12}b_{22} \\ a_{13}b_{32} \\ a_{14}b_{42} $ | $a_{21}b_{13} \\ a_{22}b_{23} \\ a_{23}b_{33} \\ a_{24}b_{43} $ | $a_{31}b_{14} \\ a_{32}b_{24} \\ a_{33}b_{34} \\ a_{34}b_{44} $ | $a_{41}b_{11} \\ a_{42}b_{22} \\ a_{43}b_{33} \\ a_{44}b_{44} $ |
Results (including previously calculation):
$\begin{bmatrix} c_{11} & c_{22} & c_{33} & c_{44} \\ c_{12} & c_{23} & c_{34} & c_{41} \end{bmatrix}$
Step 2(2)b: Rotate column of B lef by 1 position.
Step 2(3)a: Inner product
$PE_{1} $ | $PE_{2} $ | $PE_{3} $ | $PE_{4} $ |
---|---|---|---|
$a_{11}b_{14} \\ a_{12}b_{24} \\ a_{13}b_{34} \\ a_{14}b_{44} $ | $a_{21}b_{11} \\ a_{22}b_{21} \\ a_{23}b_{31} \\ a_{24}b_{41} $ | $a_{31}b_{12} \\ a_{32}b_{22} \\ a_{33}b_{32} \\ a_{34}b_{42} $ | $a_{41}b_{13} \\ a_{42}b_{23} \\ a_{43}b_{33} \\ a_{44}b_{43} $ |
Results:(including previously calculation):
$\begin{bmatrix} c_{11} & c_{22} & c_{33} & c_{44} \\ c_{12} & c_{23} & c_{34} & c_{41} \\ c_{13} & c_{24} & c_{31} & c_{42} \end{bmatrix}$
Step 2(3)b: Rotate column of B left by 1 position.
Step 2(4)a: Inner Product
$PE_{1} $ | $PE_{2} $ | $PE_{3} $ | $PE_{4} $ |
---|---|---|---|
$a_{11}b_{14} \\ a_{12}b_{24} \\ a_{13}b_{34} \\ a_{14}b_{44} $ | $a_{21}b_{11} \\ a_{22}b_{21} \\ a_{23}b_{31} \\ a_{24}b_{41} $ | $a_{31}b_{12} \\ a_{32}b_{22} \\ a_{33}b_{32} \\ a_{34}b_{42} $ | $a_{41}b_{13} \\ a_{42}b_{23} \\ a_{43}b_{33} \\ a_{44}b_{43} $ |
Results:(including previously calculation):
$\begin{bmatrix} c_{11} & c_{22} & c_{33} & c_{44} \\ c_{12} & c_{23} & c_{34} & c_{41} \\ c_{13} & c_{24} & c_{31} & c_{42} \\ c_{14} & c_{21} & c_{32} & c_{43} \\ \end{bmatrix}$
Final solution: Step 2(4)b: Rotate column of B left by 1 position
Step 2(5)c: Inner product
$PE_{1} $ | $PE_{2} $ | $PE_{3} $ | $PE_{4} $ |
---|---|---|---|
$a_{11}b_{11} \\ a_{12}b_{21} \\ a_{13}b_{31} \\ a_{14}b_{41} \\ $ | $a_{21}b_{12} \\ a_{22}b_{22} \\ a_{23}b_{32} \\ a_{24}b_{42} \\ $ | $a_{31}b_{13} \\ a_{32}b_{23} \\ a_{33}b_{33} \\ a_{34}b_{43} \\ $ | $a_{41}b_{14} \\ a_{42}b_{24} \\ a_{43}b_{34} \\ a_{44}b_{44} \\ $ |
The last routing step is not necessary but if performed shall have the maintain in the original configuration.
The time complexity is-
parallel algorithm is $O(n^2)$
Sequential algorithm is $O(n^2)$