The Multiple Linear Regression Model with n independent variables is written as follows:
$$Y = a + b_1X_1 + b_2X_2 + b_3X_3 + \dots + b_nX_n + u$$
Where,
Y = The variable to be predicted (dependent variable)
X1 to Xn = The variables used to predict Y (independent variables)
a = The intercept
b1 to bn = The slopes (regression coefficients), one for each independent variable
u = The regression residual
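As a small illustration of how the model is evaluated once the intercept and slopes are known, here is a minimal Python sketch. The function name `predict` and the coefficient values are made up for this example and do not come from the worked problem; the residual u is left out because it is not known when predicting.

```python
def predict(a, b, x):
    """Return a + b[0]*x[0] + ... + b[n-1]*x[n-1] (the residual u is not modelled)."""
    if len(b) != len(x):
        raise ValueError("expected one slope per independent variable")
    return a + sum(b_i * x_i for b_i, x_i in zip(b, x))


# Made-up coefficients, purely for illustration: a = 1, b = (2, 3).
print(predict(1.0, [2.0, 3.0], [4.0, 5.0]))  # 1 + 2*4 + 3*5 = 24.0
```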
Formulae -
For a regression with two independent variables, the intercept (a) and the regression coefficients (b1, b2) can be calculated with the formulas below.
Intercept (a)
$$ a = \overline Y - b_1(\overline X_1) - b_2(\overline X_2) $$
Regression Coefficients (b1, b2)
$$ b_1 = \frac {(\sum x_2^2)(\sum x_1y) - (\sum x_1x_2)(\sum x_2y)}{(\sum x_1^2)(\sum x_2^2) - (\sum x_1x_2)^2} $$
$$ b_2 = \frac {(\sum x_1^2)(\sum x_2y) - (\sum x_1x_2)(\sum x_1y)}{(\sum x_1^2)(\sum x_2^2) - (\sum x_1x_2)^2} $$
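Where N is the number of observations, and the lowercase x1, x2 and y denote deviations from the respective means, so that: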
$$ \sum x_1^2 = \sum X_1X_1 - \frac {(\sum X_1)(\sum X_1)}{N} $$
$$ \sum x_2^2 = \sum X_2X_2 - \frac {(\sum X_2)(\sum X_2)}{N} $$
$$ \sum x_1y = \sum X_1Y - \frac {(\sum X_1)(\sum Y)}{N} $$
$$ \sum x_2y = \sum X_2Y - \frac {(\sum X_2)(\sum Y)}{N} $$
$$ \sum x_1x_2 = \sum X_1X_2 - \frac {(\sum X_1)(\sum X_2)}{N} $$
$$ \overline Y = \frac {\sum Y}{N} $$
$$ \overline X_1 = \frac {\sum X_1}{N} $$
$$ \overline X_2 = \frac {\sum X_2}{N} $$
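The expressions for b1 and b2 can be read as the solution of a pair of simultaneous equations. As a brief sketch (a standard least-squares result, stated here without proof), minimizing the sum of squared residuals gives the following two equations in deviation form:

$$ b_1 \sum x_1^2 + b_2 \sum x_1x_2 = \sum x_1y $$
$$ b_1 \sum x_1x_2 + b_2 \sum x_2^2 = \sum x_2y $$

Solving this pair for b1 and b2 (for example with Cramer's rule) yields the two coefficient formulas given earlier. Their common denominator is the determinant of the system:

$$ (\sum x_1^2)(\sum x_2^2) - (\sum x_1x_2)^2 $$

This quantity must be non-zero, which fails when X1 and X2 are perfectly linearly related.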
Step 1
First, calculate all the values required in the above formulae.
| Subject | Y | X1 | X2 | X1X2 | X1X1 | X2X2 | X1Y | X2Y |
|---|---|---|---|---|---|---|---|---|
| 1 | -3.7 | 3 | 8 | 24 | 9 | 64 | -11.1 | -29.6 |
| 2 | 3.5 | 4 | 5 | 20 | 16 | 25 | 14 | 17.5 |
| 3 | 2.5 | 5 | 7 | 35 | 25 | 49 | 12.5 | 17.5 |
| 4 | 11.5 | 6 | 3 | 18 | 36 | 9 | 69 | 34.5 |
| 5 | 5.7 | 2 | 1 | 2 | 4 | 1 | 11.4 | 5.7 |
| SUM | 19.5 | 20 | 24 | 99 | 90 | 148 | 95.8 | 45.6 |
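The totals in the SUM row can be reproduced with a few lines of Python. This is a minimal sketch that assumes the raw Y, X1 and X2 values from the table; the helper name `column_sums` is hypothetical.

```python
# Raw observations from the table (Subjects 1 to 5).
Y = [-3.7, 3.5, 2.5, 11.5, 5.7]
X1 = [3, 4, 5, 6, 2]
X2 = [8, 5, 7, 3, 1]


def column_sums(y, x1, x2):
    """Return the column totals needed by the coefficient formulae."""
    return {
        "Y": sum(y),
        "X1": sum(x1),
        "X2": sum(x2),
        "X1X2": sum(p * q for p, q in zip(x1, x2)),
        "X1X1": sum(p * p for p in x1),
        "X2X2": sum(q * q for q in x2),
        "X1Y": sum(p * v for p, v in zip(x1, y)),
        "X2Y": sum(q * v for q, v in zip(x2, y)),
    }


sums = column_sums(Y, X1, X2)
for name, total in sums.items():
    # Rounding only hides floating-point noise in the printed totals.
    print(f"{name:>5}: {round(total, 4)}")
```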
Step 2
Then substitute these sums into the formulae above to obtain the deviation sums, and from them the regression coefficients b1 and b2.
$$ \sum x_1^2 = \sum X_1X_1 - \frac {(\sum X_1)(\sum X_1)}{N} = 90 - \frac {20 \times 20}{5} = 10 $$
$$ \sum x_2^2 = \sum X_2X_2 - \frac {(\sum X_2)(\sum X_2)}{N} = 148 - \frac {24 \times 24}{5} = 32.8 $$
$$ \sum x_1y = \sum X_1Y - \frac {(\sum X_1)(\sum Y)}{N} = 95.8 - \frac {20 \times 19.5}{5} = 17.8 $$
$$ \sum x_2y = \sum X_2Y - \frac {(\sum X_2)(\sum Y)}{N} = 45.6 - \frac {24 \times 19.5}{5} = -48 $$
$$ \sum x_1x_2 = \sum X_1X_2 - \frac {(\sum X_1)(\sum X_2)}{N} = 99 - \frac {20 \times 24}{5} = 3 $$
$$ b_1 = \frac {(\sum x_2^2)(\sum x_1y) - (\sum x_1x_2)(\sum x_2y)}{(\sum x_1^2)(\sum x_2^2) - (\sum x_1x_2)^2} $$
$$ b_1 = \frac {(32.8 \times 17.8) - (3 \times (-48))}{(10 \times 32.8) - (3)^2} = 2.2816 $$
$$ b_2 = \frac {(\sum x_1^2)(\sum x_2y) - (\sum x_1x_2)(\sum x_1y)}{(\sum x_1^2)(\sum x_2^2) - (\sum x_1x_2)^2} $$
$$ b_2 = \frac {(10 \times (-48)) - (3 \times 17.8)}{(10 \times 32.8) - (3)^2} = -1.672 $$
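The arithmetic of this step can be checked with a short Python sketch. It assumes only the column totals from the SUM row of the table; the variable names (`s11`, `s12` and so on) are illustrative shorthand for the deviation sums and are not notation from the formulae.

```python
N = 5  # number of subjects

# Column totals taken from the SUM row of the table.
sum_y, sum_x1, sum_x2 = 19.5, 20, 24
sum_x1x2, sum_x1x1, sum_x2x2 = 99, 90, 148
sum_x1y, sum_x2y = 95.8, 45.6

# Deviation sums (the lowercase quantities in the formulae).
s11 = sum_x1x1 - sum_x1 * sum_x1 / N
s22 = sum_x2x2 - sum_x2 * sum_x2 / N
s1y = sum_x1y - sum_x1 * sum_y / N
s2y = sum_x2y - sum_x2 * sum_y / N
s12 = sum_x1x2 - sum_x1 * sum_x2 / N

# Shared denominator of both coefficient formulae.
denominator = s11 * s22 - s12 ** 2
if denominator == 0:
    raise ZeroDivisionError("X1 and X2 are perfectly collinear")

b1 = (s22 * s1y - s12 * s2y) / denominator
b2 = (s11 * s2y - s12 * s1y) / denominator

print(round(b1, 4), round(b2, 4))  # 2.2816 -1.6721
```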
Step 3
Calculate the value of Intercept a
$$ a = \overline Y - b_1(\overline X_1) - b_2(\overline X_2) = \frac {19.5}{5} - \frac {2.2816 \times 20}{5} - \frac {(-1.672) \times 24}{5} = 3.9 - 9.1264 + 8.0256 = 2.7992 $$
Step 4
The final Regression Equation or Model looks as follows:
$$ Y = 2.799 + 2.28X_1 - 1.67X_2 $$
Therefore, for given X1 = 3 and X2 = 2, the predicted value of Y is calculated as follows:
$$ Y = 2.799 + (2.28 \times 3) - (1.67 \times 2) $$
$$ Y = 6.299 $$
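As a cross-check on the hand calculation, the fit can be recomputed from the raw table values without intermediate rounding. The sketch below is an illustration rather than part of the worked example, and the function name `fit_two_predictors` is hypothetical. It computes the deviations from the means directly, then verifies a standard least-squares property: the residuals sum to zero and are orthogonal to each predictor. Because the coefficients are not rounded here, the printed prediction may differ slightly from the hand-computed one.

```python
Y = [-3.7, 3.5, 2.5, 11.5, 5.7]
X1 = [3, 4, 5, 6, 2]
X2 = [8, 5, 7, 3, 1]


def fit_two_predictors(y, x1, x2):
    """Fit Y = a + b1*X1 + b2*X2 using deviations from the means."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    dy = [v - my for v in y]
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    s11 = sum(p * p for p in d1)
    s22 = sum(q * q for q in d2)
    s12 = sum(p * q for p, q in zip(d1, d2))
    s1y = sum(p * v for p, v in zip(d1, dy))
    s2y = sum(q * v for q, v in zip(d2, dy))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("X1 and X2 are perfectly collinear")
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    a = my - b1 * m1 - b2 * m2
    return a, b1, b2


a, b1, b2 = fit_two_predictors(Y, X1, X2)
residuals = [y - (a + b1 * p + b2 * q) for y, p, q in zip(Y, X1, X2)]

# For a least-squares fit these three sums should be numerically zero.
checks = (
    sum(residuals),
    sum(e * p for e, p in zip(residuals, X1)),
    sum(e * q for e, q in zip(residuals, X2)),
)
print([abs(c) < 1e-9 for c in checks])

# Prediction for X1 = 3, X2 = 2 with unrounded coefficients.
prediction = a + b1 * 3 + b2 * 2
print(round(prediction, 4))
```

If any of the three checks printed `False`, that would point to an arithmetic slip in the coefficients or the intercept.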