The purpose of multiple regression is to find the linear equation that best predicts the value of the dependent variable Y from the values of the independent variables X1, X2, …, XN.
The basic equation of multiple regression is:
Y = a + b1X1 + b2X2 + b3X3 + … + bNXN
The value of b1 is the slope of Y against X1 when the other independent variables are held constant. The same holds for b2, b3 and so on. The coefficients are chosen to minimize the sum of squared differences between the actual and predicted values of Y (the least squares criterion). The difference that remains gives rise to another quantity, the coefficient of multiple determination (R2), whose value ranges from 0 (no linear relationship between the Xi and Y) to 1 (a perfect linear relationship between the Xi and Y).
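The fitting step can be illustrated with a short sketch. It assumes ordinary least squares as the fitting method and uses NumPy; the data values, the noise terms and the variable names are made up for illustration and are not from any real study.

```python
import numpy as np

# Hypothetical data: 6 observations of two independent variables (X1, X2).
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 6.0],
              [6.0, 5.0]])

# Made-up responses: Y = 1 + 2*X1 + 3*X2 plus a small fixed disturbance.
noise = np.array([0.2, -0.1, 0.0, 0.1, -0.2, 0.0])
Y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1] + noise

# Add a column of ones so the intercept a is estimated with the slopes.
A = np.column_stack([np.ones(len(X)), X])

# Least squares: choose coefficients minimizing the sum of squared residuals.
coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
a, b = coef[0], coef[1:]

# R2 = 1 - (residual sum of squares) / (total sum of squares).
Y_hat = A @ coef
ss_res = np.sum((Y - Y_hat) ** 2)
ss_tot = np.sum((Y - Y.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

print("a =", a, "b =", b, "R2 =", r2)
```

Because the disturbance is small, the estimated coefficients land close to 1, 2 and 3, and R2 is close to 1. With noisier data R2 would fall toward 0.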
R2 describes the fitted equation as a whole, not an individual variable. In practice, only those independent variables that noticeably increase R2 are kept in the multiple regression equation.
The multiple regression equation has two main uses.
1. For Prediction:
Here the regression equation is used to predict the value of the dependent variable Y for different values of the independent variables in X.
2. For identification of causes:
Here the regression equation is used to examine the nature of the relationship between the dependent variable and the independent variables, that is, how the dependent variable changes as each independent variable changes. Such a relationship suggests a possible cause but does not by itself prove one.
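Both uses can be sketched in a few lines of Python. The intercept, the slopes and the input values below are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import numpy as np

a = 2.0                     # hypothetical intercept
b = np.array([0.5, 1.5])    # hypothetical slopes b1, b2

def predict(x):
    """Return Y = a + b1*X1 + b2*X2 for one observation x = [X1, X2]."""
    return a + np.dot(b, x)

# Use 1, prediction: plug in values of the independent variables.
x_old = np.array([10.0, 4.0])
y_old = predict(x_old)      # 2 + 0.5*10 + 1.5*4 = 13.0

# Use 2, nature of the relationship: raise X1 by one unit, keep X2 fixed.
x_new = np.array([11.0, 4.0])
change = predict(x_new) - y_old   # equals b1 = 0.5

print(y_old, change)
```

The change in the predicted Y for a one-unit increase in X1, with X2 unchanged, is exactly the slope b1.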
For example, suppose we want to identify the factors that affect short-listing for an interview at a renowned organization. Let the cumulative score (Y) be determined by graduation percentage (X1), participation in extra-curricular activities (X2), number of state/national level competitions won (X3), positions of responsibility held (X4), etc. Then the multiple regression function might be given by:
Y = 1 + 0.5X1 + 0.2X2 + 0.15X3 + 0.15X4
where the coefficients 0.5, 0.2, 0.15 and 0.15 are estimated together by fitting the whole equation to the observed data using least squares, not by a separate simple linear regression of Y on each Xi.
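As a rough illustration, the equation above can be evaluated for one candidate. The coefficients come from the example; the candidate's values, the dictionary keys and the function name are hypothetical.

```python
# Coefficients taken from the example equation Y = 1 + 0.5X1 + 0.2X2 + 0.15X3 + 0.15X4.
INTERCEPT = 1.0
WEIGHTS = {
    "graduation_pct": 0.5,      # X1
    "extracurricular": 0.2,     # X2
    "competitions_won": 0.15,   # X3
    "positions_held": 0.15,     # X4
}

def cumulative_score(candidate):
    """Return the predicted cumulative score Y for one candidate."""
    return INTERCEPT + sum(WEIGHTS[name] * candidate[name] for name in WEIGHTS)

# Hypothetical candidate, values invented for illustration.
candidate = {
    "graduation_pct": 80.0,
    "extracurricular": 3.0,
    "competitions_won": 2.0,
    "positions_held": 1.0,
}

score = cumulative_score(candidate)
print(round(score, 2))  # 1 + 40 + 0.6 + 0.3 + 0.15, about 42.05
```

Graduation percentage dominates this score only because it is measured on a much larger scale than the other variables, so coefficient sizes should be compared with the units of each Xi in mind.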