Simple linear regression model serves two purposes:
1. It describes the linear dependence of one variable on another
2. It can predict values of one variable from values of another based on historical relationship between independent and dependent variable.
Mathematically the simple linear regression model can be defined as,
For a given a set of n points (Xi,Yi)
Find the best fit line Ŷi = aX + b, such that the sum of squared errors in Y, ∑ (Yi – Ŷi)2 is minimized
Where, Yi= actual value and Ŷi= Predicted value
b=slope of the line = Co-variance(X, Y)/Variance (X)
Example: Let’s look at the data of sales and advertising spend of company ABC for past 5 years. We will try to find relation between Advertising budget and sales, if any.
A scatter plot is drawn for the available data. If the straight line is drawn such that line has the smallest possible set of distances between itself and each data point, then that line become linear regression line.
As we can see from the plot, there is a strong relation between advertising spending and sales achieved.
The simple linear regression model can be expressed by straight line equation as follows,