Correlation indicates only the degree and direction of the relationship between two variables. It does not necessarily imply a cause-and-effect relationship. Even when there are grounds to believe that a causal relationship exists, correlation does not tell us which variable is the cause and which is the effect. For example, the demand for a commodity and its price will generally be found to be correlated, but correlation cannot answer the question of whether demand depends on price or vice versa.
The dictionary meaning of ‘regression’ is the act of returning or going back. The term ‘regression’ was first used by Francis Galton in 1877 while studying the relationship between the heights of fathers and sons.
“Regression is the measure of the average relationship between two or more variables in terms of the original units of data.”
The line of regression is the line that gives the best estimate of the value of one variable for any specified value of the other variable.
In a regression analysis of two variables, there are two regression lines: one for the regression of x on y and the other for the regression of y on x.
These two regression lines show the average relationship between the two variables. The regression line of y on x gives the most probable value of y for a given value of x, and the regression line of x on y gives the most probable value of x for a given value of y.
For perfect correlation, positive or negative, i.e. for r = ±1, the two lines coincide, so we find only one straight line. If r = 0, i.e. the two variables are uncorrelated, the two lines cut each other at a right angle. In this case the line of y on x is parallel to the x-axis and the line of x on y is parallel to the y-axis.
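The two limiting cases can be checked numerically. The sketch below is only an illustration: the function names and the small data sets are assumptions made for this example, and it uses the standard least-squares slopes (covariance divided by the variance of the explanatory variable). It measures the angle between the two regression lines when both are drawn in the same x-y plane.

```python
import math

def regression_slopes(x, y):
    """Return (b_yx, b_xy): least-squares slope of y on x, and of x on y.

    Assumes both x and y have non-zero variance.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sxx, sxy / syy

def angle_between_lines(x, y):
    """Angle in degrees (0 to 90) between the two regression lines."""
    b_yx, b_xy = regression_slopes(x, y)
    # Direction vectors (dx, dy) in the x-y plane:
    # the line of y on x moves (1, b_yx); the line of x on y moves (b_xy, 1).
    u = (1.0, b_yx)
    v = (b_xy, 1.0)
    dot = abs(u[0] * v[0] + u[1] * v[1])
    cos_theta = dot / (math.hypot(*u) * math.hypot(*v))
    # Clamp to 1.0 to guard against floating-point rounding.
    return math.degrees(math.acos(min(1.0, cos_theta)))

# Hypothetical data with r = +1 (y = 1 + 2x): the lines coincide.
print(angle_between_lines([1, 2, 3, 4], [3, 5, 7, 9]))      # close to 0

# Hypothetical data with r = 0: the lines are perpendicular.
print(angle_between_lines([-1, 1, -1, 1], [-1, -1, 1, 1]))  # close to 90
```

For values of r strictly between 0 and ±1 the same function returns an angle between these two extremes, and the angle shrinks as the correlation becomes stronger.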
The graph is given below:
We restrict our discussion to linear relationships only; that is, the equations to be considered are (1) y = a + bx and (2) x = a + by, where the constants a and b generally take different values in the two equations.
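As a rough illustration of how the constants in these two equations might be estimated, the sketch below fits each line by ordinary least squares. The helper name fit_line and the data values are hypothetical and chosen only for this example; the text above does not prescribe a particular estimation method.

```python
def fit_line(u, v):
    """Least-squares fit of v = a + b*u; returns (a, b).

    Assumes u has non-zero variance.
    """
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    b = (sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
         / sum((ui - mu) ** 2 for ui in u))
    a = mv - b * mu
    return a, b

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a1, b1 = fit_line(x, y)  # equation (1): y = a + b*x (regression of y on x)
a2, b2 = fit_line(y, x)  # equation (2): x = a + b*y (regression of x on y)

print(f"y = {a1:.2f} + {b1:.2f} x")
print(f"x = {a2:.2f} + {b2:.2f} y")

# Estimate y for a given x from equation (1), and x for a given y from (2).
print(f"estimated y at x = 6: {a1 + b1 * 6:.2f}")
print(f"estimated x at y = 3: {a2 + b2 * 3:.2f}")
```

With these illustrative numbers the two fits have different intercepts and slopes, which is why the constants in equations (1) and (2) should not be read as the same quantities.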
In the first equation (y = a + bx), x is called the independent variable and y the dependent variable. Conditional on the x