Aim: estimate the causal effect on Y of a unit change in X
Slope: expected change in Y for a unit change in X, where E[Y|X] = b0 + b1X
Method: minimize the sum of squared errors (equivalently, the average squared difference between actual Yi and predicted Yi): min sum u_i^2 (OLS)
u = error, which contains omitted factors that influence Y and are not captured in X, and also measurement error in Y
b0 and b1 are population parameters, the hats are the estimates; we pick the hats so that the sum of squared residuals is minimized
Interpretation: a one-unit increase in X changes Y by beta1 on average
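A minimal sketch of the OLS estimates for one regressor, using the closed form b1hat = Sxy/Sx^2 and b0hat = Ybar - b1hat*Xbar. The function name `ols_fit` and the data are made up for illustration only.

```python
def ols_fit(x, y):
    """Return (b0_hat, b1_hat), the values that minimize the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sample covariance and sample variance share the 1/(n-1) factor, so it cancels.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1_hat = s_xy / s_xx
    b0_hat = y_bar - b1_hat * x_bar
    return b0_hat, b1_hat


# Hypothetical data, not from any real study.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0_hat, b1_hat = ols_fit(x, y)
# b1_hat reads as: one more unit of X goes with b1_hat more units of Y on average.
print(f"b0_hat = {b0_hat:.3f}, b1_hat = {b1_hat:.3f}")
```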
Measure of fit:
1) R^2 – measures the fraction of the variance of Y that is explained by X, between [0,1]; R^2 = ESS/TSS = sum (Yhat_i – Ybar)^2 / sum (Y_i – Ybar)^2
2) SER – measures the magnitude of a typical regression residual in the units of Y; measures the spread of the dist of u
3) RMSE – the same as SER but divides by n, not n-2
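The three fit measures can be sketched as follows. This is an illustration under the definitions above; `fit_measures` and the data are hypothetical names and numbers.

```python
import math


def fit_measures(x, y):
    """Return (R^2, SER, RMSE) for an OLS fit of y on a single regressor x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * xi for xi in x]

    ess = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained sum of squares
    tss = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
    ssr = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # sum of squared residuals

    r2 = ess / tss
    ser = math.sqrt(ssr / (n - 2))  # degrees-of-freedom correction: n - 2
    rmse = math.sqrt(ssr / n)       # same idea, but divides by n
    return r2, ser, rmse


# Hypothetical data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r2, ser, rmse = fit_measures(x, y)
print(f"R^2 = {r2:.4f}, SER = {ser:.4f}, RMSE = {rmse:.4f}")
```

SER is always a bit larger than RMSE because it divides by n-2; the gap vanishes as n grows.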
Assumptions on Sampling:
1) E(u|Xi = xi) = 0: the conditional dist of u given X has mean 0; implies beta1hat is unbiased. RESULT: E(beta1hat) = beta1, var(beta1hat) is proportional to 1/n
2) (Xi,Yi) are iid – true if the sample is a simple random sample, problematic when we have panel data
3) E(X^4) < infinity and E(Y^4) < infinity, i.e. large outliers are rare; OLS can be sensitive to outliers
beta1hat = Sxy/Sx^2 estimates beta1, the object of interest (the causal effect of X on Y)
The sampling dist of beta1hat is approximately normal when n is large, and the estimator -> the pop parameter in the limit, that is, it is consistent
The larger the var of X, the smaller the var of beta1hat
Regressor: Hypothesis testing and CI (CHAPTER 5)
Sampling dist of beta1hat when n is large
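A small Monte Carlo sketch of the sampling dist of beta1hat. It assumes a made-up data-generating process (Y = 1 + 2X + u, with X and u normal and independent), so the numbers are illustrative only; `ols_slope` and `simulate_slopes` are hypothetical helper names.

```python
import random
import statistics


def ols_slope(x, y):
    """OLS slope estimate: Sxy / Sx^2."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


def simulate_slopes(n, x_sd, reps=1000, b0=1.0, b1=2.0, seed=0):
    """Draw `reps` samples of size n from the assumed model and return the slope estimates."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        x = [rng.gauss(0.0, x_sd) for _ in range(n)]
        y = [b0 + b1 * xi + rng.gauss(0.0, 1.0) for xi in x]  # u ~ N(0, 1), E(u|X) = 0
        slopes.append(ols_slope(x, y))
    return slopes


# Keys are (n, sd of X); values are (mean, variance) of the simulated slope estimates.
results = {}
for n, x_sd in [(25, 1.0), (100, 1.0), (100, 2.0)]:
    slopes = simulate_slopes(n, x_sd)
    results[(n, x_sd)] = (statistics.mean(slopes), statistics.variance(slopes))
    print(n, x_sd, results[(n, x_sd)])
```

Under these assumptions the mean of the estimates should sit close to the true slope of 2 (unbiased), and the variance should fall both when n rises and when the spread of X rises.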
Objective: test hypotheses about beta1, e.g. H0: beta1 = 0, against a 1- or 2-sided alternative
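Method: compute the t-stat, t = (beta1hat – hypothesized beta1)/SE(beta1hat), compute the p-value, and reject or fail to reject H0. For a 2-sided test, reject at the 5% sig level if |t| > 1.96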
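A sketch of the 2-sided test and a 95% CI, using the large-n normal approximation. The estimate and standard error passed in below are made-up numbers, and the function names are hypothetical.

```python
import math


def two_sided_p_value(t):
    """P(|Z| > |t|) for a standard normal Z (large-n approximation)."""
    return math.erfc(abs(t) / math.sqrt(2.0))


def t_test(b1_hat, se_b1, b1_null=0.0, level=0.05):
    """Return (t-stat, p-value, reject?) for H0: beta1 = b1_null against a 2-sided alternative."""
    t = (b1_hat - b1_null) / se_b1
    p = two_sided_p_value(t)
    return t, p, p < level


def confidence_interval_95(b1_hat, se_b1):
    """95% CI: the set of null values not rejected by a 5% 2-sided test."""
    return b1_hat - 1.96 * se_b1, b1_hat + 1.96 * se_b1


# Made-up estimate and standard error, for illustration only.
t, p, reject = t_test(b1_hat=-2.28, se_b1=0.52)
lo, hi = confidence_interval_95(-2.28, 0.52)
print(f"t = {t:.2f}, p = {p:.5f}, reject H0: {reject}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

Checking whether 0 lies inside the 95% CI gives the same decision as comparing |t| with 1.96.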
Binary Variables interpretation:
IF var(u|X=x) is constant, that is, if the variance of the conditional dist of u given X doesn’t depend on X, then u is said to be homoskedastic, otherwise (ow) heteroskedastic
IF the 3 assumptions hold + u is homoskedastic, these imply beta 1
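The homo/hetero distinction matters for the standard error of beta1hat. A sketch comparing the homoskedasticity-only formula with the heteroskedasticity-robust one, using the standard textbook formulas with the n-2 correction; `slope_standard_errors` and the data are hypothetical.

```python
import math


def slope_standard_errors(x, y):
    """Return (b1_hat, homoskedasticity-only SE, heteroskedasticity-robust SE)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    s_xx = sum(d * d for d in dx)
    b1 = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    u_hat = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]  # OLS residuals

    # Homoskedasticity-only: one common error variance s_u^2 = SSR / (n - 2).
    s_u2 = sum(u * u for u in u_hat) / (n - 2)
    se_homo = math.sqrt(s_u2 / s_xx)

    # Robust: each squared residual is weighted by its own (x - x_bar)^2,
    # so var(u|X) is allowed to depend on X.
    num = sum(d * d * u * u for d, u in zip(dx, u_hat)) / (n - 2)
    den = (s_xx / n) ** 2
    se_robust = math.sqrt(num / den / n)
    return b1, se_homo, se_robust


# Hypothetical data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b1_hat, se_homo, se_robust = slope_standard_errors(x, y)
print(f"b1_hat = {b1_hat:.3f}, SE (homo-only) = {se_homo:.4f}, SE (robust) = {se_robust:.4f}")
```

The robust SE is valid whether or not u is homoskedastic; the homoskedasticity-only SE is valid only in the homoskedastic case.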