Part I
The Classical Linear Regression Model (CLRM) and the OLS Estimator
17/02/2010
Notation
$$
y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} \tag{1}
$$
column vector containing the $n$ sample observations on the dependent variable $y$.
$$
x_k = \begin{pmatrix} x_{1k} \\ x_{2k} \\ \vdots \\ x_{nk} \end{pmatrix} \tag{2}
$$
column vector containing the $n$ sample observations on the independent variable $x_k$, with $k = 1, 2, \ldots, K$.
$$
X = \begin{pmatrix} x_1 & x_2 & \cdots & x_K \end{pmatrix} \tag{3}
$$
$n \times K$ data matrix containing the $n$ sample observations on the $K$ independent variables. Usually the vector $x_1$ is assumed to be a column of 1s (constant).
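As an illustration of this notation, the data matrix can be assembled column by column. The following is a minimal sketch in Python with NumPy; the sample values and the variable names `x2` and `x3` are made up for the example and are not part of the original notes.

```python
import numpy as np

# Hypothetical sample: n = 5 observations on two non-constant regressors.
x2 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x3 = np.array([2.0, 1.0, 4.0, 3.0, 6.0])

n = x2.shape[0]
x1 = np.ones(n)  # constant regressor: a column of 1s

# Stack the K = 3 column vectors side by side to form the n x K matrix X.
X = np.column_stack([x1, x2, x3])
K = X.shape[1]

print(X.shape)
```

Each column of `X` corresponds to one vector $x_k$ and each row to one observation.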
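$$
\varepsilon = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix} \tag{4}
$$
column vector containing the $n$ disturbances.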
Assumptions of the CLRM
Assumption 1: linearity
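Observed data are generated by the following linear model:
$$
y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \ldots + \beta_K x_{iK} + \varepsilon_i = x_i' \beta + \varepsilon_i, \qquad i = 1, 2, \ldots, n, \tag{5}
$$
where $x_i' = (x_{i1}, x_{i2}, \ldots, x_{iK})$ is the $i$-th row of $X$.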
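The $K$ unknown parameters of the model can be collected in a column vector,
$$
\beta = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_K \end{pmatrix} \tag{6}
$$
and the model can be rewritten in compact form:
$$
y = x_1 \beta_1 + x_2 \beta_2 + \ldots + x_K \beta_K + \varepsilon = X \beta + \varepsilon \tag{7}
$$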
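The equivalence between the observation-by-observation form (5) and the compact form (7) can be checked numerically. The sketch below assumes Python with NumPy; the parameter values, sample size and random seed are arbitrary choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily

n, K = 6, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta = np.array([1.0, 2.0, -0.5])  # hypothetical parameter values
eps = rng.normal(size=n)           # hypothetical disturbances

# Observation-by-observation form, equation (5): y_i = x_i' beta + eps_i
y_rowwise = np.array([X[i, :] @ beta + eps[i] for i in range(n)])

# Column form, first part of equation (7): y = x_1 beta_1 + ... + x_K beta_K + eps
y_columns = sum(X[:, k] * beta[k] for k in range(K)) + eps

# Compact matrix form, second part of equation (7): y = X beta + eps
y_matrix = X @ beta + eps

print(np.allclose(y_rowwise, y_matrix), np.allclose(y_columns, y_matrix))
```

All three constructions produce the same vector `y`, which is why the matrix form is used from here on.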
Assumption 2: the strict exogeneity assumption
The expected value of each disturbance $\varepsilon_i$, conditional on all observations, is zero:
$$
E(\varepsilon_i \mid X) = 0, \qquad i = 1, 2, \ldots, n \tag{8}
$$
In compact form:
$$
E(\varepsilon \mid X) = 0 \tag{9}
$$
First implication of strict exogeneity
The unconditional mean is also zero. In fact, by the Law of Total Expectations:
$$
E(\varepsilon_i) = E_X\left[ E(\varepsilon_i \mid X) \right] = 0, \qquad i = 1, 2, \ldots, n \tag{10}
$$
Second implication of strict exogeneity
The regressors are orthogonal to the error term for all observations:
$$
E(\varepsilon_i x_{jk}) = 0, \qquad i, j = 1, 2, \ldots, n; \quad k = 1, 2, \ldots, K \tag{11}
$$
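The step from strict exogeneity to orthogonality is short enough to write out. Since $x_{jk}$ is an element of $X$, it can be treated as a constant once we condition on $X$, so the Law of Total Expectations gives:

$$
E(\varepsilon_i x_{jk}) = E\left[ E(\varepsilon_i x_{jk} \mid X) \right] = E\left[ x_{jk} \, E(\varepsilon_i \mid X) \right] = E\left[ x_{jk} \cdot 0 \right] = 0 .
$$

Note that the result holds for every pair $i, j$, not only for $i = j$, because the conditioning is on the whole matrix $X$.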
Third implication of strict exogeneity
The orthogonality conditions are equivalent to zero-correlation conditions:
$$
\mathrm{Cov}(x_{jk}, \varepsilon_i) = E(x_{jk} \varepsilon_i) - E(x_{jk}) E(\varepsilon_i) = 0 \tag{12}
$$
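The three implications can be illustrated with a small simulation. The sketch below assumes Python with NumPy and makes a stronger assumption than strict exogeneity: the disturbance is drawn independently of a single hypothetical regressor `x`, which is sufficient for $E(\varepsilon_i \mid X) = 0$. It only looks at same-observation pairs ($i = j$), and the distributions, sample size and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed for reproducibility

n = 200_000
x = rng.normal(loc=2.0, scale=1.5, size=n)  # hypothetical regressor

# Disturbance drawn independently of x (an assumption stronger than,
# and sufficient for, a zero conditional mean given x).
eps = rng.normal(size=n)

mean_eps = eps.mean()              # sample analogue of E(eps), eq. (10)
cross_moment = (x * eps).mean()    # sample analogue of E(eps * x), eq. (11)
covariance = np.cov(x, eps)[0, 1]  # sample analogue of Cov(x, eps), eq. (12)

print(mean_eps, cross_moment, covariance)
```

All three sample quantities should be close to zero, with the discrepancy shrinking as `n` grows.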
Assumption 3: no (perfect) multicollinearity
The rank of the $n \times K$ matrix $X$ is $K$ with probability 1.
This implies that X has full column rank; the