Exercises 2: Finite-Sample Properties of the Least Squares Estimator
When we say that the OLS estimator is "BLU", we are referring to the fact that:
Among all possible estimators of the linear regression coefficient vector that are unbiased, this estimator has the greatest "efficiency", provided that the model satisfies certain assumptions.
Among all possible linear estimators of the regression coefficient vector that are also unbiased, this estimator has the greatest "efficiency", provided that the model satisfies certain assumptions.
Among all possible linear estimators of the regression coefficient vector that are also unbiased, this estimator has the greatest "efficiency".
Among all possible linear estimators of the regression coefficient vector that are also unbiased, this estimator has the greatest "efficiency", provided that the model's error term is normally distributed.
When we say that the usual OLS estimator is a "linear estimator" we mean that:
It is being applied to a regression model that is itself linear.
It is linear in the parameters, but not necessarily linear in the regressors.
It is a linear function of the random sample data - the data for the dependent variable, y, in this case.
It is a linear function of the X matrix.
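(It may help to write the estimator out for the model y = X\beta + \varepsilon:

    \hat{\beta} = (X'X)^{-1}X'y = Ay, with A = (X'X)^{-1}X',

so, with X treated as given, \hat{\beta} is a matrix-weighted linear function of the data for y.)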
The connection between the expected value of an estimator and the mean of that estimator's sampling distribution is that:
The expected value of an estimator always exists, but the mean of its sampling distribution may not. When they both exist, they are the same.
There is really no connection in general - they relate to quite different concepts.
They will both be zero if the estimator is unbiased.
They are exactly the same thing.
The usual estimator that we use for the error variance in a linear regression model (namely, the sum of squared OLS residuals, divided by the degrees of freedom) is:
An unbiased estimator, whose sampling distribution is proportional to a Chi-Square distribution with (n-k) degrees of freedom.
An unbiased estimator.
An unbiased estimator, whose sampling distribution is a Chi-Square distribution, with (n-k) degrees of freedom.
An unbiased estimator, whose sampling distribution is Student-t with (n-k) degrees of freedom.
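(The relevant standard result, for s^2 = e'e/(n-k) with e the OLS residual vector and normally distributed errors, is

    (n-k)s^2 / \sigma^2 ~ \chi^2_{(n-k)}, and E(s^2) = \sigma^2,

so s^2 is unbiased, and its sampling distribution is proportional to, but not equal to, a chi-square distribution.)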
The OLS estimator of the regression coefficient vector in a linear regression model, and the usual estimator of the variance of the model's error term are:
Positively correlated, because the estimator of the error variance must yield positive values.
Both unbiased estimators.
Statistically independent if all of the assumptions (including normality of the error term) about the model hold.
Statistically independent, as long as the errors are homoskedastic and uncorrelated.
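(A useful step when thinking about this question: writing P = X(X'X)^{-1}X', the residual vector is e = (I - P)y, and

    Cov(\hat{\beta}, e) = (X'X)^{-1}X'(\sigma^2 I)(I - P)' = \sigma^2 (X'X)^{-1}(X' - X'P) = 0.

If the errors are normal, \hat{\beta} and e are jointly normal, so zero covariance implies that \hat{\beta} is independent of e, and hence of s^2 = e'e/(n-k).)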
One connection between the variance of an estimator and the mean squared error of that estimator is:
The mean squared error cannot be smaller than the variance.
They will be the same if the estimator is linear and unbiased.
They will be the same if the estimator is unbiased.
Both A and C.
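(The identity to keep in mind is

    MSE(\hat{\theta}) = E[(\hat{\theta} - \theta)^2] = Var(\hat{\theta}) + [Bias(\hat{\theta})]^2,

so the MSE can never fall below the variance, and the two are equal exactly when the bias is zero.)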
The diagonal elements of the covariance matrix of an estimator of a vector of parameters are:
The standard deviations of the estimators of the individual elements of the parameter vector.
The variances of the estimators of the individual elements of the parameter vector.
Either positive or negative, depending on whether this matrix is positive definite or negative definite.
Of the same sign as any bias in the estimator for the corresponding parameter element, and therefore zero if this estimator is unbiased.
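(For the OLS coefficient vector this covariance matrix takes the familiar form

    V(\hat{\beta}) = \sigma^2 (X'X)^{-1},

whose i-th diagonal element is Var(\hat{\beta}_i) and whose (i, j)-th off-diagonal element is Cov(\hat{\beta}_i, \hat{\beta}_j).)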
Suppose that we choose to use the arithmetic mean of the squared OLS residuals as an estimator of the variance of the model's error term. Then:
This estimator is a downward-biased estimator of the error's variance.
This estimator is an upward-biased estimator of the error's variance.
This estimator is biased, and the direction of its bias depends on the degrees of freedom, (n-k).
It is impossible to tell anything about the bias of this estimator unless we know the value of the error's variance.
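A small simulation makes the contrast between e'e/n and e'e/(n-k) concrete. The sketch below uses numpy; the sample size, number of regressors, error variance, and coefficient values are arbitrary choices made only for illustration and are not part of the exercise.

    import numpy as np

    rng = np.random.default_rng(0)
    n, k, sigma2 = 25, 3, 4.0                       # arbitrary illustrative values
    beta = np.array([1.0, 0.5, -0.5])               # arbitrary true coefficients
    X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
    reps = 20000
    s2_df = np.empty(reps)                          # e'e / (n - k)
    s2_mean = np.empty(reps)                        # e'e / n  (mean of squared residuals)

    for r in range(reps):
        y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
        b, *_ = np.linalg.lstsq(X, y, rcond=None)
        e = y - X @ b
        s2_df[r] = (e @ e) / (n - k)
        s2_mean[r] = (e @ e) / n

    print(s2_df.mean())     # close to sigma2 = 4.0
    print(s2_mean.mean())   # close to ((n - k) / n) * sigma2 = 3.52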
A statistic that follows a Student-t distribution is one which is constructed in the following way:
By taking the ratio of the square of a standard normal statistic to a chi-square statistic (that has been already divided by its degrees of freedom), where these 2 statistics are independent of each other.
By taking the ratio of a standard normal statistic to the square root of a chi-square statistic (that has been already divided by its degrees of freedom), where these 2 statistics are independent of each other.
By taking the ratio of the square root of a chi-square statistic (that has been already divided by its degrees of freedom) to a standard normal statistic, where these 2 statistics are independent of each other.
By taking the ratio of the square roots of two independent chi-square statistics.
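If it helps, the construction can be checked by simulation. The sketch below uses numpy and scipy; the degrees of freedom (v = 8) is an arbitrary illustrative choice.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    v = 8                                  # illustrative degrees of freedom
    reps = 200000
    z = rng.standard_normal(reps)          # standard normal draws
    w = rng.chisquare(v, size=reps)        # independent chi-square draws, v d.o.f.
    t_sim = z / np.sqrt(w / v)             # N(0,1) divided by sqrt(chi-square / d.o.f.)

    for q in (0.90, 0.95, 0.99):
        print(q, np.quantile(t_sim, q), stats.t.ppf(q, df=v))
    # the simulated and theoretical Student-t quantiles should agree closely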
The correct interpretation of a 95% confidence interval is:
If we were to take an infinite number of samples of the same size, and construct the estimator and the confidence interval in each case, then 95% of the time the true value of the parameter being estimated would lie in one of these intervals.
If we were to take an infinite number of samples of the same size, and construct the estimator and the confidence interval in each case, then 95% of all of these intervals would cover the true value of the parameter being estimated.
There is a 95% chance that the true value of the parameter I am estimating lies in the interval I have constructed from this sample of data.
None of the above.
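The "repeated sampling" idea behind a confidence interval can be illustrated with a quick coverage simulation for a population mean. The sketch below uses numpy and scipy; the mean, standard deviation, and sample size are arbitrary illustrative values.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    mu, sigma, n, reps = 5.0, 2.0, 30, 10000     # arbitrary illustrative values
    t_crit = stats.t.ppf(0.975, df=n - 1)
    covered = 0

    for r in range(reps):
        x = rng.normal(mu, sigma, size=n)
        half = t_crit * x.std(ddof=1) / np.sqrt(n)
        covered += (x.mean() - half <= mu <= x.mean() + half)

    print(covered / reps)    # close to 0.95: about 95% of such intervals cover mu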
If we have constructed a 95% (2-sided) confidence interval that covers the value 2.0 for some parameter of interest, then the following is true:
We cannot reject the hypothesis that the parameter is 2.0, at the 5% significance level, against the alternative hypothesis that it is not equal to 2.0.
This interval is consistent with coming to the conclusion that we cannot reject the hypothesis that the parameter is 2.0 (against a 2-sided alternative) if we were to obtain a p-value of 0.06.
Both A and B.
This interval is consistent with coming to the conclusion that we cannot reject the hypothesis that the parameter is 2.0 (against a 2-sided alternative) if we were to obtain a p-value of 0.04.
I have estimated a regression model by OLS and have calculated a 95% confidence interval for one of the regression coefficients as (0.85, 0.95). Using exactly the same data, I am now asked to calculate a 99% confidence interval, because my supervisor is hoping that this interval will cover the value 1.00.
This is worth doing, because the new interval must be wider than the original one, and it is possible that it will cover 1.00.
This is a waste of time, as the new interval must be narrower than the one I have calculated already.
This is a waste of time, as the new interval must be (0.89, 0.99), and so it won't cover 1.00.
This is a waste of time. Although the new interval must be wider than the one I have calculated already, its upper limit cannot be above 0.99.
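One way to think this question through is to back out what the 99% interval would look like, given the reported 95% one. The question does not state the degrees of freedom, so the value used in the scipy-based sketch below is purely hypothetical.

    from scipy import stats

    lo95, hi95 = 0.85, 0.95
    centre = (lo95 + hi95) / 2               # the point estimate, 0.90
    half95 = (hi95 - lo95) / 2               # 0.05 = t(0.975, df) * standard error

    df = 30                                  # hypothetical d.o.f.; not given in the question
    se = half95 / stats.t.ppf(0.975, df)     # standard error implied by the 95% interval
    half99 = stats.t.ppf(0.995, df) * se     # wider half-width needed for 99% coverage
    print(centre - half99, centre + half99)  # with df = 30, roughly (0.833, 0.967)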
Suppose I am conducting a test and I have in mind an implicit significance level of 5%. My class-mate always uses a significance level of 10% for such tests. The econometrics package I am using reports a p-value of 0.015 for my test.
I would reject the null hypothesis, but my class-mate would not reject it.
My class-mate would reject the null hypothesis, but I would not reject it.
Neither of us would reject the null hypothesis.
My class-mate and I would both reject the null hypothesis.
If a statistical test is "Unbiased", then:
It must also be a "Most Powerful" test.
It will correctly reject false hypotheses, on average.
Its power never falls below the assigned significance level.
Its power improves, on average, as the sample size increases.
When the null hypothesis is true, the Power of a test is:
The probability of rejecting this null hypothesis.
Equal to the significance level.
The probability of a Type II error.
Both A and B.
Check the EViews regression output located here. The following is true with respect to the regressor called P_SA:
We would reject the hypothesis that this coefficient is zero, against the alternative that it is positive, at the 5% significance level, but not at the 1% significance level.
We would reject the hypothesis that this coefficient is zero, against the alternative that it is positive, at the 1% significance level, but not at the 5% significance level.
We would reject the hypothesis that this coefficient is zero, against a 2-sided alternative hypothesis, at both the 5% and 1% significance levels.
We cannot reject the hypothesis that this coefficient is zero, against the alternative that it is positive, at either the 5% significance level or the 1% significance level.
Check the EViews regression output for a confidence ellipse located here. The following is true:
There is a 95% chance that the true values of C(4) and C(5) lie in this ellipse.
Of all of the ellipses of this sort that could be created by re-estimating the model again and again with different samples of the same size, 95% would cover the true values of both C(4) and C(5) at once.
The OLS estimators of C(4) and C(5) are negatively correlated.