| Linguistics 483 Notes |
A. C. Brett acbrett@uvic.ca
Department of Linguistics University of Victoria Clearihue C139 |
The t-statistic is often used to determine, on the basis of a random sample x1, x2, ... , xn of n independent observations of a variable X, whether the mean, μX, of the population of values on X differs from a standard or expected value, μe.
Typically, the mean, μX, and variance, σX², of the population are unknown. The mean, mX, and variance, sX², of the random sample of observations of X are therefore used to estimate them.
A value for the t-statistic is then obtained as follows:
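$$ t = \frac{m_X - \mu_e}{[s_X^2/n]^{1/2}} $$

where mX is the sample mean, μe is the standard or expected value, sX² is the sample variance, and n is the number of observations in the sample.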
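The calculation of t described above can be sketched in a few lines of Python. This is only an illustration: the function name t_statistic and the five observations are invented for the example and do not come from these notes. The sample variance uses the usual n - 1 divisor.

```python
import math
import statistics

def t_statistic(sample, mu_e):
    """Return t = (m_X - mu_e) / [s_X^2 / n]^(1/2) for one sample."""
    n = len(sample)
    m_x = statistics.mean(sample)        # sample mean, m_X
    s2_x = statistics.variance(sample)   # sample variance, s_X^2 (n - 1 divisor)
    return (m_x - mu_e) / math.sqrt(s2_x / n)

# Hypothetical observations, invented for illustration only.
observations = [5.1, 4.9, 5.3, 5.2, 5.0]
t = t_statistic(observations, mu_e=5.0)
print(round(t, 3))   # about 1.414
```

Here the sample mean is 5.1 and the standard error of the mean is about 0.0707, so the difference of 0.1 from the expected value is roughly 1.4 standard errors.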
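Table 1 shows the three possible null hypotheses, H0, regarding the difference between the population mean, μX, and a standard or expected value, μe. Also shown are the criteria on the basis of which the decision is made to reject H0 and claim the corresponding alternative, HA.

Table 1.
Test                 H0         HA         Reject H0 and claim HA if
Two-tailed           μX = μe    μX ≠ μe    | t | ≥ tα/2; ν
One-tailed (upper)   μX ≤ μe    μX > μe    t ≥ tα; ν
One-tailed (lower)   μX ≥ μe    μX < μe    t ≤ -tα; ν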
If the difference between the sample estimate, mX, of the population mean, μX, and the value μe is sufficiently different from zero, relative to the standard error of the mean, [sX²/n]^(1/2), then one can reject H0 and claim the alternative, HA. In a two-tailed test, however, mX can be less than μe, which yields a negative t-value, or mX can be greater than μe, which yields a positive t.
It is this fact that leads to the test being called a two-tailed test. The two tails are those regions of the t-distribution that extend to the left and right, and which decrease toward the horizontal axis. The tail that extends to the right toward the more positive values of t is called the upper tail; the tail to the left toward the more negative values is called the lower tail. The t-value obtained in the test can be either in the upper or in the lower tail of the distribution. Thus, if the calculated t is sufficiently positive (it is in the upper tail), or sufficiently negative (it is in the lower tail), one can reject H0, and claim HA.
In practice, when applying a two-tailed test, one examines only the magnitude or absolute value of t, denoted as | t |. The decision to reject H0, and claim HA, is then based upon the magnitude obtained for t relative to a selected critical value of t.
A critical t-value, denoted as tα/2; ν, depends on the chosen significance level, α, and the number of degrees of freedom, ν. (The division of α by 2 is explained below in the section on critical values.) If | t | is greater than or equal to tα/2; ν, one can claim HA.
There are two one-tailed tests. In one of them, the alternative hypothesis is that the population mean, μX, is greater than the value μe. In the other, the alternative is that μX is less than μe. Thus, unlike the two-tailed test, wherein one is concerned only with whether μX and μe are different, in a one-tailed test one is interested in the direction of the difference: whether μX is greater than μe, or whether it is less than μe.
If the value calculated for t using a sample mean, mX, and variance, sX², is greater than or equal to the upper (positive tail) critical t-value, tα; ν, then one can reject the H0 that μX ≤ μe and claim HA that μX > μe with the chance of error (in rejecting a true H0) being less than the selected significance level, α.
If the value calculated for t is less than or equal to the lower (negative tail) critical t-value, -tα; ν, then one can reject the H0 that μX ≥ μe and claim HA that μX < μe with the chance of error (in rejecting a true H0) being less than α.
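The three rejection rules described above can be collected into one small function. This is a minimal sketch: the function name reject_h0 and the labels "two-sided", "greater" and "less" are invented for the example, and the critical values are typed in by hand from a standard t table rather than computed.

```python
def reject_h0(t, critical, alternative):
    """Apply the rejection rule for one of the three tests.

    critical is the positive upper critical value: t(alpha/2; nu) for
    "two-sided", and t(alpha; nu) for "greater" or "less".
    """
    if alternative == "two-sided":   # HA: mu_X differs from mu_e
        return abs(t) >= critical
    if alternative == "greater":     # HA: mu_X > mu_e
        return t >= critical
    if alternative == "less":        # HA: mu_X < mu_e
        return t <= -critical
    raise ValueError("unknown alternative: " + alternative)

# Tabled critical values for nu = 4 degrees of freedom:
# t(0.025; 4) = 2.776 and t(0.05; 4) = 2.132.
print(reject_h0(2.5, 2.776, "two-sided"))   # False
print(reject_h0(2.5, 2.132, "greater"))     # True
print(reject_h0(-2.5, 2.132, "less"))       # True
```

The same t-value of 2.5 fails the two-tailed criterion but meets the upper one-tailed criterion, because the one-tailed test places all of α in a single tail.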
An upper critical value is used in a one-tailed test of the null hypothesis that the mean, μX, of the population of the values of a variable X is less than, or equal to a value μe against the alternative that μX is greater than μe.
Since the t-distribution is symmetrical about zero, the upper critical value, tα; ν, can also be used to test the hypothesis that μX is greater than, or equal to μe against the alternative that μX is less than μe. In this one-tailed test, the t-value obtained is compared with -tα; ν.
An upper critical value is also used for two-tailed tests of the equality of μX and μe against the alternative that they differ. In such tests, one is not concerned with whether μX is greater or less than μe; one cares only that they be sufficiently different. One claims the alternative if the t-value is either sufficiently negative or sufficiently positive.
In a two-tailed test, one compares the magnitude or absolute value of the t-value with the upper critical value tα/2; ν. The probability of obtaining a t-value greater than or equal to tα/2; ν is α/2, and the probability of obtaining a t-value less than or equal to -tα/2; ν is also α/2. Hence, the probability of either one or the other of these events is α/2 + α/2 = α, the chosen significance level.
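The statement that each tail beyond the two-tailed critical value holds probability α/2 can be checked numerically. The sketch below uses the standard density of the t-distribution and integrates it with Simpson's rule; the function names t_pdf and upper_tail are invented for the example, and the critical value 2.776 for 4 degrees of freedom is taken from a t table.

```python
import math

def t_pdf(t, nu):
    """Density of the t-distribution with nu degrees of freedom."""
    coef = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return coef * (1 + t * t / nu) ** (-(nu + 1) / 2)

def upper_tail(t_crit, nu, steps=2000):
    """P(T >= t_crit) for t_crit >= 0, using Simpson's rule on [0, t_crit].

    steps must be even.
    """
    h = t_crit / steps
    total = t_pdf(0, nu) + t_pdf(t_crit, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    # By symmetry, half of the probability lies above zero.
    return 0.5 - total * h / 3

p_upper = upper_tail(2.776, 4)
print(round(p_upper, 3))       # upper-tail probability, close to alpha/2 = 0.025
print(round(2 * p_upper, 3))   # both tails together, close to alpha = 0.05
```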
Tables of the critical values of the t-distribution are available from a number of sources. One such source is the web site of the Information Technology Laboratory of the US National Institute of Standards and Technology. A table of the upper critical t-value can be found at http://www.itl.nist.gov/div898/handbook/eda/section3/eda3672.htm.
Central Limit Theorem: The distribution of the means of random samples of size n from a population with mean μ and variance σ² approaches a normal distribution with mean μ and variance σ²/n as n increases, regardless of the population distribution, provided only that μ and σ² are finite, and σ² is not zero.
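A small simulation can make the theorem concrete. The sketch below is an illustration under assumed settings that are not from these notes: the population is uniform on (0, 1), which has mean 1/2 and variance 1/12, each sample has n = 30 observations, and 5000 samples are drawn with an arbitrary random seed.

```python
import random
import statistics

random.seed(483)   # arbitrary seed so that the run is repeatable

n = 30             # observations per sample
reps = 5000        # number of samples drawn

# Uniform(0, 1) population: mean 1/2, variance 1/12, clearly not normal.
means = [statistics.fmean([random.random() for _ in range(n)])
         for _ in range(reps)]

mean_of_means = statistics.fmean(means)
var_of_means = statistics.variance(means)

print(mean_of_means, 0.5)            # observed versus mu
print(var_of_means, (1 / 12) / n)    # observed versus sigma^2 / n
```

The mean of the sample means should lie close to 0.5, and their variance close to (1/12)/30, or about 0.0028, even though the individual observations are uniformly rather than normally distributed.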
For small samples from a normally distributed population, the t-distribution is symmetric, like a normal distribution, but it is flattened relative to the normal: the peak height is lower than the highest point of a normal distribution, while the tails are higher and more spread out than those of the normal. (Because of these heavier tails, the t-distribution is, strictly speaking, leptokurtic rather than platykurtic.)
The shape of the t-distribution for smaller sample sizes indicates that larger differences between a sample mean mX and an expected value μe are more likely than when the sample size is large and the t-distribution approaches the normal. This result is consonant with our intuition that small samples will yield a less reliable estimate of a population mean than will larger samples.
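The lower peak and higher tails can be seen by evaluating the two densities directly. The sketch below uses the standard formulas for the normal and t densities; the function names and the choice of evaluation points (0 for the peak, 3 for the tail) are made for the example only.

```python
import math

def normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def t_pdf(t, nu):
    """Density of the t-distribution with nu degrees of freedom."""
    coef = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return coef * (1 + t * t / nu) ** (-(nu + 1) / 2)

# Peak height (at 0) and tail height (at 3) for increasing degrees of freedom.
for nu in (4, 30, 120):
    print(nu, round(t_pdf(0, nu), 4), round(t_pdf(3, nu), 4))
print("normal", round(normal_pdf(0), 4), round(normal_pdf(3), 4))
```

With 4 degrees of freedom the peak is 0.375 against about 0.399 for the normal, while the density at 3 is several times that of the normal. As the degrees of freedom increase, both values move toward the normal ones.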
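$$ t\text{-dist.} = \frac{N\text{-dist.}}{\chi\text{-dist.}} $$

where χ-dist. is the square root of the χ²-distribution. This relation may be read as stating that the ratio of a normally distributed quantity to an independent, suitably scaled χ-distributed quantity follows a t-distribution.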
The requirements that mX - μe be normally distributed and that [sX²/n]^(1/2) be χ distributed can be satisfied for small sample sizes, n, if the following condition is met: the sample is drawn from a normally distributed population.
Distribution of mX - μe. For small samples from a normally distributed population, the mean, mX, of the sample is normally distributed because the sum of normally distributed values is normally distributed, and a normally distributed value multiplied by a constant, such as 1/n, is normally distributed. The subtraction of a constant, such as μe, from a normally distributed value, such as the sample mean, yields a normally distributed value. Hence, the difference mX - μe is normally distributed.
Distribution of [sX²/n]^(1/2). The difference between normally distributed values is normally distributed so that, for each observation in a sample from a normally distributed population, the difference xi - mX is normally distributed. Multiplication of a normally distributed value by a constant, such as 1/(n - 1)^(1/2), yields a normally distributed value. Hence, (xi - mX)/(n - 1)^(1/2) is normally distributed for each value xi in the sample. As was observed in the note on Contingency Tables and the χ² Distribution, the sum of squared normally distributed values has a χ² distribution. Thus, the sample variance, sX², has, apart from a constant scale factor, a χ² distribution with, in this case, n - 1 degrees of freedom. Taking the square root of a χ² distribution yields a χ.
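The χ² behaviour of the sample variance can be checked by simulation. In its precise form, the result is that (n - 1)sX²/σX² has a χ² distribution with n - 1 degrees of freedom, whose mean is n - 1 and whose variance is 2(n - 1). The sketch below assumes an invented normal population (mean 10, standard deviation 2), samples of n = 5, and an arbitrary seed.

```python
import random
import statistics

random.seed(483)   # arbitrary seed so that the run is repeatable

n = 5              # observations per sample
sigma = 2.0        # population standard deviation (hypothetical)
reps = 20000       # number of samples drawn

scaled = []
for _ in range(reps):
    sample = [random.gauss(10.0, sigma) for _ in range(n)]
    # Scale the sample variance so that it should follow chi-square(n - 1).
    scaled.append((n - 1) * statistics.variance(sample) / sigma ** 2)

scaled_mean = statistics.fmean(scaled)
scaled_var = statistics.variance(scaled)
print(scaled_mean, n - 1)         # observed mean versus n - 1
print(scaled_var, 2 * (n - 1))    # observed variance versus 2(n - 1)
```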
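$$ t = \frac{\text{signal}}{\text{noise}} $$

where the signal is the difference mX - μe and the noise is the standard error of the mean, [sX²/n]^(1/2).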
The signal must be sufficiently large relative to the noise in order that it be detected.
For largish samples, with n about 60, the signal must be about twice the magnitude of the noise (because the critical t-value for a two-tailed test with α = 0.05 is 2.000).
Thus, if the signal = 1.0, the noise cannot be larger than 0.5. If the noise were larger than 0.5, but the signal magnitude or strength remained unchanged, then the signal could not be detected above the noise.
For example, if the noise were 10% greater, that is noise = 0.55, but the signal strength were still 1.0, the signal to noise ratio would become 1.82, and the signal would no longer be detectable above the noise (because 1.82 is less than the critical value of 2.000).
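The arithmetic of the signal to noise example can be reproduced as follows. The function name detectable is invented for the example, and the default critical value of 2.000 is the two-tailed value for α = 0.05 quoted above.

```python
def detectable(signal, noise, critical=2.000):
    """True when the magnitude of signal / noise reaches the critical value."""
    return abs(signal / noise) >= critical

for noise in (0.50, 0.55):
    ratio = 1.0 / noise
    print(noise, round(ratio, 2), detectable(1.0, noise))
# 0.5 2.0 True
# 0.55 1.82 False
```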
A brief biography of Gosset is available at http://william-sealey-gosset.brainsip.com/ which includes the link http://www.york.ac.uk/depts/maths/histstat/student.pdf to Gosset's original paper: "Student" (1908). The probable error of a mean, Biometrika, Vol. 6, No. 1. (Mar.), pp. 1-25.
It is the flattened, heavy-tailed shape of the t-distribution that was of interest to Gosset and which motivated him to devise the t-statistic. He was concerned that use of the normal distribution with small samples would lead to a conclusion that a sample mean mX differed from a standard value μe when, in fact, the observed difference was to be expected for a small sample size n.
One might imagine that Gosset would be testing the alcohol concentration of a batch of beer. Typically, he might make only five measurements of the alcohol level, so that his sample size was n = 5. Suppose he calculated a statistic such as t; that is, he took the difference between the mean of his sample and the standard alcohol level, and he divided this by the standard error of the mean to obtain a value of 2.5. But, instead of using the critical values of t (because they did not yet exist) to test whether his sample mean was significantly different from the standard, he used the critical values of a normal distribution. He would have applied a two-tailed test because an alcohol level either significantly greater than, or significantly less than, the standard value would be undesirable. Therefore, if he employed a significance level, α = 0.05, he would have compared his statistic value of 2.5 with the 0.025 (= α/2 = 0.05/2) upper critical value of the normal distribution. This critical value is 1.960, which is less than 2.5, so that Gosset would have rejected the hypothesis that his sample mean was equal to the standard. Consequently, he would have had to take remedial measures with the batch of beer, and in the worst case, he would have discarded it.
Remedial measures to correct the alcohol level of the batch would have been costly, and perhaps unnecessary. After devising the t-statistic, Gosset would have looked up the 0.025 (= α/2 = 0.05/2) upper critical value of the t-distribution for ν = n - 1 = 4 degrees of freedom and obtained 2.776. Comparing this with his statistic value of 2.5 would have led him to the conclusion that he could not reject the null hypothesis that his sample mean was equal to the standard alcohol level. Note that his failure to reject the null hypothesis does not mean that he could actually have claimed that his sample mean was equal to the standard. This result meant only that the evidence was insufficient to reject the null hypothesis. Consequently, there was no need to take costly remedial measures to correct the alcohol level of the batch of beer.
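The cost of using the normal critical value with a sample of five can be estimated by simulation. The sketch below is an illustration only: the standard alcohol level of 4.0 and the standard deviation of 0.3 are invented, as are the seed and the number of simulated batches. Every simulated batch truly has the standard level, so any rejection is a false alarm.

```python
import math
import random
import statistics

random.seed(1908)   # arbitrary seed so that the run is repeatable

def t_statistic(sample, mu_e):
    """Return t = (m_X - mu_e) / [s_X^2 / n]^(1/2) for one sample."""
    n = len(sample)
    return (statistics.mean(sample) - mu_e) / math.sqrt(statistics.variance(sample) / n)

n, reps = 5, 20000
z_crit = 1.960   # two-tailed normal critical value for alpha = 0.05
t_crit = 2.776   # two-tailed t critical value for alpha = 0.05, nu = 4

false_z = 0
false_t = 0
for _ in range(reps):
    # H0 is true: the batch mean equals the (hypothetical) standard of 4.0.
    sample = [random.gauss(4.0, 0.3) for _ in range(n)]
    t = abs(t_statistic(sample, 4.0))
    if t >= z_crit:
        false_z += 1
    if t >= t_crit:
        false_t += 1

rate_z = false_z / reps
rate_t = false_t / reps
print(rate_z, rate_t)
```

With the normal critical value, roughly 12% of sound batches are rejected, well above the intended 5%. With the t critical value, the rejection rate stays close to 5%.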