Linguistics 483 - Quantitative Methods: Notes
A. C. Brett (acbrett@uvic.ca)
Department of Linguistics
University of Victoria
Clearihue C139
Last updated: 26 February 2006

The t-Statistic

The t-statistic is often used to determine, on the basis of a random sample x1, x2, ... , xn of n independent observations of a variable X, whether the mean, μX, of the population of values of X differs from a standard or expected value, μe.

Typically, the mean, μX, and variance, σX², of the population are unknown. The mean, mX, and variance, sX², of the random sample of observations of X are therefore used to estimate them.

A value for the t-statistic is then obtained as follows:

          mX - μe
t = —————
        [sX²/n]^1/2

where mX is the sample mean, μe is the standard or expected value, sX² is the sample variance, and n is the sample size; the denominator, [sX²/n]^1/2, is the estimated standard error of the mean.
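As a sketch, the computation can be carried out with the Python standard library; the data and the expected value μe below are invented purely for illustration:

```python
# Sketch: computing the one-sample t-statistic (example data is hypothetical).
from math import sqrt
from statistics import mean, variance  # variance() uses the n - 1 denominator

sample = [5.1, 4.9, 5.3, 5.0, 5.2]   # hypothetical observations of X
mu_e = 5.0                           # standard or expected value

n = len(sample)
m_X = mean(sample)                   # sample mean
s2_X = variance(sample)              # sample variance (n - 1 denominator)
t = (m_X - mu_e) / sqrt(s2_X / n)    # nu = n - 1 = 4 degrees of freedom

print(round(t, 3))                   # 1.414
```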

Table 1. Null hypotheses, alternative hypotheses, and decision criteria for 1- and 2-tailed t-tests of the difference between the mean, μX, of the population of values of a variable X and a standard or expected value, μe, on the basis of a random sample of n independent observations of X.


Test      Null, H0                  Alternative, HA           Decision Criterion

2-Tailed  μX - μe = 0   (μX = μe)   μX - μe ≠ 0   (μX ≠ μe)   | t | ≥ tα/2; ν
1-Tailed  μX - μe ≤ 0   (μX ≤ μe)   μX - μe > 0   (μX > μe)   t ≥ tα; ν
1-Tailed  μX - μe ≥ 0   (μX ≥ μe)   μX - μe < 0   (μX < μe)   t ≤ -tα; ν

Table 1 shows the three possible null hypotheses, H0, regarding the difference between the population mean, μX, and a standard or expected value, μe. Also shown are the criteria on the basis of which the decision is made to reject H0 and claim the corresponding alternative, HA.
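The decision criteria in Table 1 can be sketched as a small function; the function name and argument conventions below are illustrative, not from the notes, and t_crit stands for the tabled critical value (tα/2; ν for the two-tailed test, tα; ν for the one-tailed tests):

```python
# Sketch of the decision rules in Table 1 (names here are illustrative).

def reject_H0(t, t_crit, test="two-tailed"):
    """Return True if H0 is rejected under the given test."""
    if test == "two-tailed":   # H0: mu_X = mu_e   vs  HA: mu_X != mu_e
        return abs(t) >= t_crit
    if test == "upper":        # H0: mu_X <= mu_e  vs  HA: mu_X > mu_e
        return t >= t_crit
    if test == "lower":        # H0: mu_X >= mu_e  vs  HA: mu_X < mu_e
        return t <= -t_crit
    raise ValueError("unknown test")

# Example with nu = 4 and alpha = 0.05: the tabled critical values are
# 2.776 (two-tailed) and 2.132 (one-tailed).
print(reject_H0(2.5, 2.776))            # False: do not reject
print(reject_H0(2.5, 2.132, "upper"))   # True: reject
```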

Two-Tailed Test

A two-tailed t-test, or "two-sided" test as it is sometimes called, is associated with the null hypothesis, H0, that the difference between the population mean, μX, and a standard or expected value, μe, is zero (or equivalently, that μX equals μe). The alternative hypothesis, HA, of a two-tailed test is that the difference between μX and μe is not zero (or equivalently, that μX does not equal μe).

If the difference between the sample mean, mX, which estimates the population mean μX, and the value μe is sufficiently different from zero, relative to the standard error of the mean, [sX²/n]^1/2, then one can reject H0 and claim the alternative, HA. In a two-tailed test, however, mX can be less than μe, which yields a negative t-value, or mX can be greater than μe, which yields a positive t.

It is this fact that leads to the test being called a two-tailed test. The two tails are those regions of the t-distribution that extend to the left and right, and which decrease toward the horizontal axis. The tail that extends to the right toward the more positive values of t is called the upper tail; the tail to the left toward the more negative values is called the lower tail. The t-value obtained in the test can be either in the upper or in the lower tail of the distribution. Thus, if the calculated t is sufficiently positive (it is in the upper tail), or sufficiently negative (it is in the lower tail), one can reject H0, and claim HA.

In practice, when applying a two-tailed test, one examines only the magnitude or absolute value of t, denoted as | t |. The decision to reject H0, and claim HA, is then based upon the magnitude obtained for t relative to a selected critical value of t.

A critical t-value, denoted as tα/2; ν, depends on the chosen significance level, α, and the number of degrees of freedom, ν. (The division of α by 2 is explained below in the section on critical values.) If | t | is greater than or equal to tα/2; ν, one can reject H0 and claim HA.

One-Tailed Test

The one-tailed t-tests are associated with the null hypotheses that the difference between μX and μe is less than or equal to zero, or that it is greater than or equal to zero. Equivalently, one of the two null hypotheses is that the population mean, μX, is less than or equal to the value μe; the other null hypothesis is that μX is greater than or equal to μe.

The corresponding alternative hypothesis in one of the tests is that the population mean, μX, is greater than the value μe. In the other test, the alternative is that μX is less than μe. Thus, unlike the two-tailed test, wherein one is concerned only with whether μX and μe are different, in a one-tailed test one is interested in whether μX is specifically greater than, or specifically less than, μe.

If the value calculated for t using the sample mean, mX, and variance, sX², is greater than or equal to the upper (positive tail) critical t-value, tα; ν, then one can reject the H0 that μX ≤ μe and claim the HA that μX > μe, with the chance of error (in rejecting a true H0) being less than the selected significance level, α.

If the value calculated for t is less than or equal to the lower (negative tail) critical t-value, -tα; ν, then one can reject the H0 that μX ≥ μe and claim the HA that μX < μe, with the chance of error (in rejecting a true H0) being less than α.

Critical Values

An upper critical t-value, tα; ν, is a value of the t-statistic such that, for a given number of degrees of freedom, ν, the probability of obtaining by chance alone a value of t greater than or equal to tα; ν is equal to the significance level α.

An upper critical value is used in a one-tailed test of the null hypothesis that the mean, μX, of the population of the values of a variable X is less than or equal to a value μe, against the alternative that μX is greater than μe.

Since the t-distribution is symmetrical about zero, the upper critical value, tα; ν, can also be used to test the hypothesis that μX is greater than or equal to μe against the alternative that μX is less than μe. In this one-tailed test, the t-value obtained is compared with -tα; ν.

An upper critical value is also used for two-tailed tests of the equality of μX and μe against the alternative that they differ. In such tests, one is not concerned with whether μX is greater or less than μe; one cares only that they be sufficiently different. One claims the alternative if the t-value is either sufficiently negative or sufficiently positive.

In a two-tailed test, one compares the magnitude or absolute value of the t-value with the upper critical value tα/2; ν. The probability of obtaining a t-value greater than or equal to tα/2; ν is α/2, and the probability of obtaining a t-value less than or equal to -tα/2; ν is also α/2. Hence, the probability of either one or the other of these events is α/2 + α/2 = α, the chosen significance level.
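This accounting of the two tails can be checked numerically. The sketch below integrates the t-density (its formula is standard, not from these notes) above the tabled critical value t0.025; 4 = 2.776, and doubles the result to cover both tails:

```python
# Numerical check (stdlib only) that the two tails beyond +/- t_{alpha/2; nu}
# together carry probability alpha, for nu = 4 and t_{0.025; 4} = 2.776.
from math import gamma, sqrt, pi

def t_pdf(x, nu):
    # Density of the t-distribution with nu degrees of freedom.
    c = gamma((nu + 1) / 2) / (sqrt(nu * pi) * gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)

def upper_tail(t_crit, nu, hi=100.0, steps=50000):
    # Trapezoidal integration of the density from t_crit to hi.
    h = (hi - t_crit) / steps
    area = 0.5 * (t_pdf(t_crit, nu) + t_pdf(hi, nu))
    for i in range(1, steps):
        area += t_pdf(t_crit + i * h, nu)
    return area * h

p_upper = upper_tail(2.776, 4)
print(round(2 * p_upper, 3))  # both tails together: about 0.05
```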

Tables of the critical values of the t-distribution are available from a number of sources. One such source is the web site of the Information Technology Laboratory of the US National Institute of Standards and Technology. A table of the upper critical t-value can be found at http://www.itl.nist.gov/div898/handbook/eda/section3/eda3672.htm.

Degrees of Freedom

The distribution of the values of t depends upon the number of degrees of freedom, ν, in the sample variance, sX². For a sample of n independent observations of X, the number of degrees of freedom would be n, the number of terms in the summation used to compute sX², except for the fact that the mean mX has been estimated from the same sample. The computation of mX reduces the number of degrees of freedom by 1, so that the actual number of degrees of freedom is ν = n - 1.
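The loss of one degree of freedom can be seen directly: the deviations from the sample mean always sum to zero, so once n - 1 of them are known, the last is determined. A small sketch (with invented data):

```python
# Sketch: why one degree of freedom is lost when mX is estimated from the
# sample. Deviations from the sample mean sum to zero, so only n - 1 of
# them are free to vary.
from statistics import mean

sample = [5.1, 4.9, 5.3, 5.0, 5.2]   # hypothetical observations
m_X = mean(sample)
deviations = [x - m_X for x in sample]

print(round(abs(sum(deviations)), 10))  # 0.0: the last deviation is fixed
nu = len(sample) - 1                    # degrees of freedom in sX^2
print(nu)                               # 4
```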

Central Limit Theorem

The distribution of the values of t depends upon the sample size n (or, more precisely, on the number of degrees of freedom, ν = n - 1). As n increases, the t-distribution converges to a normal distribution. This convergence to normal is a corollary of the Central Limit Theorem.

Central Limit Theorem: The distribution of the means of random samples of size n from a population with mean μ and variance σ² approaches a normal distribution with mean μ and variance σ²/n as n increases, regardless of the population distribution, provided only that μ and σ² are finite, and σ² is not zero.

For small samples from a normally distributed population, the t-distribution is symmetric, like a normal distribution, but it is flattened and heavier-tailed relative to the normal: the peak height is lower than the highest point of a normal distribution, while the tails are higher and more spread out than those of the normal.

The shape of the t-distribution for smaller sample sizes indicates that larger differences between a sample mean mX and an expected value μe are more likely than when the sample size is large and the t-distribution approaches the normal. This result is consonant with our intuition that small samples will yield a less reliable estimate of a population mean than will larger samples.

The Relation of the t-Distribution to the Normal and χ²-Distributions

The t-distribution is related to the normal and to the χ²-distributions. The relation among these distributions can be represented symbolically as follows:

                    N-dist.
t-dist.  =  ————
                    χ-dist.

where χ-dist. is the square root of the χ²-distribution (strictly, of a χ²-distributed quantity divided by its degrees of freedom). This relation may be read as stating that a t-distributed quantity is the ratio of a normally distributed quantity, here the numerator mX - μe, to a quantity with a χ distribution, here the denominator [sX²/n]^1/2.

The requirements that mX - μe be normally distributed and that [sX²/n]^1/2 be χ distributed can be satisfied for small sample sizes, n, if the following condition is met: the population of values of X from which the sample is drawn is normally distributed.

It follows from this condition that the values comprising a random sample x1, x2, ... , xn of n observations of X are normally distributed.

Distribution of mX - μe. For small samples from a normally distributed population, the mean, mX, of the sample is normally distributed, because the sum of normally distributed values is normally distributed, and a normally distributed value multiplied by a constant, such as 1/n, remains normally distributed. The subtraction of a constant, such as μe, from a normally distributed value, such as the sample mean, also yields a normally distributed value. Hence, the difference mX - μe is normally distributed.

Distribution of [sX²/n]^1/2. The difference between normally distributed values is normally distributed, so that, for each observation in a sample from a normally distributed population, the difference xi - mX is normally distributed. Multiplication of a normally distributed value by a constant, such as 1/(n - 1)^1/2, yields a normally distributed value. Hence, (xi - mX)/(n - 1)^1/2 is normally distributed for each value xi in the sample. As was observed in the note on Contingency Tables and the χ² Distribution, the sum of squared normally distributed values has a χ² distribution. Thus, the sample variance, sX², scaled by the population variance (that is, (n - 1)sX²/σX²), has a χ² distribution with, in this case, n - 1 degrees of freedom. Taking the square root of a χ²-distributed quantity yields a quantity with a χ distribution.
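This claim about the sample variance can be checked by simulation. A χ² distribution with ν degrees of freedom has mean ν, so for normal samples the long-run mean of (n - 1)sX²/σX² should be n - 1. A sketch (all numbers are illustrative):

```python
# Simulation sketch: for samples from a normal population, the scaled
# sample variance (n - 1) * s^2 / sigma^2 behaves like chi-squared with
# n - 1 degrees of freedom, whose mean is n - 1.
import random
from statistics import mean, variance

random.seed(1)
n, sigma2 = 5, 4.0                    # sample size; population variance
nu = n - 1

scaled = []
for _ in range(20000):
    sample = [random.gauss(0.0, 2.0) for _ in range(n)]  # sigma = 2
    scaled.append(nu * variance(sample) / sigma2)

print(round(mean(scaled), 1))         # near nu = 4
```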

The t-Statistic as a Signal to Noise Ratio

The t-statistic can be regarded as a signal to noise ratio:

        signal
t = ———
        noise

where the signal is the difference, mX - μe, between the sample mean and the standard or expected value, and the noise is the standard error of the mean, [sX²/n]^1/2.

The signal must be sufficiently large relative to the noise in order that it be detected. For largish samples, with n about 60, the signal must be about twice the magnitude of the noise (because the critical t-value for a two-tailed test with α = 0.05 is 2.000). Thus, if the signal = 1.0, the noise cannot be larger than 0.5. If the noise were larger than 0.5, but the signal magnitude or strength remained unchanged, then the signal could not be detected above the noise. For example, if the noise were 10% greater, that is, noise = 0.55, but the signal strength were still 1.0, the signal to noise ratio would become 1.82, and the signal would no longer be detectable above the noise (because 1.82 is less than the critical value of 2.000).
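The arithmetic of this example can be sketched directly:

```python
# Sketch of the worked example above: a signal of 1.0 against noise of
# 0.5 and then 0.55, compared with the two-tailed critical value 2.000
# (alpha = 0.05, roughly 60 degrees of freedom).
t_crit = 2.000

for noise in (0.5, 0.55):
    t = 1.0 / noise
    print(round(t, 2), abs(t) >= t_crit)  # 2.0 True, then 1.82 False
```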

Historical Note

The concept of a statistic to assess "the probable error of a mean" for small samples is attributed to William Sealy Gosset, a chemist at the Arthur Guinness & Son brewery in Dublin. The brewery was concerned about the disclosure of trade secrets, so Gosset was prohibited from publishing the results of his work on a t-statistic. He did nonetheless publish his results; but, to disguise his identity, he used the nom de plume "Student." The t-statistic is therefore known as "Student's t." Although the t-statistic is attributed to "Student," the form of the statistic we now use is due to Ronald Fisher.

A brief biography of Gosset is available at http://william-sealey-gosset.brainsip.com/ which includes the link http://www.york.ac.uk/depts/maths/histstat/student.pdf to Gosset's original paper: "Student" (1908). The probable error of a mean, Biometrika, Vol. 6, No. 1. (Mar.), pp. 1-25.

It is the flattened, heavy-tailed shape of the t-distribution that was of interest to Gosset and which motivated him to devise the t-statistic. He was concerned that use of the normal distribution with small samples would lead to a conclusion that a sample mean mX differed from a standard value μe when, in fact, the observed difference was to be expected for a small sample size n.

One might imagine that Gosset would be testing the alcohol concentration of a batch of beer. Typically, he might make only five measurements of the alcohol level, so that his sample size, n = 5. Suppose he calculated a statistic such as t; that is, he took the difference between the mean of his sample and the standard alcohol level, and he divided this by the standard error of the mean to obtain a value of 2.5. But, instead of using the critical values of t (because they didn't exist) to test whether his sample mean was significantly different from the standard, he used the critical values of a normal distribution. He would have applied a two-tailed test because an alcohol level either significantly greater than, or significantly less than, the standard value would be undesirable. Therefore, if he employed a significance level α = 0.05, he would have compared his statistic value of 2.5 with the 0.025 (= α/2 = 0.05/2) upper critical value of the normal distribution. This critical value is 1.960, which is less than 2.5, so that Gosset would have rejected the hypothesis that his sample mean was equal to the standard. Consequently, he would have had to take remedial measures with the batch of beer, and in the worst case, he would have discarded it.

Remedial measures to correct the alcohol level of the batch would have been costly, and perhaps unnecessary. After devising the t-statistic, Gosset would have looked up the 0.025 (= α/2 = 0.05/2) upper critical value of the t-distribution for ν = 4 degrees of freedom and obtained 2.776. Comparing this with his statistic value of 2.5 would have led him to the conclusion that he could not reject the null hypothesis that his sample mean was equal to the standard alcohol level. Note that his failure to reject the null hypothesis does not mean that he could actually have claimed that his sample mean was equal to the standard; this result meant only that the evidence was insufficient to reject the null hypothesis. Consequently, there was no need to take costly remedial measures to correct the alcohol level of the batch of beer.
