This article is intended to be self-contained. However, some concepts, such as standard deviation, are discussed in more detail in the previous article.
What is a significance test?
You’ve reached the point towards the end of a lab report, and you need to compare your experimentally derived mean value with a literature value. Then you need to comment on the ‘significance’ of the difference between the two. So how do you do that?
This article looks at the Student’s t-Test: a method to assess whether the difference between two sets of data is ‘significant’.
It’s been written as a practical guide, so I’m not going to include some of the more ‘gory’ statistical details. If you want to learn more about significance testing in a broader sense, I suggest Chapter 4 of “Practical Statistics for the Analytical Scientist” by Ellison, Barwick, and Duguid Farrant. For Edinburgh students, it’s available as a free e-book via the University of Edinburgh intranet.
Quality testing at Guinness
The Student’s t-Test is a significance test, developed by William Sealy Gosset in the 1900s. While working at Guinness, he developed the test as a cheap way of monitoring the quality of the stout.
At the time, Guinness forbade employees from publishing results openly, so it was necessary for Gosset to use a pseudonym for his test.
It’s called the Student’s t-Test, rather than Gosset’s t-Test for this reason.
So what is it, and why do we use it?
The aim of the Student’s t-Test, as used here, is to compare an experimental value (determined as the mean of a series of measurements) with a ‘literature value’ taken from a paper, or some value quoted in your laboratory manual. This comparison shows how closely your results agree with published data.
To demonstrate how the Student’s t-Test is used, I’m going to use the same example of ‘activation energy’ discussed in the previous article, All About Errors. (See Table 1 below.)
The purpose of significance testing is to calculate the probability that observed differences occur as a result of random fluctuations. In other words, the likelihood of two values being different due to chance alone. So it’s necessary to define this ‘likeliness’ as a probability, called the significance level, α.
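Most commonly (and for the sake of simplicity), the significance level is set to 0.05. α = 0.05 means that, if there were no real difference between the two values, a difference at least as large as the one observed would arise from random fluctuations only 5% of the time. The significance level is closely related to the confidence level (CL), essentially the ‘unlikeliness’ of a difference arising due to chance alone:
$$\mathrm{CL} = 1 - \alpha$$ (Eqn 1)
So α = 0.05 corresponds to a confidence level of 0.95, or 95%.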
Table 1 shows the results of 12 measurements of activation energy (Ea) for some reaction. The literature value of the activation energy for this reaction is 38.942 kJ/mol.
Table 1. Sample measurements of Ea (kJ/mol)
Measurement | Ea (kJ/mol) |
1 | 38.542 |
2 | 38.751 |
3 | 39.016 |
4 | 38.768 |
5 | 38.943 |
6 | 38.426 |
7 | 39.124 |
8 | 38.546 |
9 | 38.864 |
10 | 38.687 |
11 | 38.579 |
12 | 38.798 |
Student’s t-Test statistic
Three quantities are required to calculate the t-Test statistic, t:
1. Mean activation energy: x̄ = 38.753 kJ/mol.
2. Standard deviation: s = 0.210 kJ/mol. (Deriving the mean and standard deviation is discussed in All About Errors.)
3. Literature value: μ = 38.942 kJ/mol.
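$$t = \frac{|\bar{x} - \mu|}{s / \sqrt{n}}$$ (Eqn 2)
where n is the number of values in the data set. Since there are 12 measurements in Table 1, n = 12. This means that for the given data set and literature value, the t-Test statistic is:
$$t = \frac{|38.753 - 38.942|}{0.210 / \sqrt{12}} \approx 3.12$$ (Eqn 3)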
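If you would rather let a computer do the arithmetic, the calculation can be sketched in Python using only the standard library. This is a minimal illustration, not part of the original worked example: the names `ea_values` and `one_sample_t` are my own, and the statistic is taken as the absolute difference between the mean and the literature value, divided by s/√n. Because it works from the unrounded measurements, the result may differ slightly in the last digit from a hand calculation that uses the rounded mean and standard deviation quoted above.

```python
import math
import statistics

# Activation energies from Table 1, in kJ/mol.
ea_values = [
    38.542, 38.751, 39.016, 38.768, 38.943, 38.426,
    39.124, 38.546, 38.864, 38.687, 38.579, 38.798,
]
literature_value = 38.942  # mu, in kJ/mol


def one_sample_t(values, mu):
    """Return (mean, sample standard deviation, t statistic) for a data set."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    t = abs(mean - mu) * math.sqrt(n) / s
    return mean, s, t


mean, s, t = one_sample_t(ea_values, literature_value)
print(f"mean = {mean:.3f} kJ/mol, s = {s:.3f} kJ/mol, t = {t:.2f}")
```

Note that `statistics.stdev` is the sample standard deviation; `statistics.pstdev` (the population version) would give a slightly smaller value and is not what the t-Test calls for.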
The critical value
The value of the t-Test statistic is meaningless without something to compare it to. This is where the significance level comes in; the significance level is used to find the critical value. The critical value can be found from ‘statistical tables’ located at the back of statistics textbooks or online.
The easiest way to find the critical value is to use Microsoft Excel with the formula “=TINV(α,ν)”, which takes the desired significance level (α) and the degrees of freedom (ν). (In recent versions of Excel, “=T.INV.2T(α,ν)” returns the same two-tailed value.) The degrees of freedom are the number of values in the data set (n) minus one:
$$\nu = n - 1$$ (Eqn 4)
(Note that the symbol for degrees of freedom is the Greek letter nu, ν, not the letter ‘v’.)
The Microsoft Excel formula will return the critical value for the particular α and ν. For α = 0.05 and ν = 11, as is the case in our activation energy example, the critical value is:
$$t_{\mathrm{crit}} = 2.201$$ (Eqn 5)
An example of how to use the Microsoft Excel formula to find this value is shown in Figure 1.
Figure 1. How to use the function in Microsoft Excel to find the critical value for a given significance level and degrees of freedom.
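If you don’t have Excel to hand, the same two-tailed critical value can be approximated in Python. The sketch below is my own illustration rather than anything Excel does internally: it uses only the standard library, integrates the t-distribution numerically with Simpson’s rule, and searches for the critical value by bisection. The function names (`t_pdf`, `central_probability`, `two_tailed_critical_value`) are hypothetical. If SciPy is installed, `scipy.stats.t.ppf(1 - alpha / 2, nu)` is the more usual route to the same number.

```python
import math


def t_pdf(x, nu):
    """Probability density of Student's t-distribution with nu degrees of freedom."""
    log_coeff = math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
    coeff = math.exp(log_coeff) / math.sqrt(nu * math.pi)
    return coeff * (1 + x * x / nu) ** (-(nu + 1) / 2)


def central_probability(c, nu, steps=2000):
    """Probability that -c <= t <= c, via Simpson's rule (steps must be even)."""
    h = c / steps
    total = t_pdf(0.0, nu) + t_pdf(c, nu)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2  # odd points weigh 4, even points weigh 2
        total += weight * t_pdf(i * h, nu)
    # Integral from 0 to c, doubled because the distribution is symmetric.
    return 2 * total * h / 3


def two_tailed_critical_value(alpha, nu):
    """Find c such that P(|t| > c) = alpha, using bisection."""
    target = 1 - alpha
    lo, hi = 0.0, 1.0
    while central_probability(hi, nu) < target:  # widen until c is bracketed
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if central_probability(mid, nu) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


t_crit = two_tailed_critical_value(alpha=0.05, nu=11)
print(f"critical value = {t_crit:.3f}")
```

The result should agree with statistical tables and with Excel to about three decimal places; it is a numerical approximation, so don’t rely on digits beyond that.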
Comparing the t-Test statistic with critical value
When comparing the calculated t-Test statistic with the critical value, there are two possible outcomes:
- t-Test statistic < critical value
The experimentally determined value is not significantly different from the literature value. The difference between the two values can reasonably be attributed to chance.
- t-Test statistic ≥ critical value
The experimentally determined value is significantly different from the literature value. This means that you should consider experimental errors that could give rise to this difference.
So is our difference significant?
As can be seen from Equations 3 and 5 in our example of activation energy, the t-Test statistic is greater than the critical value. This indicates that the difference between the experimental mean and the literature value is unlikely to be due to chance alone, and is more likely caused by some experimental or systematic error that should be accounted for. This could be due to poor experimental practice, low precision in equipment, or some other factor.
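The comparison itself is simple enough to write as a short Python check. This is only a sketch of the decision rule described above: the function name `is_significant` is my own, the summary values are the rounded ones quoted earlier in the article, and 2.201 is the tabulated two-tailed critical value for α = 0.05 and ν = 11.

```python
import math

# Rounded summary values quoted in the article, in kJ/mol.
mean_ea = 38.753
std_dev = 0.210
literature_value = 38.942
n = 12

# Two-tailed critical value for alpha = 0.05 and nu = n - 1 = 11,
# as given by statistical tables or Excel's TINV(0.05, 11).
t_critical = 2.201


def is_significant(t_statistic, critical_value):
    """Decision rule: the difference is significant when t >= critical value."""
    return t_statistic >= critical_value


t_statistic = abs(mean_ea - literature_value) * math.sqrt(n) / std_dev

if is_significant(t_statistic, t_critical):
    print(f"t = {t_statistic:.2f} >= {t_critical}: significantly different")
else:
    print(f"t = {t_statistic:.2f} < {t_critical}: not significantly different")
```

For the activation energy data, the first branch is taken, which matches the conclusion reached above.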
It should be noted that the Student’s t-Test is just one of a range of significance tests, the use of which depends on the type of “distribution” under study (this is some of the more ‘gory’ detail discussed in the book “Practical Statistics for the Analytical Scientist” mentioned earlier).