Statistics » Range and quartiles

Contents:

Range
Quartiles
Interquartile range
Other methods of calculating the quartiles


Range

The difference between the lowest value and the highest value is the range.

Example
3, 5, 5, 6, 7, 8, 9, 10
Range = 10 – 3 = 7
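
The same calculation as a minimal Python sketch. The function name `value_range` is my own choice for illustration, not an established name.

```python
def value_range(values):
    """Return the highest value minus the lowest value."""
    return max(values) - min(values)


print(value_range([3, 5, 5, 6, 7, 8, 9, 10]))  # 7
```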

Quartiles

The lower or first quartile (Q1) is the median of the first half of values.
The upper or third quartile (Q3) is the median of the second half of values.

The median itself is actually the second or middle quartile (Q2).
The quartiles divide the dataset into four groups, each containing 25% of the values.

Example even number of values

1, 2, 5, 6, 7, 8, 9, 11
The median is (6 + 7) : 2 = 6.5.
The lower quartile is the median of the first half:
1, 2, 5, 6 so Q1 = (2 + 5) : 2 = 3.5.
The upper quartile is the median of the second half:
7, 8, 9, 11 so Q3 = (8 + 9) : 2 = 8.5.
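
For an even number of values, the "median of each half" rule can be sketched in Python as follows. The helper name `quartiles_even` is hypothetical, and the sketch deliberately refuses an odd number of values, because that case needs an extra choice that is discussed further on.

```python
from statistics import median


def quartiles_even(values):
    """Return (Q1, Q2, Q3) for a dataset with an even number of values."""
    data = sorted(values)
    if len(data) % 2 != 0:
        raise ValueError("this sketch only handles an even number of values")
    half = len(data) // 2
    # Q1 is the median of the first half, Q3 the median of the second half.
    return median(data[:half]), median(data), median(data[half:])


print(quartiles_even([1, 2, 5, 6, 7, 8, 9, 11]))  # (3.5, 6.5, 8.5)
```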

Odd number of values

With an odd number of values a problem arises: you can never make four equal groups that each contain 25% of the values. Because of this, people have over the years come up with different methods to calculate the values between the quartiles (called hinges). Mathematicians do not agree on which of these methods is the best. In schools, Tukey's method (including the median) or the method of Moore & McCabe (excluding the median) is usually taught.

Example odd number of values
1, 2, 4, 5, 7, 8, 9, 10, 12, 15, 16, 17, 20
The median is 9.

Method of Tukey

The lower quartile is the median of the first half including the median:
1, 2, 4, 5, 7, 8, 9 so Q1 = 5.
The upper quartile is the median of the second half including the median:
9, 10, 12, 15, 16, 17, 20 so Q3 = 15.

Method of Moore & McCabe

The lower quartile is the median of the first half excluding the median:
1, 2, 4, 5, 7, 8 so Q1 = (4 + 5) : 2 = 4.5.
The upper quartile is the median of the second half excluding the median:
10, 12, 15, 16, 17, 20 so Q3 = (15 + 16) : 2 = 15.5.
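
The only difference between the two methods is whether the median belongs to both halves or to neither. A minimal Python sketch of that choice follows; the function name `quartiles_odd` and the flag `include_median` are my own, chosen for illustration.

```python
from statistics import median


def quartiles_odd(values, include_median):
    """Return (Q1, Q3) for a dataset with an odd number of values.

    include_median=True  follows Tukey's method,
    include_median=False follows Moore & McCabe's method.
    """
    data = sorted(values)
    if len(data) % 2 != 1:
        raise ValueError("this sketch only handles an odd number of values")
    mid = len(data) // 2  # index of the median
    if include_median:
        lower, upper = data[:mid + 1], data[mid:]
    else:
        lower, upper = data[:mid], data[mid + 1:]
    return median(lower), median(upper)


data = [1, 2, 4, 5, 7, 8, 9, 10, 12, 15, 16, 17, 20]
print(quartiles_odd(data, include_median=True))   # (5, 15)
print(quartiles_odd(data, include_median=False))  # (4.5, 15.5)
```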

Interquartile range

The interquartile range is the difference between the upper and the lower quartile: Q3 – Q1.

Example
In the examples above, the interquartile ranges are:
Example even number: 8.5 – 3.5 = 5
Example odd number Tukey: 15 – 5 = 10
Example odd number Moore & McCabe: 15.5 – 4.5 = 11


Other methods of calculating the quartiles

Earlier on this page the methods of Tukey and Moore & McCabe were already mentioned. Tukey came up with the box plot, which is why his method is widely used. Texas Instruments uses Moore & McCabe's method in its graphing calculators, which is why that method is also mentioned above.

As examples, I will use the same datasets as in the examples above.

Method of Mendenhall & Sincich

With this method you will first calculate which values are the hinges.
You use the following formulas (where n is the number of values):
For Q1: 0.25(n + 1)
For Q3: 0.75(n + 1)
If the outcome is not a whole number, you round to the nearest integer. If the outcome ends in .5, you round up for Q1 and down for Q3.
The use of these formulas also results in different values for Q1 and Q3 with an even number of values.
Hint: If you know that with 0.5(n + 1) you can calculate which value is the median (Q2), you will probably understand better how they got these formulas.

Example even number of values
1, 2, 5, 6, 7, 8, 9, 11
There are 8 values therefore you will get:
For Q1: 0.25(8 + 1) = 2.25 thus the second value
For Q3: 0.75(8 + 1) = 6.75 thus the seventh value
With this method Q1 = 2 and Q3 = 9.

Note: this method makes sure by rounding off that the values of Q1 and Q3 are actual values from the dataset, while the median (Q2 = 6.5) is not ...

Example odd number of values
1, 2, 4, 5, 7, 8, 9, 10, 12, 15, 16, 17, 20
There are 13 values therefore you will get:
For Q1: 0.25(13 + 1) = 3.5 thus the fourth value
For Q3: 0.75(13 + 1) = 10.5 thus the tenth value
With this method Q1 = 5 and Q3 = 15.
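
A minimal Python sketch of this rounding rule. The function name is hypothetical, and I assume, as in the example above, that a .5 is rounded up for Q1 and down for Q3.

```python
import math


def quartiles_mendenhall_sincich(values):
    """Return (Q1, Q3), both taken as actual values from the dataset."""
    data = sorted(values)
    n = len(data)
    # Nearest integer; a position ending in .5 is rounded up for Q1.
    q1_pos = math.floor(0.25 * (n + 1) + 0.5)
    # Nearest integer; a position ending in .5 is rounded down for Q3.
    q3_pos = math.ceil(0.75 * (n + 1) - 0.5)
    # Positions count from 1, Python lists count from 0.
    return data[q1_pos - 1], data[q3_pos - 1]


print(quartiles_mendenhall_sincich([1, 2, 5, 6, 7, 8, 9, 11]))  # (2, 9)
print(quartiles_mendenhall_sincich(
    [1, 2, 4, 5, 7, 8, 9, 10, 12, 15, 16, 17, 20]))             # (5, 15)
```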

Method of Minitab

Minitab is computer software with the functionality of Excel but with more possibilities for statistics. Its method uses the same formulas as Mendenhall & Sincich, but it does not round off. Instead it uses interpolation: with 3.5 you take the value halfway between the third and fourth value, and with 3.25 you take the value one fourth of the way from the third to the fourth value.

Example even number of values
1, 2, 5, 6, 7, 8, 9, 11
There are 8 values therefore you will get:
For Q1: 0.25(8 + 1) = 2.25
For Q3: 0.75(8 + 1) = 6.75
With this method Q1 and Q3 are:
Q1 = 0.75 × 2 + 0.25 × 5 = 2.75
Q3 = 0.25 × 8 + 0.75 × 9 = 8.75

Example odd number of values
1, 2, 4, 5, 7, 8, 9, 10, 12, 15, 16, 17, 20
There are 13 values therefore you will get:
For Q1: 0.25(13 + 1) = 3.5
For Q3: 0.75(13 + 1) = 10.5
With this method Q1 = 4.5 and Q3 = 15.5.
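
The interpolation step can be sketched in Python like this. The names `value_at` and `quartiles_minitab` are my own, and clamping the position for very small datasets is an assumption of this sketch, not something taken from Minitab.

```python
import math


def value_at(data, pos):
    """Interpolate linearly at a 1-based, possibly fractional position."""
    pos = min(max(pos, 1), len(data))  # assumption: clamp to the valid range
    lo = math.floor(pos)
    frac = pos - lo
    if frac == 0:
        return data[lo - 1]
    # E.g. position 2.25 gives 0.75 x (2nd value) + 0.25 x (3rd value).
    return (1 - frac) * data[lo - 1] + frac * data[lo]


def quartiles_minitab(values):
    """Return (Q1, Q3) at positions 0.25(n + 1) and 0.75(n + 1)."""
    data = sorted(values)
    n = len(data)
    return value_at(data, 0.25 * (n + 1)), value_at(data, 0.75 * (n + 1))


print(quartiles_minitab([1, 2, 5, 6, 7, 8, 9, 11]))  # (2.75, 8.75)
print(quartiles_minitab(
    [1, 2, 4, 5, 7, 8, 9, 10, 12, 15, 16, 17, 20]))  # (4.5, 15.5)
```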

Method of Freund & Perles / =QUARTILE.INC

This method is used by Excel, when you use the formula '=QUARTILE.INC' or the older '=QUARTILE'.
They use the following formulas.
For Q1: 0.25(n + 3)
For Q3: 0.25(3n + 1)
When the outcome is not a whole number, they also use interpolation.

Example even number of values
1, 2, 5, 6, 7, 8, 9, 11
There are 8 values therefore you will get:
For Q1: 0.25(8 + 3) = 2.75
For Q3: 0.25(3 × 8 + 1) = 6.25
With this method Q1 and Q3 are:
Q1 = 0.25 × 2 + 0.75 × 5 = 4.25
Q3 = 0.75 × 8 + 0.25 × 9 = 8.25

Example odd number of values
1, 2, 4, 5, 7, 8, 9, 10, 12, 15, 16, 17, 20
There are 13 values therefore you will get:
For Q1: 0.25(13 + 3) = 4
For Q3: 0.25(3 × 13 + 1) = 10
With this method Q1 = 5 and Q3 = 15.
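
A minimal Python sketch of the formulas 0.25(n + 3) and 0.25(3n + 1) with interpolation; the function name is hypothetical. As a cross-check I also call `statistics.quantiles` from Python's standard library with `method="inclusive"`, which, as far as I know, interpolates at the same positions.

```python
import math
from statistics import quantiles


def quartiles_freund_perles(values):
    """Return (Q1, Q3) at positions 0.25(n + 3) and 0.25(3n + 1)."""
    data = sorted(values)
    n = len(data)

    def value_at(pos):
        # Linear interpolation at a 1-based, possibly fractional position.
        lo = math.floor(pos)
        frac = pos - lo
        if frac == 0:
            return data[lo - 1]
        return (1 - frac) * data[lo - 1] + frac * data[lo]

    return value_at(0.25 * (n + 3)), value_at(0.25 * (3 * n + 1))


data = [1, 2, 5, 6, 7, 8, 9, 11]
print(quartiles_freund_perles(data))  # (4.25, 8.25)

# Cross-check with the standard library (returns Q1, Q2 and Q3).
q1, _, q3 = quantiles(data, n=4, method="inclusive")
print(q1, q3)  # 4.25 8.25
```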

=QUARTILE.EXC

However, you can also choose the formula '=QUARTILE.EXC' in Excel. This formula uses the same formulas as Minitab, 0.25(n + 1) for Q1 and 0.75(n + 1) for Q3, and also uses interpolation, so it gives the same results as Minitab's method. EXC stands for exclusive: the quartiles are based on percentile values from 0 to 1, exclusive.

Example even number of values
1, 2, 5, 6, 7, 8, 9, 11
There are 8 values therefore you will get:
For Q1: 0.25(8 + 1) = 2.25
For Q3: 0.75(8 + 1) = 6.75
With this method Q1 and Q3 are:
Q1 = 0.75 × 2 + 0.25 × 5 = 2.75
Q3 = 0.25 × 8 + 0.75 × 9 = 8.75

Example odd number of values
1, 2, 4, 5, 7, 8, 9, 10, 12, 15, 16, 17, 20
There are 13 values therefore you will get:
For Q1: 0.25(13 + 1) = 3.5
For Q3: 0.75(13 + 1) = 10.5
With this method Q1 and Q3 are:
Q1 = 0.5 × 4 + 0.5 × 5 = 4.5
Q3 = 0.5 × 15 + 0.5 × 16 = 15.5
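
If you want to check such values outside Excel, Python's standard library has `statistics.quantiles` with `method="exclusive"`. The sketch below assumes that this option follows the same rule as '=QUARTILE.EXC' (interpolating at positions 0.25(n + 1) and 0.75(n + 1)); the wrapper name `quartiles_exclusive` is my own. For the even example it reproduces 2.75 and 8.75.

```python
from statistics import quantiles


def quartiles_exclusive(values):
    """Return (Q1, Q3) using the standard library's exclusive method."""
    q1, _, q3 = quantiles(values, n=4, method="exclusive")
    return q1, q3


print(quartiles_exclusive([1, 2, 5, 6, 7, 8, 9, 11]))  # (2.75, 8.75)
```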

Comparison of methods using a table

Example even number of values
Example: 1, 2, 5, 6, 7, 8, 9, 11

Method                             Q1      Q3
Tukey                              3.5     8.5
Moore & McCabe                     3.5     8.5
Mendenhall & Sincich               2       9
Minitab                            2.75    8.75
Freund & Perles / =QUARTILE.INC    4.25    8.25
=QUARTILE.EXC                      2.75    8.75

Example odd number of values
Example: 1, 2, 4, 5, 7, 8, 9, 10, 12, 15, 16, 17, 20

Method                             Q1      Q3
Tukey                              5       15
Moore & McCabe                     4.5     15.5
Mendenhall & Sincich               5       15
Minitab                            4.5     15.5
Freund & Perles / =QUARTILE.INC    5       15
=QUARTILE.EXC                      4.5     15.5

You can see that some methods give the same results for an even number of values but not for an odd number of values, and vice versa. Which values you get for Q1 and Q3 therefore depends on the method you use.

What is the best method?

Maybe you find it odd, weird, terrible or annoying that there is more than one method to calculate the hinges between the quartiles, especially when you are in secondary school and this is the first time you are confronted with the fact that there is not just one way to calculate something. In that case I want you to think about these two things:
Firstly: It shows beautifully that the definition of quartiles (four groups of 25%) directly gives an impossible task with an odd number of values. Because of this you get different opinions and different methods. In the search for better formulas for calculating which value is the hinge, people also ended up with different hinges for an even number of values, even though Tukey's method at first sight looks so logical and neat there. Appearances can be deceiving.
Secondly: Statisticians most often work with large datasets of more than 250 values. The values are then so close to each other that the different methods give practically the same hinges. The whole question is more a problem for perfectionists and curious sticklers. Statisticians themselves just choose one of the methods according to their preference. To illustrate: if I had taken the dataset 4, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6 as an example, all methods would give Q1 = 4 and Q3 = 6.
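
The last claim is easy to check for a few of the methods. The sketch below is only an illustration: it covers the two "median of the halves" methods and the two interpolating methods from Python's standard library (`method="inclusive"` and `method="exclusive"`), and the dictionary name `results` is my own.

```python
from statistics import median, quantiles

data = sorted([4, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6])
mid = len(data) // 2  # index of the median (odd number of values)

results = {
    "Tukey": (median(data[:mid + 1]), median(data[mid:])),
    "Moore & McCabe": (median(data[:mid]), median(data[mid + 1:])),
    # quantiles() returns Q1, Q2, Q3; [::2] keeps Q1 and Q3.
    "inclusive": tuple(quantiles(data, n=4, method="inclusive")[::2]),
    "exclusive": tuple(quantiles(data, n=4, method="exclusive")[::2]),
}

for name, (q1, q3) in results.items():
    print(f"{name}: Q1 = {q1}, Q3 = {q3}")
```

Every line should show Q1 equal to 4 and Q3 equal to 6, although some values are printed as 4.0 and 6.0.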

