Introduction

Box plots, also called box-and-whisker plots or box-whisker plots, give a good graphical image of the concentration of the data. They also show how far the extreme values are from most of the data. As mentioned previously, a box plot is constructed from five values: the minimum value, the first quartile, the median, the third quartile, and the maximum value. We use these values to compare how close other data values are to them.

To construct a box plot, use a horizontal or vertical number line and a rectangular box. The smallest and largest data values label the endpoints of the axis. The first quartile marks one end of the box, and the third quartile marks the other end of the box. Approximately the middle 50 percent of the data fall inside the box. The whiskers extend from the ends of the box to the smallest and largest data values. A box plot easily shows the range of a data set, which is the difference between the largest and smallest data values (or the difference between the maximum and minimum). Unless the median, first quartile, and third quartile are the same value, the median will lie inside the box or between the first and third quartiles. The box plot gives a good, quick picture of the data.

NOTE

You may encounter box-and-whisker plots that have dots marking outlier values. In those cases, the whiskers do not extend to the minimum and maximum values.

Consider, again, the following data set:

1, 1, 2, 2, 4, 6, 6.8, 7.2, 8, 8.3, 9, 10, 10, 11.5

The first quartile is two, the median is seven, and the third quartile is nine. The smallest value is one, and the largest value is 11.5. The following image shows the constructed box plot.

NOTE

See the calculator instructions on the TI web site or in the appendix.

Horizontal box plot. The first whisker extends from the smallest value, 1, to the first quartile, 2. The box begins at the first quartile and extends to the third quartile, 9, with a vertical dashed line drawn at the median, 7. The second whisker extends from the third quartile to the largest value, 11.5.
Figure 2.13

The two whiskers extend from the first quartile to the smallest value and from the third quartile to the largest value. The median is shown with a dashed line.
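The five values behind this box plot can also be computed in code. The following Python sketch assumes the quartile rule that reproduces the values quoted in this section: the first quartile is the median of the lower half of the sorted data, the third quartile is the median of the upper half, and the middle value is left out of both halves when the number of values is odd. The function name `five_number_summary` is an illustrative choice, and other software may use a different quartile rule.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule.

    Assumes at least two data values.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # lower half of the sorted data
    upper = data[half + n % 2:]    # upper half (middle value skipped when n is odd)
    return (data[0], median(lower), median(data), median(upper), data[-1])

data = [1, 1, 2, 2, 4, 6, 6.8, 7.2, 8, 8.3, 9, 10, 10, 11.5]
print(five_number_summary(data))
```

For this data set, the sketch returns a minimum of 1, a first quartile of 2, a median of 7, a third quartile of 9, and a maximum of 11.5, matching the values used to draw the box plot.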

NOTE

It is important to start a box plot with a scaled number line. Otherwise, the box plot may not be useful.

Example 2.24

The following data are the heights of 40 students in a statistics class:

59, 60, 61, 62, 62, 63, 63, 64, 64, 64, 65, 65, 65, 65, 65, 65, 65, 65, 65, 66, 66, 67, 67, 68, 68, 69, 70, 70, 70, 70, 70, 71, 71, 72, 72, 73, 74, 74, 75, 77

Calculator instructions for finding the five-number summary follow this example. Construct a box plot with the following properties:

  • Minimum value = 59
  • Maximum value = 77
  • Q1: First quartile = 64.5
  • Q2: Second quartile or median = 66
  • Q3: Third quartile = 70
Horizontal boxplot with first whisker extending from smallest value, 59, to Q1, 64.5, box beginning from Q1 to Q3, 70, median dashed line at Q2, 66, and second whisker extending from Q3 to largest value, 77.
Figure 2.14
  1. Each quarter has approximately 25 percent of the data.
  2. The spreads of the four quarters are 64.5 – 59 = 5.5 (first quarter), 66 – 64.5 = 1.5 (second quarter), 70 – 66 = 4 (third quarter), and 77 – 70 = 7 (fourth quarter). So, the second quarter has the smallest spread, and the fourth quarter has the largest spread.
  3. Range = maximum value – minimum value = 77 – 59 = 18.
  4. Interquartile Range: IQR = Q3 – Q1 = 70 – 64.5 = 5.5.
  5. The interval 59–65 has more than 25 percent of the data, so it has more data in it than the interval 66–70, which has 25 percent of the data.
  6. The middle 50 percent (middle half) of the data has a range of 5.5 inches.
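The arithmetic in this list can be checked with a few lines of Python. This is a minimal sketch that starts from the five values given in the example rather than from the raw data; the variable names are illustrative.

```python
# Five-number summary from Example 2.24 (heights in inches).
minimum, q1, med, q3, maximum = 59, 64.5, 66, 70, 77

# Spread of each quarter of the data.
quarter_spreads = {
    "first": q1 - minimum,
    "second": med - q1,
    "third": q3 - med,
    "fourth": maximum - q3,
}

data_range = maximum - minimum   # largest value minus smallest value
iqr = q3 - q1                    # spread of the middle 50 percent

narrowest = min(quarter_spreads, key=quarter_spreads.get)
widest = max(quarter_spreads, key=quarter_spreads.get)

print(quarter_spreads)
print("range:", data_range, "IQR:", iqr)
print("smallest spread:", narrowest, "largest spread:", widest)
```

The quarter spreads come out as 5.5, 1.5, 4, and 7, so the second quarter is the narrowest and the fourth is the widest, in agreement with the list.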

Using the TI-83, 83+, 84, 84+ Calculator

To find the minimum, maximum, and quartiles:

Enter data into the list editor (Press STAT 1:EDIT). If you need to clear the list, arrow up to the name L1, press CLEAR, and then arrow down.

Put the data values into the list L1.

Press STAT and arrow to CALC. Press 1:1-VarStats. Enter L1.

Press ENTER.

Use the down and up arrow keys to scroll.

Smallest value = 59.

Largest value = 77.

Q1: First quartile = 64.5.

Q2: Second quartile or median = 66.

Q3: Third quartile = 70.

To construct the box plot:

Press 4:Plotsoff. Press ENTER.

Arrow down and then use the right arrow key to go to the fifth picture, which is the box plot. Press ENTER.

Arrow down to Xlist: Press 2nd 1 for L1.

Arrow down to Freq: Press ALPHA. Press 1.

Press ZOOM. Press 9:ZoomStat.

Press TRACE and use the arrow keys to examine the box plot.
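If no graphing calculator is at hand, a rough box plot can be drawn as a line of text. The Python sketch below is only an illustration of how the five values map onto a scaled number line; the function name `text_box_plot` and the characters used for the whiskers, box, and median are arbitrary choices, not a standard.

```python
def text_box_plot(minimum, q1, med, q3, maximum, width=61):
    """Return a one-line box plot drawn on a scale from minimum to maximum.

    '-' marks the whiskers, '=' the box, '|' the box ends (Q1 and Q3),
    and ':' the median.
    """
    if maximum == minimum:
        raise ValueError("the data need a nonzero range to scale the number line")
    scale = (width - 1) / (maximum - minimum)

    def col(x):
        # Column on the scaled number line for the value x.
        return round((x - minimum) * scale)

    row = ["-"] * width                    # whiskers span the whole range
    for i in range(col(q1), col(q3) + 1):
        row[i] = "="                       # box covers Q1 through Q3
    row[col(med)] = ":"                    # median inside the box
    row[col(q1)] = "|"                     # box ends are drawn last, so a median
    row[col(q3)] = "|"                     # that coincides with Q1 or Q3 is hidden
    return "".join(row)

# Five-number summary of the 40 heights in Example 2.24.
print(text_box_plot(59, 64.5, 66, 70, 77, width=37))
```

With a width of 37 columns, each column represents half an inch, so the short distance from Q1 to the median and the long right whisker are both easy to see.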

Try It 2.24

The following data are the number of pages in 40 books on a shelf. Construct a box plot using a graphing calculator and state the interquartile range:

136, 140, 178, 190, 205, 215, 217, 218, 232, 234, 240, 255, 270, 275, 290, 301, 303, 315, 317, 318, 326, 333, 343, 349, 360, 369, 377, 388, 391, 392, 398, 400, 402, 405, 408, 422, 429, 450, 475, 512

For some data sets, two or more of the five values (the smallest value, first quartile, median, third quartile, and largest value) may be the same. For instance, you might have a data set in which the median and the third quartile are the same. In this case, the diagram would not have a dashed line inside the box displaying the median. The right side of the box would display both the third quartile and the median. For example, if the smallest value and the first quartile were both one, the median and the third quartile were both five, and the largest value was seven, the box plot would look like the following:

Horizontal box plot in which the box begins at 1, which is both the smallest value and Q1, and ends at 5, which is both the median and Q3. No separate median line is drawn, and the lone whisker extends from Q3 to the largest value, 7.
Figure 2.15

In this case, at least 25 percent of the values are equal to one. Twenty-five percent of the values are between one and five, inclusive. At least 25 percent of the values are equal to five. The top 25 percent of the values fall between five and seven, inclusive.

Example 2.25

Test scores for Mr. Ramirez's class held during the day are as follows:

99, 56, 78, 55.5, 32, 90, 80, 81, 56, 59, 45, 77, 84.5, 84, 70, 72, 68, 32, 79, 90

Test scores for Ms. Park's class held during the evening are as follows:

98, 78, 68, 83, 81, 89, 88, 76, 65, 45, 98, 90, 80, 84.5, 85, 79, 78, 98, 90, 79, 81, 25.5

  1. Find the smallest and largest values, the median, and the first and third quartiles for Mr. Ramirez's class.
  2. Find the smallest and largest values, the median, and the first and third quartiles for Ms. Park's class.
  3. For each data set, what percentage of the data is between the smallest value and the first quartile? the first quartile and the median? the median and the third quartile? the third quartile and the largest value? What percentage of the data is between the first quartile and the largest value?
  4. Create a box plot for each set of data. Use one number line for both box plots.
  5. Which box plot has the widest spread for the middle 50 percent of the data—the data between the first and third quartiles? What does this mean for that set of data in comparison to the other set of data?
Solution 2.25
  1. Mr. Ramirez's class:
    • Min = 32
    • Q1 = 56
    • M = 74.5
    • Q3 = 82.5
    • Max = 99
  2. Ms. Park's class:
    • Min = 25.5
    • Q1 = 78
    • M = 81
    • Q3 = 89
    • Max = 98
  3. Mr. Ramirez's class: There are six data values ranging from 32 to 56: 30 percent. There are six data values ranging from 56 to 74.5: 30 percent. There are five data values ranging from 74.5 to 82.5: 25 percent. There are five data values ranging from 82.5 to 99: 25 percent. There are 16 data values between the first quartile, 56, and the largest value, 99: 80 percent. Ms. Park's class: There are six data values ranging from 25.5 to 78: 27 percent. There are five data values ranging from 78 to the first instance of 81: 23 percent. There are six data values ranging from the second instance of 81 to 89: 27 percent. There are five data values ranging from 90 to 98: 23 percent. There are 17 values between the first quartile, 78, and the largest value, 98: 77 percent.
  4. Two box plots over a number line from 0 to 100. The top plot shows a whisker from 32 to 56, a solid line at 56, a dashed line at 74.5, a solid line at 82.5, and a whisker from 82.5 to 99. The lower plot shows a whisker from 25.5 to 78, a solid line at 78, a dashed line at 81, a solid line at 89, and a whisker from 89 to 98.
    Figure 2.16
  5. The first data set has the wider spread for the middle 50 percent of the data. The IQR for the first data set is greater than the IQR for the second set. This means that there is more variability in the middle 50 percent of the first data set.
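The counts and the IQR comparison in this solution can be checked directly against the raw scores. The Python sketch below takes the quartiles from the solution as given and counts how many scores lie at or above the first quartile in each class; the helper name `share_at_or_above` is an illustrative choice.

```python
ramirez = [99, 56, 78, 55.5, 32, 90, 80, 81, 56, 59,
           45, 77, 84.5, 84, 70, 72, 68, 32, 79, 90]
park = [98, 78, 68, 83, 81, 89, 88, 76, 65, 45, 98,
        90, 80, 84.5, 85, 79, 78, 98, 90, 79, 81, 25.5]

def share_at_or_above(values, cutoff):
    """Percentage of values that are greater than or equal to cutoff."""
    return 100 * sum(v >= cutoff for v in values) / len(values)

# Quartiles copied from Solution 2.25.
ramirez_q1, ramirez_q3 = 56, 82.5
park_q1, park_q3 = 78, 89

ramirez_share = share_at_or_above(ramirez, ramirez_q1)   # 16 of 20 scores
park_share = share_at_or_above(park, park_q1)            # 17 of 22 scores

ramirez_iqr = ramirez_q3 - ramirez_q1
park_iqr = park_q3 - park_q1

print(ramirez_share, round(park_share))
print("IQR:", ramirez_iqr, "versus", park_iqr)
```

Because several scores tie with the first quartile, the share at or above it is a little more than the nominal 75 percent in both classes. The IQRs, 26.5 for the day class and 11 for the evening class, confirm that the day class has the wider middle 50 percent.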
Try It 2.25

The following data set shows the heights in inches for the boys in a class of 40 students:

66, 66, 67, 67, 68, 68, 68, 68, 68, 69, 69, 69, 70, 71, 72, 72, 72, 73, 73, 74

The following data set shows the heights in inches for the girls in a class of 40 students:

61, 61, 62, 62, 63, 63, 63, 65, 65, 65, 66, 66, 66, 67, 68, 68, 68, 69, 69, 69

Construct a box plot using a graphing calculator for each data set, and state which box plot has the wider spread for the middle 50 percent of the data.

Example 2.26

Graph a box-and-whisker plot for the data values shown:

10, 10, 10, 15, 35, 75, 90, 95, 100, 175, 420, 490, 515, 515, 790

The five numbers used to create a box-and-whisker plot are as follows:

  • Min: 10
  • Q1: 15
  • Med: 95
  • Q3: 490
  • Max: 790

The following graph shows the box-and-whisker plot:

This box-and-whisker plot shows the numbers 10, 15, 95, 490, and 790 on a number line. The whiskers extend from 10 to 15 and from 490 to 790, the box spans 15 to 490, and the median is marked at 95.
Figure 2.17
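As a cross-check on the five numbers in this example, Python's standard library provides `statistics.quantiles`, which returns the three quartile cut points. This is only a sketch: the function's default "exclusive" method interpolates between data values, so it happens to agree with the values above for this data set, but it is not guaranteed to match the calculator's quartiles for every data set.

```python
from statistics import quantiles

data = [10, 10, 10, 15, 35, 75, 90, 95, 100, 175, 420, 490, 515, 515, 790]

# quantiles() with n=4 returns the cut points [Q1, median, Q3].
q1, med, q3 = quantiles(data, n=4)

summary = (min(data), q1, med, q3, max(data))
print(summary)
```

Here the cut points fall exactly on the fourth, eighth, and twelfth sorted values, which is why no interpolation is needed and the result is 10, 15, 95, 490, and 790.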
 
Try It 2.26

Follow the steps you used to graph a box-and-whisker plot for the data values shown:

0, 5, 5, 15, 30, 30, 45, 50, 50, 60, 75, 110, 140, 240, 330