Question: 

Researchers were interested in looking at air quality in different regions of the United States. The following dotplot represents the number of days in a certain month that the air quality was unhealthy for eighteen cities in the Midwest.

Which of the following is the boxplot for these data?

(A)

(B)

(C)

(D)

Level: 
Intermediate

Standards

6.SP.4: Display numerical data in plots on a number line, including dot plots, histograms, and box plots.

S-ID.1: Represent data with plots on the real number line (dot plots, histograms, and box plots).

Correct answer and commentary

The correct answer to this item is Option (C). A boxplot is a visual representation of the five-number summary of a data set: minimum, first quartile (Q1), median (second quartile, Q2), third quartile (Q3), and maximum. In a traditional boxplot, the whiskers extend to the minimum and maximum, and the three vertical lines of the box mark the first quartile, median, and third quartile. To determine which boxplot correctly represents the data in the dotplot, we use the dotplot to calculate values from the five-number summary. First, because there are 18 data points, the median is the average of the 9th and 10th values in the ordered data set. Thus, the median is 2.5. The only two answer choices with this median are Option (C) and Option (D), which are distinguished by the location of the third quartile. The third quartile is the median of the upper 9 data points (the 10th through 18th values), which is the 14th value in the data set ordered from lowest to highest. Thus, the third quartile is 7, making Option (C) the correct choice.
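The calculation described above can be sketched in a few lines of Python. The dotplot itself is not reproduced here, so the list `hypothetical_days` is an invented stand-in, not the item's actual data. It was chosen only to have 18 values, a median of 2.5, and a third quartile of 7. The function name `five_number_summary` is likewise an assumption for illustration. It uses the median-of-halves rule described in the commentary; other quartile conventions can give slightly different values.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    # Lower half: the smallest n // 2 values.
    lower = data[:half]
    # Upper half: the largest n // 2 values (the middle value is
    # excluded from both halves when n is odd).
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

# HYPOTHETICAL data: 18 invented values, NOT the data from the dotplot.
hypothetical_days = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 4, 5, 7, 8, 9, 10, 12]

summary = five_number_summary(hypothetical_days)
print(summary)
```

With 18 values, the median averages the 9th and 10th ordered values, and Q3 is the 5th value of the upper nine, which is the 14th value overall.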

The boxplot in Option (D) has the correct median; however, it displays an incorrect value for the third quartile. Students who select this choice may have a solid understanding of the median but lack an understanding of quartiles. The boxplot in Option (A) is simply centered halfway between the minimum and maximum and does not contain correct values for the median or the quartiles. The boxplot in Option (B) displays the correct quartiles but an incorrect median. This boxplot has visual characteristics similar to those of the dotplot, with most of the data clustered at the lower values. However, it can be difficult to estimate the median of a data set visually, which can lead to inaccuracies.

Using various graphical displays to analyze data and communicate statistical results is an important statistical skill. In particular, students should recognize that multiple displays may be appropriate for a single data set and that different displays highlight different features of the raw data.

Student performance