## Seeing Through Statistics 4th Edition by Utts – Test Bank

**CHAPTER 7**

**Summarizing and displaying measurement data**

**SECTION 7.1**

**Turning data into information**

**FREE RESPONSE QUESTIONS**

- Name the four kinds of useful information that you can get about a set of measurement data once it has been organized and summarized.

**Answer: 1) the center; 2) the variability; 3) the shape; and 4) unusual values (outliers).**

__For Questions 2-3, use the following narrative__

Narrative: Quiz scores

Bob has taken 6 quizzes so far in his statistics class. Each quiz is worth a possible 10 points. Bob’s scores are the following: 10, 8, 9, 7, 2, and 9.

- {Quiz scores narrative} Find the three measures of center for Bob’s quiz scores.

**Answer: The mean (average) is 7.5; the mode is 9; the median is 8.5.**
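
The three measures above can be checked with a short sketch, assuming Python's standard `statistics` module; the variable names are illustrative only.

```python
from statistics import mean, median, mode

scores = [10, 8, 9, 7, 2, 9]  # Bob's six quiz scores from the narrative

quiz_mean = mean(scores)      # total of 45 points divided by 6 quizzes
quiz_median = median(scores)  # average of the two middle values of the sorted scores
quiz_mode = mode(scores)      # the score that occurs most often

print(quiz_mean, quiz_median, quiz_mode)
```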

- {Quiz scores narrative} Explain how (if) Bob’s lowest quiz score affects the mean and the median of this data set.

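**Answer: The 2 pulls the mean (average) down. It does not affect the median, which depends only on the middle of the ordered scores: any lowest score of 8 or less gives the same median of 8.5.**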
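
One way to see this is to swap the lowest score for a less extreme one and compare. The sketch below assumes a hypothetical alternative in which Bob scored a 7 instead of a 2; that replacement value is not part of the narrative.

```python
from statistics import mean, median

scores = [10, 8, 9, 7, 2, 9]    # Bob's actual scores
adjusted = [10, 8, 9, 7, 7, 9]  # hypothetical: the 2 replaced by a 7

# The mean shifts because every value contributes to the total.
mean_change = mean(adjusted) - mean(scores)

# The median stays put because the two middle values (8 and 9) are unchanged.
median_change = median(adjusted) - median(scores)

print(mean_change, median_change)
```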

- Name two questions that can be answered by determining the shape of a data set.

**Answer: Any reasonable answers are acceptable. Examples: 1) Are most of the values clumped in the middle? 2) Are there two distinct groupings? 3) How many high and low values are there, compared to the number of values in the middle?**

**MULTIPLE CHOICE QUESTIONS**

- Which of the following measures of center is affected by an outlier?
- Mean
- Median
- Mode
- All of the above

**Answer: a**

- The mode is most meaningful for which type of data?
- Measurement data
- Categorical data
- Biased data
- None of the above

**Answer: b**

- The amount of spread in the data is a measure of what characteristic of a data set?
- Center
- Variability
- Shape
- None of the above

**Answer: b**

- What is the simplest measure of variability in a data set?
- The interquartile spread
- The outliers
- The range
- The standard deviation

**Answer: c**

**FILL-IN-THE-BLANK QUESTIONS**

- The __________ is a measure of center with half of the scores falling at or above it and half of the scores falling at or below it.

**Answer: median**

- One or two scores that are far removed from the rest of the data are called __________.

**Answer: outliers**

**SECTION 7.2**

**Picturing data: stemplots and histograms**

__For Questions 11-14, use the following narrative__

Narrative: School costs

Suppose a random sample of liberal arts schools was taken, and the average cost per student was measured for each school. The data are pictured in the histogram below:

**FREE RESPONSE QUESTIONS**

- {School costs narrative} What is the shape of this data set?

**Answer: Skewed to the right. The data also have an outlier on the high end.**

- {School costs narrative} How many schools were sampled in this study?

**Answer: 25**

- {School costs narrative} Two measures of center were calculated for this data set and were found to be $26,668 and $34,832. One of them is the mean, and the other is the median. Which one is which, and how do you know that?

**Answer: $26,668 is the median and $34,832 is the mean. Why: the data are skewed to the right and have an outlier, both of which pull the mean up while leaving the median largely unaffected.**

- {School costs narrative} Describe the four important characteristics of this data set (for example, the shape) using words that a parent exploring the cost of liberal arts colleges would find useful.

**Answer: Center: The average cost is $34,832 per student per year, and the median cost is $26,668 per student per year; Variability: Most of the schools cost between $15,000 and $75,000 per year per student; Shape: The data are skewed to the right, meaning there is more variability in cost among the pricier schools; Outliers: One school is an outlier, costing over $100,000 per year per student.**

**MULTIPLE CHOICE QUESTIONS**

- Which of the following pictures of a data set allows you to retrieve the actual data (assuming no digits are dropped)?
- A histogram
- A stemplot
- Both a) and b)
- Neither a) nor b)

**Answer: b**

- If the bars of a histogram represent the proportion of the total count that falls into each interval, what must the heights of the bars sum to?
- The total number of numbers in the data set.
- One.
- 1 divided by the total number of intervals used in the histogram.
- Not enough information to tell.

**Answer: b**
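
The reason the heights sum to one can be sketched in a few lines. The data values and the interval width of 10 below are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical measurements and interval width (not from the test bank)
values = [3, 7, 12, 15, 18, 21, 24, 29, 33, 48]
width = 10

# Count how many values fall into each interval [start, start + width)
counts = Counter((v // width) * width for v in values)

# Bar height = count in the interval divided by the total count
proportions = {start: n / len(values) for start, n in sorted(counts.items())}

for start, p in proportions.items():
    print(f"[{start}, {start + width}): {p:.2f}")

# Every value lands in exactly one interval, so the counts add up to the
# total count and the proportions add up to 1 (up to floating-point rounding).
total = sum(proportions.values())
```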

- Which of the following statements is true?
- If a data set is skewed to the right, that means there is bias in the results; the data are higher than they should be.
- If a data set is skewed to the right, then the higher values are more spread out than the lower values.
- If a data set is skewed to the right, then the lower values are more spread out than the higher values.
- None of the above.

**Answer: b**

- Which of the following statements regarding stemplots is __false__?
- A stemplot allows you to retrieve the original data (assuming no digits are dropped).
- A stemplot can never reuse the same stem digit twice.
- If a certain value in your data set is repeated three times, it must appear three times on the appropriate stem of the stemplot.
- None of the above statements are false.

**Answer: b**

**FILL-IN-THE-BLANK QUESTIONS**

- A __________ is a quick and easy way to put a list of numbers into order while getting a picture of their shape.

**Answer: stemplot**
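
To illustrate how a stemplot orders the data while showing its shape, here is a minimal sketch. It assumes non-negative whole numbers, with the tens as the stem and the units digit as the leaf; the function name `stemplot` and the sample values are hypothetical.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot rows: tens as the stem, units digit as the leaf."""
    leaves = defaultdict(list)
    for v in sorted(values):  # sorting puts the leaves on each stem in order
        leaves[v // 10].append(v % 10)
    rows = []
    # Include empty stems so that gaps in the data stay visible
    for stem in range(min(leaves), max(leaves) + 1):
        row = "".join(str(leaf) for leaf in leaves.get(stem, []))
        rows.append(f"{stem} | {row}".rstrip())
    return rows

# Hypothetical data; 64 and 75 each occur twice, so each appears twice as a leaf
values = [75, 92, 60, 83, 64, 32, 78, 64, 98, 62, 85, 75]

for row in stemplot(values):
    print(row)
```

Reading the stems and leaves back together recovers the original values, and the lone leaf on the lowest stem stands out from the rest.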

- A data set is __________ if the two halves of the data set (when cut down the middle) are mirror images of each other.

**Answer: symmetric**

**SECTION 7.3**

**Five useful numbers: a summary**

**FREE RESPONSE QUESTIONS**

- What are the five numbers used in a five-number summary?

**Answer: The lowest (minimum); the highest (maximum); the median; the lower quartile; and the upper quartile.**

__For Questions 22-24, use the following narrative__

Narrative: Acceptance rates

A random sample of 50 colleges and universities in the U.S. was selected, and acceptance rates were recorded for each school (percentage of student applicants who were accepted to the school). The following five-number summary was calculated for this data set: lowest = 17; lower quartile = 25.75; median = 36; upper quartile = 47.75; highest = 67.

- {Acceptance rates narrative} Describe the shape of this data based on the five-number summary.

**Answer: The data set looks fairly evenly distributed except for the last quarter of the data, which is more spread out than the rest, indicating skewness to the right.**
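
The spread of each quarter can be read off the five-number summary by taking the gaps between consecutive values. A brief sketch, with illustrative variable names:

```python
# lowest, lower quartile, median, upper quartile, highest (from the narrative)
summary = [17, 25.75, 36, 47.75, 67]

# Width of each quarter of the data: the gap between neighboring summary values
gaps = [high - low for low, high in zip(summary, summary[1:])]

print(gaps)
```

The first three quarters span between about 9 and 12 percentage points each, while the top quarter spans more than 19, which is what suggests the longer right tail.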

- {Acceptance rates narrative} Describe the center of this data set in words a prospective student would understand.

**Answer: The median (middle) of the acceptance rates is 36%. About half of the schools accept 36% or more of applicants, and about half accept 36% or less.**

- {Acceptance rates narrative} Find the range of the acceptance rates and give one possible reason that it is such a high number.

**Answer: Range = 67 − 17 = 50 percentage points. Explanation: any reasonable answer is acceptable. Examples: an outlier would affect the range; there are many variables affecting acceptance rates, including type of institution (liberal arts vs. four-year university, etc.); region of the U.S.; quality of education; etc.**

**MULTIPLE CHOICE QUESTIONS**

- Suppose that in a five-number summary you find that a larger gap exists between the third quartile and the highest value than between the lowest value and the first quartile. What does this mean about the shape of the data set?
- Symmetric
- Skewed right
- Skewed left
- Not enough information to tell.

**Answer: b**

- Suppose that in a five-number summary you find that a larger gap exists between the extremes and the quartiles than between the quartiles and the median. What does this mean about the shape of the data set?
- The data are clumped at the high and low ends.
- The data are clumped in the middle.
- The data are not symmetric.
- Not enough information to tell.

**Answer: b**

- Which of the following does __not__ require the data to be ordered before you can get the right answer?
- Mean
- Median
- Quartiles
- Range
- All of the above require the data to be ordered.

**Answer: a**

- Which of the following is __not__ included in the five-number summary?
- Mean
- Median
- Lower quartile
- Highest number
- All of the above are included in the five-number summary.

**Answer: a**

**FILL-IN-THE-BLANK QUESTIONS**

- The __________ are the medians of the two halves of an ordered data set.

**Answer: quartiles**
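
The "medians of the two halves" definition can be sketched as a small function. Conventions differ when the number of values is odd; this sketch assumes the overall median is then left out of both halves, and the name `five_number_summary` is illustrative. It is applied to Bob's quiz scores from Section 7.1.

```python
from statistics import median

def five_number_summary(values):
    """Return (lowest, lower quartile, median, upper quartile, highest)."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Assumption: with an odd count, the middle value belongs to neither half
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

summary = five_number_summary([10, 8, 9, 7, 2, 9])
print(summary)
```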

- A five-number summary involves the lower quartile, the upper quartile, the lowest number, the highest number, and the __________.

**Answer: median**

**SECTION 7.4**

**Boxplots**

**FREE RESPONSE QUESTIONS**

- Describe which numerical summaries (statistics) of a dataset a boxplot is based upon.

**Answer: The five-number summary (lowest number, lower quartile, median, upper quartile, and highest number).**

- Name two uses of a boxplot.

**Answer: 1) It is a visually appealing and useful way to present a five-number summary of a dataset; 2) it allows for easy comparison of the center and spread of data collected from two or more groups.**

__For Questions 33-34, use the following narrative__

Narrative: Liberal arts costs

A random sample of 25 liberal arts colleges in the U.S. was selected, and the average cost per student was recorded for each school. The following five-number summary was calculated for this data set: lowest = $17,554; lower quartile = $23,115; median = $26,668; upper quartile = $45,879; highest = $102,262.

- {Liberal arts costs narrative} Make a boxplot of this data set and use it to discuss the shape of the data.

**Answer: Boxplot should have a box around the upper and lower quartiles with a line down the middle for the median, and lines going out to the lowest and highest values, connecting with the box, all in the proper scale. Shape: skewed right.**

- {Liberal arts costs narrative} Using the definition of *outlier* discussed in your textbook, is there an outlier in this data set? Explain your answer.

**Answer: Yes. The highest amount, $102,262, is more than 1.5 times the IQR away from the upper end of the box.**
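
The arithmetic behind this answer can be sketched as follows, using the five-number summary from the narrative and the 1.5 × IQR rule; the variable names are illustrative.

```python
# Values from the Liberal arts costs narrative (dollars)
lowest, lower_q, upper_q, highest = 17554, 23115, 45879, 102262

iqr = upper_q - lower_q            # width of the box
upper_fence = upper_q + 1.5 * iqr  # values above this count as outliers
lower_fence = lower_q - 1.5 * iqr  # values below this count as outliers

print(iqr, upper_fence, lower_fence)
print(highest > upper_fence)  # is the highest cost an outlier?
print(lowest < lower_fence)   # is the lowest cost an outlier?
```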

**MULTIPLE CHOICE QUESTIONS**

- How do you calculate the interquartile range for a data set?
- Take the highest value minus the lowest value and divide it by four.
- Subtract the value of the lower quartile from the upper quartile.
- Subtract the value of the upper quartile from the lower quartile.
- Divide the data set into four equal parts and find the range between each of the resulting quarters.

**Answer: b**

- If the width of a box in a boxplot is very large, compared to the rest of the boxplot, what does that mean about the shape of the data set?
- The data are very spread out in the middle.
- The data are clumped tightly in the middle.
- The data are not symmetric.
- Not enough information to tell.

**Answer: a**

- Suppose you look at two boxplots comparing the weights of male cats vs. female cats, and you find that the box for the males is much wider than the box for the females. What does this mean about the data sets?
- Male cats weigh more than female cats overall.
- Male cats have more variability in their weights than female cats.
- Weights of male cats are more skewed than for female cats.
- None of the above.

**Answer: b**

- Which of the following can __not__ be obtained from a boxplot?
- Mean
- Median
- IQR
- Range
- All of the above can be obtained from a boxplot.

**Answer: a**

**FILL-IN-THE-BLANK QUESTIONS**

- The __________ is the distance between the lower and upper quartiles in a boxplot.

**Answer: interquartile range**

- A(n) __________ is any value that is more than 1.5 times the IQR from the closest end of the box in a boxplot.

**Answer: outlier**

**SECTION 7.5**

**Traditional measures: mean, variance, and standard deviation**

**FREE RESPONSE QUESTIONS**

__For Questions 41-42, use the following narrative__

Narrative: Create data

Suppose you can create your own data set by choosing from the numbers 1, 2, 3, 4, and 5. You can repeat a number as many times as you wish, as long as your final data set contains four numbers in it. Here are two examples of data sets you could create: {1, 2, 3, 4} or {1, 1, 5, 5}.

- {Create data narrative} Create a data set that has the lowest possible standard deviation.

**Answer: Any data set in which all four numbers are the same is acceptable. Example: 1, 1, 1, 1.**

- {Create data narrative} Create a data set that has mean 3 and standard deviation 0.

**Answer: 3, 3, 3, 3.**

- Which of the following two data sets has the larger standard deviation: Data Set A= {1, 1, 5, 5} or Data Set B= {1, 3, 3, 5}?

**Answer: Data Set A**
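
These standard deviation answers can be checked numerically. The sketch below assumes Python's `statistics` module and prints both the sample version (`stdev`, dividing by n − 1) and the population version (`pstdev`, dividing by n), since the comparison comes out the same either way.

```python
from statistics import pstdev, stdev

data_a = [1, 1, 5, 5]    # every value is 2 away from the mean of 3
data_b = [1, 3, 3, 5]    # two values sit exactly at the mean of 3
constant = [3, 3, 3, 3]  # no spread at all: mean 3, standard deviation 0

for name, data in [("A", data_a), ("B", data_b), ("constant", constant)]:
    print(name, round(stdev(data), 3), round(pstdev(data), 3))
```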

- Explain what is meant by the standard deviation in terms that a non-statistician would understand.

**Answer: Roughly, the average distance of the observed values from their mean.**

**MULTIPLE CHOICE QUESTIONS**

- What is the relationship between the variance and the standard deviation?
- The variance is the square root of the standard deviation.
- The variance is the square of the standard deviation.
- The variance is twice the standard deviation.
- There is no relationship between them.

**Answer: b**

- Suppose a data set is skewed left. What is the most likely relationship between the mean and the median?
- The mean is larger than the median.
- The mean is smaller than the median.
- The mean and the median are not related to each other at all.
- The mean and the median are essentially equal.

**Answer: b**

- Which of the following statements is false?
- If the standard deviation is positive, the mean must be positive.
- The standard deviation can be negative.
- If the mean is large, the standard deviation will be large also.
- All of the above are false.

**Answer: d**

- Which of the following is __not__ a measure of spread or variability in a data set?
- Standard deviation
- IQR
- Range
- All of the above are measures of spread or variability in a data set.

**Answer: d**

**FILL-IN-THE-BLANK QUESTIONS**

- Because the __________ can be distorted by high outliers, the center of a data set involving incomes or prices is usually summarized using the __________.

**Answers (respectively): mean, median**

- If the shape of a data set is __________ then the mean and the median should be about equal.

**Answer: symmetric**

**SECTION 7.6**

**Caution: being average isn’t normal**

**FREE RESPONSE QUESTIONS**

- What is wrong with the following statement: “Today’s temperature of 101 in Los Angeles was a record high for October, a whopping 17 degrees above normal for this date.”

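**Answer: The word ‘normal’ is being confused with the word ‘average.’ Normal temperatures for a date cover a range of values around the average, not just the average itself.**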

- If you just use an average to describe a set of measurements, is this enough? Explain your answer.

**Answer: No; you also need to address the amount of variability in the data.**

**MULTIPLE CHOICE QUESTIONS**

- Which of the following statements is statistically correct?
- “Jimmy is taller than normal for a two-year old.”
- “Jimmy is taller than the average two-year old.”
- “Jimmy is taller than the average height of two-year olds.”
- All of the above are statistically correct.

**Answer: c**

- Which of the following methods is the most appropriate one for ‘proving’ someone cheated on a multiple choice exam who was allegedly looking at someone else’s paper?
- Examine the two papers and see how many questions they both got wrong, and how many times the same wrong answer was chosen for those questions.
- Take the student’s paper that was allegedly copied from (student X) and compare it to all other students’ papers in the class. Take the number of answers that each student matched with student X and make a histogram. Then see where the alleged cheater fell on the resulting histogram.
- Neither of these methods is appropriate. There is always a chance that two people could have the same answers but no cheating was going on.
- Both of these methods are equivalent, so either one is appropriate.

**Answer: b**
