### 11.1 Facts About the Chi-Square Distribution

If the number of degrees of freedom for a chi-square distribution is 25, what are the population mean and standard deviation?

If *df* > 90, the distribution is _____________. If *df* = 15, the distribution is ________________.

When does the chi-square curve approximate a normal distribution?

Where is *μ* located on a chi-square curve?

Is it more likely the *df* is 90, 20, or 2 in the graph?

### 11.2 Goodness-of-Fit Test

*Determine the appropriate test to be used in the next three exercises.*

An archeologist is calculating the distribution of the frequency of the number of artifacts she finds in a dig site. Based on previous digs, the archeologist creates an expected distribution broken down by grid sections in the dig site. Once the site has been fully excavated, she compares the actual number of artifacts found in each grid section to see if her expectation was accurate.

An economist is deriving a model to predict outcomes on the stock market. He creates a list of expected points on the stock market index for the next two weeks. At the close of each day’s trading, he records the actual points on the index. He wants to see how well his model matched what actually happened.

A personal trainer is putting together a weight-lifting program for her clients. For a 90-day program, she expects each client to lift a specific maximum weight each week. As she goes along, she records the actual maximum weights her clients lifted. She wants to know how well her expectations met with what was observed.

*Use the following information to answer the next five exercises:* A teacher predicts the distribution of grades on the final exam. The predictions are shown in Table 11.27.

Grade | Proportion |
---|---|
A | 0.25 |
B | 0.30 |
C | 0.35 |
D | 0.10 |

The actual distribution for a class of 20 is in Table 11.28.

Grade | Frequency |
---|---|
A | 7 |
B | 7 |
C | 5 |
D | 1 |

*df* = ______

State the null and alternative hypotheses.

*χ ^{2}* test statistic = ______

*p*-value = ______

At the 5 percent significance level, what can you conclude?
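A short script can be used to check the hand calculation for the grade data. The sketch below is a minimal example in Python using only the standard library; it assumes the proportions in Table 11.27 and the frequencies in Table 11.28, and the function name `goodness_of_fit` is illustrative rather than something defined in the text. One expected count falls below five, so the chi-square approximation may be rough for this small class.

```python
def goodness_of_fit(observed, proportions):
    """Return (expected counts, chi-square statistic, degrees of freedom)."""
    n = sum(observed)
    # Expected count in each category = sample size * hypothesized proportion.
    expected = [n * p for p in proportions]
    # Sum of (observed - expected)^2 / expected over all categories.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Degrees of freedom = number of categories - 1.
    df = len(observed) - 1
    return expected, statistic, df


observed = [7, 7, 5, 1]                  # grades A, B, C, D (Table 11.28)
proportions = [0.25, 0.30, 0.35, 0.10]   # predicted proportions (Table 11.27)

expected, statistic, df = goodness_of_fit(observed, proportions)
print("expected counts:", expected)
print("chi-square statistic:", round(statistic, 4))
print("df:", df)
```

The *p*-value is the area to the right of the statistic under the chi-square curve with the printed *df*; it can be read from a chi-square table or a calculator.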

*Use the following information to answer the next nine exercises:* The cumulative number of cases of a chronic disease reported for Santa Clara County is broken down by ethnicity as in Table 11.29.

Ethnicity | Number of Cases |
---|---|
White | 2,229 |
Hispanic | 1,157 |
Black/African American | 457 |
Asian, Pacific Islander | 232 |
Total | 4,075 |

The percentage of each ethnic group in Santa Clara County is as in Table 11.30.

Ethnicity | % of Total County Population | Number Expected (round to two decimal places) |
---|---|---|
White | 42.9% | 1,748.18 |
Hispanic | 26.7% | |
Black/African American | 2.6% | |
Asian, Pacific Islander | 27.8% | |
Total | 100% | |

If the ethnicities of patients followed the ethnicities of the total county population, fill in the expected number of cases per ethnic group.
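Each expected count is the total number of cases multiplied by that group's share of the county population. The sketch below is one way to fill in the column; it assumes the 4,075 total from Table 11.29 and the percentages in Table 11.30, and it uses `Decimal` with half-up rounding, which is an assumption chosen so that the White entry reproduces the 1,748.18 already shown in the table.

```python
from decimal import Decimal, ROUND_HALF_UP

total_cases = Decimal("4075")  # total from Table 11.29

# County population shares from Table 11.30, written as proportions.
population_share = {
    "White": Decimal("0.429"),
    "Hispanic": Decimal("0.267"),
    "Black/African American": Decimal("0.026"),
    "Asian, Pacific Islander": Decimal("0.278"),
}

# Expected cases = total cases * population share, rounded to two decimals.
expected_cases = {
    group: (total_cases * share).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    for group, share in population_share.items()
}

for group, count in expected_cases.items():
    print(f"{group}: {count}")
```

Because each entry is rounded separately, the rounded expected counts need not add up to exactly 4,075.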

*Perform a goodness-of-fit test to determine whether the occurrence of disease cases follows the ethnicities of the general population of Santa Clara County.*

*H _{0}*: _______

*H _{a}*: _______

Is this a right-tailed, left-tailed, or two-tailed test?

degrees of freedom = _______

*χ ^{2}* test statistic = _______

*p*-value = _______

Graph the situation. Label and scale the horizontal axis. Mark the mean and test statistic. Shade in the region corresponding to the *p*-value.

Let *α* = 0.05.

Decision: ________________

Reason for the decision: ________________

Conclusion (write out in complete sentences): ________________

Does it appear that the pattern of disease cases in Santa Clara County corresponds to the distribution of ethnic groups in this county? Why or why not?

### 11.3 Test of Independence

*Determine the appropriate test to be used in the next three exercises.*

A pharmaceutical company is interested in the relationship between age and presentation of symptoms for a common viral infection. A random sample is taken of 500 people with the infection across different age groups.

The owner of a baseball team is interested in the relationship between player salaries and team winning percentage. He takes a random sample of 100 players from different organizations.

A marathon runner is interested in the relationship between the brand of shoes runners wear and their run times. She takes a random sample of 50 runners and records their run times and the brand of shoes they were wearing.

*Use the following information to answer the next seven exercises:* Transit Railroads is interested in the relationship between travel distance and the ticket class purchased. A random sample of 200 passengers is taken. Table 11.31 shows the results. The railroad wants to know if a passenger’s choice in ticket class is independent of the distance the passenger must travel.

Traveling Distance | Third Class | Second Class | First Class | Total |
---|---|---|---|---|
1–100 miles | 21 | 14 | 6 | 41 |
101–200 miles | 18 | 16 | 8 | 42 |
201–300 miles | 16 | 17 | 15 | 48 |
301–400 miles | 12 | 14 | 21 | 47 |
401–500 miles | 6 | 6 | 10 | 22 |
Total | 73 | 67 | 60 | 200 |

State the hypotheses.

*H _{0}*: _______

*H _{a}*: _______

*df* = _______

How many passengers are expected to travel between 201 and 300 miles and purchase second-class tickets?

How many passengers are expected to travel between 401 and 500 miles and purchase first-class tickets?

What is the test statistic?

What is the *p*-value?

What can you conclude at the 5 percent level of significance?
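In a test of independence, each expected count is (row total × column total) / grand total, and the degrees of freedom are (rows − 1)(columns − 1). The sketch below applies this to the counts in Table 11.31 using only the Python standard library. The function names are illustrative, and the *p*-value helper uses a closed form that holds only when *df* is even, which happens to be the case for a 5 × 3 table; it is not a general-purpose routine.

```python
import math


def independence_test(table):
    """Return (expected counts, chi-square statistic, df) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count = row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, statistic, df


def chi2_upper_tail_even_df(x, df):
    """Area to the right of x under a chi-square curve; valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut needs a positive even df")
    half = x / 2
    term = total = 1.0
    # exp(-x/2) * sum of (x/2)^k / k! for k = 0 .. df/2 - 1
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Rows: 1-100, 101-200, 201-300, 301-400, 401-500 miles.
# Columns: third class, second class, first class (Table 11.31).
table = [
    [21, 14, 6],
    [18, 16, 8],
    [16, 17, 15],
    [12, 14, 21],
    [6, 6, 10],
]

expected, statistic, df = independence_test(table)
p_value = chi2_upper_tail_even_df(statistic, df)
print("expected, 201-300 miles, second class:", round(expected[2][1], 2))
print("expected, 401-500 miles, first class:", round(expected[4][2], 2))
print("chi-square statistic:", round(statistic, 2), "df:", df)
print("p-value:", round(p_value, 4))
```

Comparing the printed *p*-value with *α* = 0.05 gives the decision for the last exercise in this group.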

*Use the following information to answer the next ten exercises:* An article in the *New England Journal of Medicine* discussed a study on people who used a certain product in California and Hawaii. In one part of the report, the self-reported ethnicity and product-use levels per day were given. Of the people using the product at most 10 times per day, there were 9,886 African Americans, 2,745 Native Hawaiians, 12,831 Latinos, 8,378 Japanese Americans, and 7,650 whites. Of the people using the product 11 to 20 times per day, there were 6,514 African Americans, 3,062 Native Hawaiians, 4,932 Latinos, 10,680 Japanese Americans, and 9,877 whites. Of the people using the product 21 to 30 times per day, there were 1,671 African Americans, 1,419 Native Hawaiians, 1,406 Latinos, 4,715 Japanese Americans, and 6,062 whites. Of the people using the product at least 31 times per day, there were 759 African Americans, 788 Native Hawaiians, 800 Latinos, 2,305 Japanese Americans, and 3,970 whites.

Complete the table.

Product Use per Day | African American | Native Hawaiian | Latino | Japanese American | White | TOTALS |
---|---|---|---|---|---|---|
1–10 | | | | | | |
11–20 | | | | | | |
21–30 | | | | | | |
31+ | | | | | | |
TOTALS | | | | | | |

State the hypotheses.

*H _{0}*: _______

*H _{a}*: _______

Enter expected values in Table 11.32. Round to two decimal places.

Calculate the following values:

*df* = _______

*χ ^{2}* test statistic = ______

*p*-value = ______

Is this a right-tailed, left-tailed, or two-tailed test? Explain why.

Graph the situation. Label and scale the horizontal axis. Mark the mean and test statistic. Shade in the region corresponding to the *p*-value.

State the decision and conclusion (in a complete sentence) for the following preconceived levels of *α*:

*α* = 0.05

- Decision: ___________________
- Reason for the decision: ___________________
- Conclusion (write out in a complete sentence): ___________________

*α* = 0.01

- Decision: ___________________
- Reason for the decision: ___________________
- Conclusion (write out in a complete sentence): ___________________

### 11.4 Test for Homogeneity

A math teacher wants to see if two of her classes have the same distribution of test scores. What test should she use?

A market researcher wants to see if two different stores have the same distribution of sales throughout the year. What type of test should he use?

A meteorologist wants to know if East and West Australia have the same distribution of storms. What type of test should she use?

What condition must be met to use the test for homogeneity?

*Use the following information to answer the next five exercises:* Do private practice doctors and hospital doctors have the same distribution of working hours? Suppose that a sample of 100 private practice doctors and 150 hospital doctors are selected at random and asked about the number of hours a week they work. The results are shown in Table 11.33.

Type of Practice | 20–30 | 30–40 | 40–50 | 50–60 |
---|---|---|---|---|
Private Practice | 16 | 40 | 38 | 6 |
Hospital | 8 | 44 | 59 | 39 |

State the null and alternative hypotheses.

*df* = _______

What is the test statistic?

What is the *p*-value?

What can you conclude at the 5 percent significance level?
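The homogeneity statistic is computed the same way as for a test of independence, with *df* equal to the number of columns minus one when two populations are compared. The sketch below works through the counts in Table 11.33 and also reports the smallest expected count, so the usual rule of thumb that every expected count be at least five can be checked. The variable names are illustrative and not from the text.

```python
# Observed counts by weekly hours: 20-30, 30-40, 40-50, 50-60 (Table 11.33).
private_practice = [16, 40, 38, 6]
hospital = [8, 44, 59, 39]

column_totals = [a + b for a, b in zip(private_practice, hospital)]
grand_total = sum(column_totals)

statistic = 0.0
smallest_expected = float("inf")
for group in (private_practice, hospital):
    group_total = sum(group)
    for observed, column_total in zip(group, column_totals):
        # Expected count if both groups shared one distribution of hours.
        expected = group_total * column_total / grand_total
        smallest_expected = min(smallest_expected, expected)
        statistic += (observed - expected) ** 2 / expected

# Two populations: df = number of columns - 1.
df = len(column_totals) - 1

print("smallest expected count:", round(smallest_expected, 2))
print("chi-square statistic:", round(statistic, 4))
print("df:", df)
```

The *p*-value is the right-tail area beyond the printed statistic for the printed *df*, which can be looked up in a chi-square table or found with a calculator.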

### 11.5 Comparison of the Chi-Square Tests

Which test do you use to decide whether an observed distribution is the same as an expected distribution?

Which test would you use to decide whether two factors have a relationship?

Which test would you use to decide if two populations have the same distribution?

How are tests of independence similar to tests for homogeneity?

How are tests of independence different from tests for homogeneity?

### 11.6 Test of a Single Variance

*Use the following information to answer the next three exercises:* An archer’s standard deviation for his hits is six, where the data are measured in distance from the center of the target. An observer claims the standard deviation is less than six.

What type of test should be used?

State the null and alternative hypotheses.

Is this a right-tailed, left-tailed, or two-tailed test?

*Use the following information to answer the next three exercises:* The standard deviation of heights for students in a school is 0.81. A random sample of 50 students is taken, and the standard deviation of heights of the sample is 0.96. A researcher in charge of the study believes the standard deviation of heights for the school is greater than 0.81.

What type of test should be used?

State the null and alternative hypotheses.

*df* = ________

*Use the following information to answer the next four exercises:* The average waiting time in a doctor’s office varies. The standard deviation of waiting times in a doctor’s office is 3.4 minutes. A random sample of 30 patients in the doctor’s office has a standard deviation of waiting times of 4.1 minutes. One doctor believes the variance of waiting times is greater than originally thought.

What type of test should be used?

What is the test statistic?

What is the *p*-value?

What can you conclude at the 5 percent significance level?
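For a test of a single variance, the statistic is (*n* − 1)*s*² / *σ*² with *df* = *n* − 1, where *s* is the sample standard deviation and *σ* is the standard deviation stated in the null hypothesis. The sketch below applies this to the waiting-time data; the function and parameter names are illustrative and not from the text.

```python
def single_variance_statistic(n, sample_sd, claimed_sd):
    """Return (chi-square statistic, df) for a test of a single variance."""
    df = n - 1
    # (n - 1) * s^2 / sigma^2, with sigma taken from the null hypothesis.
    statistic = df * sample_sd ** 2 / claimed_sd ** 2
    return statistic, df


# Waiting times: n = 30, s = 4.1 minutes, claimed sigma = 3.4 minutes.
statistic, df = single_variance_statistic(n=30, sample_sd=4.1, claimed_sd=3.4)
print("chi-square statistic:", round(statistic, 2))
print("df:", df)
```

Because the doctor believes the variance is greater than stated, the test is right-tailed, so the *p*-value is the area to the right of the printed statistic.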