Introduction
 The two independent samples are simple random samples from two distinct populations.
 For the two distinct populations
 if the sample sizes are small, the distributions are important (should be normal), and
 if the sample sizes are large, the distributions are not important (need not be normal).
The comparison of two population means is very common. A difference between the two samples depends on both the means and the standard deviations. Very different means can occur by chance if there is great variation among the individual samples. To account for the variation, we take the difference of the sample means, ${\overline{X}}_{1}-{\overline{X}}_{2}$, and divide by the standard error to standardize the difference. The result is a t-score test statistic.
Because we do not know the population standard deviations, we estimate them using the two sample standard deviations from our independent samples. For the hypothesis test, we calculate the estimated standard deviation, or standard error, of the difference in sample means, ${\overline{X}}_{1}-{\overline{X}}_{2}\text{.}$
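The standard error is calculated as follows:
$\sqrt{\frac{{({s}_{1})}^{2}}{{n}_{1}}+\frac{{({s}_{2})}^{2}}{{n}_{2}}}$
The test statistic (t-score) is calculated as follows:
$t=\frac{({\overline{x}}_{1}-{\overline{x}}_{2})-({\mu}_{1}-{\mu}_{2})}{\sqrt{\frac{{({s}_{1})}^{2}}{{n}_{1}}+\frac{{({s}_{2})}^{2}}{{n}_{2}}}}$
where
 s_{1} and s_{2}, the sample standard deviations, are estimates of σ_{1} and σ_{2}, respectively,
 σ_{1} and σ_{2} are the unknown population standard deviations,
 ${\overline{x}}_{1}$ and ${\overline{x}}_{2}$ are the sample means, and
 μ_{1} and μ_{2} are the population means.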
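The number of degrees of freedom (df) requires a somewhat complicated calculation. However, a computer or calculator calculates it easily. The df are not always a whole number. The test statistic calculated previously is approximated by the Student’s t-distribution with df as follows:
Degrees of freedom
$df=\frac{{\left(\frac{{({s}_{1})}^{2}}{{n}_{1}}+\frac{{({s}_{2})}^{2}}{{n}_{2}}\right)}^{2}}{\left(\frac{1}{{n}_{1}-1}\right){\left(\frac{{({s}_{1})}^{2}}{{n}_{1}}\right)}^{2}+\left(\frac{1}{{n}_{2}-1}\right){\left(\frac{{({s}_{2})}^{2}}{{n}_{2}}\right)}^{2}}$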
When both sample sizes n_{1} and n_{2} are five or larger, the Student’s t approximation is very good. Notice that the sample variances (s_{1})^{2} and (s_{2})^{2} are not pooled. (If the question comes up, do not pool the variances.)
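The standard error, t-score, and degrees of freedom described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `welch_statistics`, its parameter names, and the summary statistics passed to it are hypothetical and chosen only for demonstration.

```python
import math


def welch_statistics(mean1, sd1, n1, mean2, sd2, n2, hypothesized_diff=0.0):
    """Return (standard_error, t_score, df) for two independent samples.

    The variances are not pooled. hypothesized_diff is the value of
    mu1 - mu2 under the null hypothesis (0 when testing for equal means).
    """
    v1 = sd1 ** 2 / n1  # (s1)^2 / n1
    v2 = sd2 ** 2 / n2  # (s2)^2 / n2
    standard_error = math.sqrt(v1 + v2)
    t_score = ((mean1 - mean2) - hypothesized_diff) / standard_error
    # Degrees of freedom for unpooled variances; usually not a whole number.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return standard_error, t_score, df


# Hypothetical summary statistics, chosen only for illustration.
se, t, df = welch_statistics(10.0, 2.0, 12, 8.5, 3.0, 15)
print(f"standard error = {se:.4f}, t = {t:.3f}, df = {df:.2f}")
```

Because the function takes summary statistics rather than raw data, it can be applied directly to a table of sample sizes, sample means, and sample standard deviations.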
Example 10.1 Independent groups
The average amount of time boys and girls aged 7 to 11 spend playing sports each day is believed to be the same. A study is done and data are collected, resulting in the data in Table 10.1. Each population has a normal distribution.
Group  Sample Size  Average Number of Hours Playing Sports per Day  Sample Standard Deviation  
Girls  9  2  0.866 
Boys  16  3.2  1.00 
Is there a difference in the mean amount of time boys and girls aged 7 to 11 play sports each day? Test at the 5 percent level of significance.
The population standard deviations are not known. Let g be the subscript for girls and b be the subscript for boys. Then, μ_{g} is the population mean for girls and μ_{b} is the population mean for boys. This is a test of two independent groups, two population means.
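Random variable: ${\overline{X}}_{g}-{\overline{X}}_{b}$ = difference in the sample mean amount of time girls and boys play sports each day.
H_{0}: μ_{g} = μ_{b}
H_{a}: μ_{g} ≠ μ_{b}
The question asks only whether the means differ, so this is a two-tailed test.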
Distribution for the test: Use t_{df} where df is calculated using the df formula for independent groups, two population means. Using a calculator, df is approximately 18.8462. Do not pool the variances.
Calculate the p-value using a Student’s t-distribution: p-value = 0.0054
Graph:
Make a decision: Since α > p-value (0.05 > 0.0054), reject H_{0}. This means you reject μ_{g} = μ_{b}. The means are different.
Using the TI-83, 83+, 84, 84+ Calculator
Press STAT. Arrow over to TESTS and press 4:2-SampTTest. Arrow over to Stats and press ENTER. Arrow down and enter 2 for the first sample mean, 0.866 for Sx1, 9 for n1, 3.2 for the second sample mean, 1 for Sx2, and 16 for n2. Arrow down to μ1: and arrow to does not equal μ2. Press ENTER. Arrow down to Pooled: and No. Press ENTER. Arrow down to Calculate and press ENTER. The p-value is p = 0.0054, the df is approximately 18.8462, and the test statistic is –3.14. Do the procedure again, but instead of Calculate do Draw.
Conclusion: At the 5 percent level of significance, the sample data show there is sufficient evidence to conclude that the mean number of hours that girls and boys aged 7 to 11 play sports per day is different (mean number of hours boys aged 7 to 11 play sports per day is greater than the mean number of hours played by girls OR the mean number of hours girls aged 7 to 11 play sports per day is greater than the mean number of hours played by boys).
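The steps of Example 10.1 can also be sketched in Python. This is an illustrative sketch under stated assumptions: the helper names `t_pdf` and `t_upper_tail` are hypothetical, and the tail area is approximated by Simpson's rule using only the standard library. A statistics package would normally supply the t-distribution directly.

```python
import math


def t_pdf(x, df):
    """Density of the Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, panels=2000):
    """Approximate P(T > t) for t >= 0 with Simpson's rule (panels must be even)."""
    h = t / panels
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3  # half the area lies to the right of 0


# Summary statistics from Table 10.1 (girls first, boys second).
mean_g, sd_g, n_g = 2.0, 0.866, 9
mean_b, sd_b, n_b = 3.2, 1.00, 16

v_g, v_b = sd_g ** 2 / n_g, sd_b ** 2 / n_b
t_score = (mean_g - mean_b) / math.sqrt(v_g + v_b)
df = (v_g + v_b) ** 2 / (v_g ** 2 / (n_g - 1) + v_b ** 2 / (n_b - 1))

# Two-tailed test: double the area beyond |t|.
p_value = 2 * t_upper_tail(abs(t_score), df)
alpha = 0.05
print(f"t = {t_score:.2f}, df = {df:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

The variances are left unpooled, matching the calculator setting Pooled: No.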
Two samples are shown in Table 10.2. Both have normal distributions. The means for the two populations are thought to be the same. Is there a difference in the means? Test at the 5 percent level of significance.
Population  Sample Size  Sample Mean  Sample Standard Deviation  
Population A  25  5  1 
Population B  16  4.7  1.2 
NOTE
When the sum of the sample sizes is larger than 30 (n_{1} + n_{2} > 30), you can use the normal distribution to approximate the Student’s t.
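As a rough illustration of this note, the sketch below applies the normal approximation to the summary statistics in Table 10.2, where n_{1} + n_{2} = 41. It assumes a two-tailed test and uses `NormalDist` from Python's standard `statistics` module. The variable names are hypothetical.

```python
import math
from statistics import NormalDist

# Summary statistics from Table 10.2 (n1 + n2 = 41 > 30).
mean_a, sd_a, n_a = 5.0, 1.0, 25
mean_b, sd_b, n_b = 4.7, 1.2, 16

standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
test_statistic = (mean_a - mean_b) / standard_error

# Two-tailed p-value, treating the test statistic as standard normal.
p_normal = 2 * (1 - NormalDist().cdf(abs(test_statistic)))
print(f"test statistic = {test_statistic:.3f}, approximate p-value = {p_normal:.4f}")
```

The result is only an approximation to the Student's t p-value. The two get closer as the sample sizes grow.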
Example 10.2
A study is done by a community group in two neighboring colleges to determine which one graduates students with more math classes. College A samples 11 graduates. Their average is 4 math classes with a standard deviation of 1.5 math classes. College B samples nine graduates. Their average is 3.5 math classes with a standard deviation of 1 math class. The community group believes that a student who graduates from College A has taken more math classes, on average. Both populations have a normal distribution. Test at a 1 percent significance level. Answer the following questions:
a. Is this a test of two means or two proportions?
a. two means
b. Are the population standard deviations known or unknown?
b. unknown
c. Which distribution do you use to perform the test?
c. Student’s t
d. What is the random variable?
d. ${\overline{X}}_{A}-{\overline{X}}_{B}$
e. What are the null and alternate hypotheses? Write the null and alternate hypotheses in symbols.
e.
${H}_{0}:{\mu}_{A}\le {\mu}_{B}$
${H}_{a}:{\mu}_{A}>{\mu}_{B}$
f. Is this test right, left, or two-tailed?
f. right
g. What is the p-value?
g. 0.1928
h. Do you reject or not reject the null hypothesis?
h. do not reject
i. Conclusion:
i. At the 1 percent level of significance, from the sample data, there is not sufficient evidence to conclude that a student who graduates from College A has taken more math classes, on average, than a student who graduates from College B.
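For reference, the quantities behind the p-value in Example 10.2 can be worked out from the formulas for the standard error, test statistic, and degrees of freedom. The intermediate values below are rounded, so treat them as an approximate check.

```latex
\begin{align*}
\text{standard error} &= \sqrt{\frac{(1.5)^2}{11} + \frac{(1)^2}{9}}
  \approx \sqrt{0.2045 + 0.1111} \approx 0.5618 \\
t &= \frac{(4 - 3.5) - 0}{0.5618} \approx 0.890 \\
df &= \frac{(0.2045 + 0.1111)^2}{\dfrac{(0.2045)^2}{10} + \dfrac{(0.1111)^2}{8}} \approx 17.4
\end{align*}
```

The p-value of 0.1928 is the area to the right of t ≈ 0.890 under a Student’s t-distribution with about 17.4 degrees of freedom. It is greater than α = 0.01, so the null hypothesis is not rejected.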
A study is done to determine if Company A retains its workers longer than Company B. Company A samples 15 workers, and their average time with the company is 5 years with a standard deviation of 1.2. Company B samples 20 workers, and their average time with the company is 4.5 years with a standard deviation of 0.8. The populations are normally distributed.
 Are the population standard deviations known?
 Conduct an appropriate hypothesis test. At the 5 percent significance level, what is your conclusion?
Example 10.3
A professor at a large community college wanted to determine whether there is a difference in the means of final exam scores between students who took his statistics course online and the students who took his face-to-face statistics class. He believed that the mean of the final exam scores for the online class would be lower than that of the face-to-face class. Was the professor correct? The randomly selected 30 final exam scores from each group are listed in Table 10.3 and Table 10.4.
Table 10.3 Online class
67.6  41.2  85.3  55.9  82.4  91.2  73.5  94.1  64.7  64.7 
70.6  38.2  61.8  88.2  70.6  58.8  91.2  73.5  82.4  35.5 
94.1  88.2  64.7  55.9  88.2  97.1  85.3  61.8  79.4  79.4 
Table 10.4 Face-to-face class
77.9  95.3  81.2  74.1  98.8  88.2  85.9  92.9  87.1  88.2 
69.4  57.6  69.4  67.1  97.6  85.9  88.2  91.8  78.8  71.8 
98.8  61.2  92.9  90.6  97.6  100  95.3  83.5  92.9  89.4 
Is the mean of the final exam scores of the online class lower than the mean of the final exam scores of the face-to-face class? Test at a 5 percent significance level. Answer the following questions:
 Is this a test of two means or two proportions?
 Are the population standard deviations known or unknown?
 Which distribution do you use to perform the test?
 What is the random variable?
 What are the null and alternative hypotheses? Write the null and alternative hypotheses in words and in symbols.
 Is this test right, left, or two-tailed?
 What is the p-value?
 Do you reject or not reject the null hypothesis?
 At the ______ level of significance, from the sample data, there ______ (is/is not) sufficient evidence to conclude that ______.
(See the conclusion in Example 10.2, and write yours in a similar fashion.)
Using the TI-83, 83+, 84, 84+ Calculator
First put the data for each group into two lists (such as L1 and L2). Press STAT. Arrow over to TESTS and press 4:2-SampTTest. Make sure Data is highlighted and press ENTER. Arrow down and enter L1 for the first list and L2 for the second list. Arrow down to μ_{1}: and arrow to < μ_{2} (less than), because the professor's claim that the online mean is lower makes this a left-tailed test. Press ENTER. Arrow down to Pooled: No. Press ENTER. Arrow down to Calculate and press ENTER.
Note
Be careful not to mix up the information for Group 1 and Group 2!
 two means
 unknown
 Student’s t
 ${\overline{X}}_{1}-{\overline{X}}_{2}$

 H_{0}: μ_{1} = μ_{2} Null hypothesis: The means of the final exam scores are equal for the online and face-to-face statistics classes.
 H_{a}: μ_{1} < μ_{2} Alternative hypothesis: The mean of the final exam scores of the online class is less than the mean of the final exam scores of the face-to-face class.
 left-tailed
 p-value = 0.0011
 Reject the null hypothesis
 The professor was correct. The evidence shows that the mean of the final exam scores for the online class is lower than that of the face-to-face class.
At the 5 percent level of significance, from the sample data, there is sufficient evidence to conclude that the mean of the final exam scores for the online class is less than the mean of final exam scores of the face-to-face class.
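The left-tailed test in Example 10.3 can be sketched from the raw scores in Python. The sketch assumes the first three rows of scores are the online class (group 1) and the last three rows are the face-to-face class (group 2). The helper names `t_pdf` and `t_left_tail` are hypothetical, and the tail area is approximated by Simpson's rule rather than taken from a statistics library.

```python
import math
from statistics import mean, stdev

online = [
    67.6, 41.2, 85.3, 55.9, 82.4, 91.2, 73.5, 94.1, 64.7, 64.7,
    70.6, 38.2, 61.8, 88.2, 70.6, 58.8, 91.2, 73.5, 82.4, 35.5,
    94.1, 88.2, 64.7, 55.9, 88.2, 97.1, 85.3, 61.8, 79.4, 79.4,
]
face_to_face = [
    77.9, 95.3, 81.2, 74.1, 98.8, 88.2, 85.9, 92.9, 87.1, 88.2,
    69.4, 57.6, 69.4, 67.1, 97.6, 85.9, 88.2, 91.8, 78.8, 71.8,
    98.8, 61.2, 92.9, 90.6, 97.6, 100, 95.3, 83.5, 92.9, 89.4,
]


def t_pdf(x, df):
    """Density of the Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_left_tail(t, df, panels=2000):
    """Approximate P(T < t) by Simpson's rule on [0, |t|] (panels must be even)."""
    x = abs(t)
    h = x / panels
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_x = total * h / 3
    return 0.5 - area_0_to_x if t < 0 else 0.5 + area_0_to_x


n1, n2 = len(online), len(face_to_face)
v1 = stdev(online) ** 2 / n1        # sample variance over n, not pooled
v2 = stdev(face_to_face) ** 2 / n2
t_score = (mean(online) - mean(face_to_face)) / math.sqrt(v1 + v2)
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# H_a: mu_1 < mu_2, so the p-value is the area to the left of t.
p_value = t_left_tail(t_score, df)
print(f"t = {t_score:.3f}, df = {df:.1f}, p-value = {p_value:.4f}")
```

Swapping the two lists would flip the sign of the test statistic, which is why the groups must be kept in the same order as the hypotheses.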
Cohen’s Standards for Small, Medium, and Large Effect Sizes
Cohen’s d is a measure of effect size based on the differences between two means. Cohen’s d, named for U.S. statistician Jacob Cohen, measures the relative strength of the differences between the means of two populations based on sample data. The calculated value of effect size is then compared to Cohen’s standards of small, medium, and large effect sizes.
Size of Effect  d 

Small  0.2 
Medium  0.5 
Large  0.8 
Cohen’s d is the measure of the difference between two means divided by the pooled standard deviation: $d=\frac{{\overline{x}}_{1}-{\overline{x}}_{2}}{{s}_{pooled}}$ where ${s}_{pooled}=\sqrt{\frac{({n}_{1}-1){s}_{1}^{2}+({n}_{2}-1){s}_{2}^{2}}{{n}_{1}+{n}_{2}-2}}$
Example 10.4
Calculate Cohen’s d for Example 10.2. Is the size of the effect small, medium, or large? Explain what the size of the effect means for this problem.
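${\overline{x}}_{1}$ = 4 s_{1} = 1.5 n_{1} = 11
${\overline{x}}_{2}$ = 3.5 s_{2} = 1 n_{2} = 9
d = 0.384; small, because 0.384 falls between Cohen’s 0.2 for a small effect size and 0.5 for a medium effect size. The size of the difference between the mean numbers of math classes taken by graduates of the two colleges is small.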
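The calculation in Example 10.4 can be sketched in Python as follows. The function names `cohens_d` and `effect_size_label` are hypothetical. How the label treats values that fall between Cohen's standards (a value counts as small from 0.2 up to 0.5, and so on) is an assumption made for this sketch.

```python
import math


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Difference in sample means divided by the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)


def effect_size_label(d):
    """Label |d| against Cohen's standards of 0.2, 0.5, and 0.8.

    Assumption for this sketch: each standard is the lower edge of its label.
    """
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below Cohen's standard for small"


# Summary statistics from Example 10.2 (College A first, College B second).
d = cohens_d(4, 1.5, 11, 3.5, 1, 9)
label = effect_size_label(d)
print(f"d = {d:.3f} ({label})")
```

Unlike the test statistic, Cohen’s d uses the pooled standard deviation, so the pooling here does not conflict with the unpooled variances used in the hypothesis test.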
Example 10.5
Calculate Cohen’s d for Example 10.3. Is the size of the effect small, medium, or large? Explain what the size of the effect means for this problem.
d = 0.834; large, because 0.834 is greater than Cohen’s 0.8 for a large effect size. The size of the differences between the means of the final exam scores of online students and students in a face-to-face class is large, indicating a significant difference.
Weighted alpha is a measure of risk-adjusted performance of stocks over a period of a year. A high positive weighted alpha signifies a stock whose price has risen, while a small positive weighted alpha indicates an unchanged stock price during the time period. Weighted alpha is used to identify companies with strong upward or downward trends. The weighted alpha for the top 30 stocks of banks in the Northeast and in the West as identified by Nasdaq on May 24, 2013, are listed in Table 10.6 and Table 10.7, respectively.
Table 10.6 Northeast
94.2  75.2  69.6  52.0  48.0  41.9  36.4  33.4  31.5  27.6 
77.3  71.9  67.5  50.6  46.2  38.4  35.2  33.0  28.7  26.5 
76.3  71.7  56.3  48.7  43.2  37.6  33.7  31.8  28.5  26.0 
Table 10.7 West
126.0  70.6  65.2  51.4  45.5  37.0  33.0  29.6  23.7  22.6 
116.1  70.6  58.2  51.2  43.2  36.0  31.4  28.7  23.5  21.6 
78.2  68.2  55.6  50.3  39.0  34.1  31.0  25.3  23.4  21.5 
Is there a difference in the weighted alpha of the top 30 stocks of banks in the Northeast and in the West? Test at a 5 percent significance level. Answer the following questions:
 Is this a test of two means or two proportions?
 Are the population standard deviations known or unknown?
 Which distribution do you use to perform the test?
 What is the random variable?
 What are the null and alternative hypotheses? Write the null and alternative hypotheses in words and in symbols.
 Is this test right, left, or two-tailed?
 What is the p-value?
 Do you reject or not reject the null hypothesis?
 At the ______ level of significance, from the sample data, there ______ (is/is not) sufficient evidence to conclude that ______.
 Calculate Cohen’s d and interpret it.