Inference for a Population Mean \(\mu\)

Assumptions
Confidence Interval
Hypothesis Test
Power
Sample Size

After all the theory, here are some examples. In fact, almost everything here has already been discussed.

Method

1-sample t

Assumptions

The methods discussed here work if:

  • the data comes from a simple random sample
  • the data comes from a normal distribution or the sample size is large enough

The last assumption is a bit vague: just how large is “large enough”? The basic principle is a trade-off between how non-normal the data is and how large the sample has to be:

  • If the distribution of the data is almost normal, a sample size as small as 10 is OK.

  • If the distribution of the data is very non-normal (large outliers, etc.), a sample size as large as 100 might be needed.
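The check “normal plot ok” that appears in the examples below refers to a normal probability plot: if the points fall close to a straight line, the normal assumption is reasonable. A minimal sketch with base R’s qqnorm and qqline (the vector y here is simulated, standing in for a real sample):

# normal probability plot; points close to the line support the normal assumption
y <- rnorm(50, mean = 100, sd = 15)   # simulated example data
qqnorm(y)
qqline(y)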

R Routines

one.sample.t - test and confidence interval

t.ps - power and sample size

Confidence Interval

Case Study: Drug Use of Mothers and the Health of the Newborn

Consider again the data set for newborn babies and the drug status of their mothers. Find a 90% confidence interval for the mean length of the babies:

attach(mothers)
sort(Length)
##  [1] 40.2 41.3 41.7 41.9 43.4 44.0 44.3 44.9 45.1 45.3 45.6 45.7 45.7 45.8
## [15] 46.7 46.8 46.9 47.0 47.1 47.1 47.2 47.2 47.3 47.4 47.6 47.7 47.8 47.8
## [29] 48.0 48.1 48.2 48.2 48.4 48.5 48.5 48.5 48.7 48.8 48.8 48.8 48.9 49.0
## [43] 49.2 49.3 49.3 49.5 49.6 49.6 50.1 50.2 50.2 50.4 50.5 50.5 50.5 50.6
## [57] 50.8 50.9 50.9 51.0 51.0 51.0 51.1 51.2 51.3 51.4 51.4 51.4 51.5 51.7
## [71] 51.7 51.7 51.7 51.8 52.1 52.4 52.5 52.5 52.6 52.9 52.9 53.0 53.2 53.5
## [85] 53.7 53.9 54.3 54.8 55.0 55.4 55.4 55.9 56.2 56.5
one.sample.t(Length, conf.level = 90, ndigit = 2)

## A 90% confidence interval for the population mean is (48.97, 50.13)

Assumptions: normal plot ok

Notice that the rounding follows the usual rule for confidence intervals: one more digit than the data has.

Example In a survey 150 people leaving a mall were asked how much money they spent. The mean was $45.60 with a standard deviation of $12.70. Find a 95% confidence interval for the true mean.

 one.sample.t(y = 45.60, shat = 12.70, n = 150, ndigit=2) 
## A 95% confidence interval for the population mean is (43.55, 47.65)

Assumptions: can’t be checked (no data) but 150 is fairly large.
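Behind the scenes this is the usual t interval \(\bar{y} \pm t_{\alpha/2, n-1}\, s/\sqrt{n}\). A base R sketch of the same calculation from the summary statistics, which reproduces the (43.55, 47.65) above:

# 95% confidence interval from summary statistics
ybar <- 45.60; s <- 12.70; n <- 150
tcrit <- qt(1 - 0.05/2, df = n - 1)             # t critical value, 149 degrees of freedom
round(ybar + c(-1, 1) * tcrit * s / sqrt(n), 2)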

Example In a survey people were asked how much money they made last year. The answers were

Income
##  [1] 24100 25200 26800 27100 27300 27400 28200 29300 29500 29600 29700
## [12] 30000 30100 30300 30300 30600 30700 30800 30900 30900 31000 31200
## [23] 31200 31600 31600 31700 31900 32000 32000 32000 32800 33300 33300
## [34] 33300 33700 33900 33900 33900 34100 34100 34200 34400 34400 34800
## [45] 34900 35000 35100 35100 35100 35300 35400 35800 35800 36100 36100
## [56] 36800 37000 37200 37300 37300 37400 37500 37700 37700 38100 38400
## [67] 38600 38800 38900 39400 39400 39500 39500 40600 40900 41600 42400
## [78] 42700 43500 46700

Find a \(90\%\) confidence interval for the true population mean income.

one.sample.t(Income, conf.level=90, ndigit=-1)

## A 90% confidence interval for the population mean is (33480, 35120)

Assumptions: normal plot ok

Hypothesis Test

The details of the hypothesis test for a population mean are as follows:

Null Hypothesis: \(H_0: \mu = \mu_0\)

Note: \(\mu_0\) is not a symbol to be carried along but a specific number, which you need to get from the problem.

Alternative Hypothesis: Choose one of the following, depending on the problem:

  1. \(H_a: \mu < \mu_0\)
  2. \(H_a: \mu > \mu_0\)
  3. \(H_a: \mu \ne \mu_0\)
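The test statistic behind the 1-sample t test is

\[ t = \frac{\bar{y} - \mu_0}{s/\sqrt{n}} \]

and the p-value comes from a t distribution with \(n-1\) degrees of freedom, using the tail (or both tails) indicated by the alternative hypothesis.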

Case Study: Simon Newcomb’s Measurements of the Speed of Light

attach(newcomb)
speed.of.light <- Measurement
speed.of.light <- speed.of.light[speed.of.light>24780]  
mn <-round(mean(speed.of.light), 1)
mn
## [1] 24827.3

We have previously seen that (after eliminating the outlier 24756) the mean of Newcomb's measurements of the speed of light is 24827.3, whereas the equivalent measurement with modern instruments is 24833. Does this mean his measuring method was flawed? The question is whether this sample mean is statistically significantly different from the true value 24833.

Let’s answer this question now:

  1. Parameter: mean \(\mu\)
  2. Method: 1-sample t
  3. Assumptions: normal data or large sample; normal plot is ok
  4. \(\alpha = 0.05\)
  5. \(H_0: \mu = 24833\) (Newcomb’s experiment measured the correct value)
  6. \(H_a: \mu \ne 24833\) (Newcomb’s experiment did not measure the correct value)
  7. p = 0.000
one.sample.t(speed.of.light, mu.null = 24833)

## p value of test H0: mu=24833 vs. Ha: mu <> 24833:  0.000
  8. \(p < \alpha\), so we reject the null hypothesis
  9. Newcomb’s experiment did not measure the correct value

Assumptions: normal plot ok
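For comparison, the same test can be run with base R’s t.test, which should report essentially the same (very small) p-value:

# base R equivalent of the test above
t.test(speed.of.light, mu = 24833, alternative = "two.sided")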

Case Study: Resting Period of Monarch Butterflies

Some Monarch butterflies fly early in the day, others somewhat later. After the flight they have to rest for a short period. It has been theorized that the resting period (RIP.sec.) of butterflies flying early in the morning is shorter because this is a thermoregulatory mechanism, and it is cooler in the mornings. The mean RIP of all Monarch butterflies is 133 sec. Test the theory at the 10% level.

Research by Anson Lui, Resting period of early and late flying Monarch butterflies Danaeus plexippus, 1997

  1. Parameter: mean \(\mu\)
  2. Method: 1-sample t
  3. Assumptions: normal data or large sample
  4. \(\alpha = 0.1\)
  5. \(H_0: \mu =133\) (RIP is the same for early morning flying butterflies as all others)
  6. \(H_a: \mu < 133\) (RIP is shorter for early-morning flying butterflies)
  7. \(p = 0.056\)
attach(butterflies) 
sort(RIP.sec.)
##  [1]  52  64  66  75  77  85  86  92  93  98 102 102 103 112 115 117 120
## [18] 121 124 124 124 125 132 132 134 140 142 145 148 148 152 155 156 156
## [35] 167 167 170 177 181 187
one.sample.t(RIP.sec., mu.null=133, alternative = "less")

## p value of test H0: mu=133 vs. Ha: mu < 133:  0.0558
  8. \(p = 0.0558 < \alpha = 0.1\), so we reject the null hypothesis
  9. It appears the resting time is somewhat shorter, but the conclusion is not a strong one.

Assumptions: normal plot ok

Example In the past the average purchase of a customer in a certain store was $55. The store just ran an ad in the newspaper and wants to know whether it increased sales. In the week following the ad 43 customers spent an average of $63 with a standard deviation of $18. Test at the 10% level whether the promotion was a success.

  1. Parameter: mean \(\mu\)
  2. Method: 1-sample t
  3. Assumptions: assumed to be ok
  4. \(\alpha = 0.1\)
  5. \(H_0: \mu = 55\) (same mean sales as before, ad did not work)
  6. \(H_a: \mu > 55\) (higher mean sales than before, ad did work)
  7. \(p = 0.0028\)
one.sample.t(y = 63, shat = 18, n = 43, 
             mu.null = 55, alternative = "greater")
## p value of test H0: mu=55 vs. Ha: mu > 55:  0.0028
  8. \(p < \alpha\), so we reject the null hypothesis
  9. The mean sales are higher than before; the ad did work.
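With only summary statistics the p-value can also be computed directly from the t statistic; a base R sketch, which should agree with the 0.0028 above:

# one-sided p-value from summary statistics
ybar <- 63; s <- 18; n <- 43; mu0 <- 55
tstat <- (ybar - mu0) / (s / sqrt(n))        # t statistic
pt(tstat, df = n - 1, lower.tail = FALSE)    # P(T > t), matching alternative = "greater"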

Power

Recall that the power of a test is the probability of rejecting the null hypothesis when the null hypothesis is in fact false.

Calculating the power of a test usually requires a guess at what the true value of the parameter might be.

Example Over many years the mean number of accidents per month on a street was 2.15 with a standard deviation of 0.75. The city council is considering installing traffic lights at a number of intersections. After that they will monitor the number of accidents for one year, that is, for 12 monthly counts. If the lights lower the number of accidents to 1.56 per month, what is the probability that they would detect this drop? Use \(\alpha = 0.05\).

The test they will eventually do will have the following:

  1. \(\alpha = 0.05\)
  2. \(H_0: \mu = 2.15\) (Same number of accidents with the traffic lights)
  3. \(H_a: \mu < 2.15\) (Lower number of accidents with the traffic lights)

Now to calculate the power:

t.ps(n=12, diff = 1.56-2.15, sigma = 0.75, 
     alternative="less") 

## Power of Test = 81.4%

So there is an 81.4% chance of correctly concluding that the traffic lights lowered the number of accidents.

But why 1.56? After all, we have not even installed the traffic lights, so we can’t know what will happen once we do. So we really should look at the whole Power Curve.
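A power curve plots the power of the test against a range of possible true values. A sketch using base R’s power.t.test (the grid of true means is chosen just for illustration; the numbers may differ slightly from t.ps):

# power of the test for a range of possible true accident rates
mu.true <- seq(1.4, 2.15, length.out = 50)     # candidate true means
pow <- sapply(mu.true, function(m)
  power.t.test(n = 12, delta = 2.15 - m, sd = 0.75, sig.level = 0.05,
               type = "one.sample", alternative = "one.sided")$power)
plot(mu.true, pow, type = "l",
     xlab = "true mean accidents per month", ylab = "power")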

Example We are planning a survey of the employees of a large company. In the survey we will ask them how happy they are to work there, on a scale of 1 to 10. Eventually we will test at the 10% level whether

\(H_0: \mu = 5.0\) vs \(H_a: \mu > 5.0\)

If we randomly select 50 employees and if the true mean happiness is 5.6, what is the power of this test? Assume \(\sigma=1.4\).

t.ps(n=50, diff=5.6-5.0, sigma=1.4, 
     alpha=0.1, alternative = "greater")

## Power of Test = 95.5%

Sample Size Calculations

One of the most important questions facing a researcher is how large a sample is needed to be able to draw valid conclusions. If the goal is to do a hypothesis test, the t.ps command is again the way to go:

Example A company has been making “widgets” which have a mean life time of 127 days with a standard deviation of 45.5 days. They have recently redesigned the production process, and believe that now the lifetime is 145 days. They want to test that hypothesis. How many widgets do they need to test to have a 95% chance of detecting this difference? They will carry out the test at the 10% level.

t.ps(diff = 145-127, sigma = 45.5, power = 95, 
     alpha = 0.1, alternative = "greater") 
## Sample size required is  57
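For comparison, base R’s power.t.test solves the same problem when the power is specified and n is left out; its answer may differ from t.ps by a unit or two because of rounding:

# sample size for 95% power at the 10% level, one-sample one-sided test
power.t.test(delta = 145 - 127, sd = 45.5, sig.level = 0.1, power = 0.95,
             type = "one.sample", alternative = "one.sided")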

Let’s say that instead of a hypothesis test we want to find a confidence interval. We have seen that one effect of a larger sample size is a shorter confidence interval:

one.sample.t(10, shat=1, n=20, ndigit=3)
## A 95% confidence interval for the population mean is (9.532, 10.468)

and so the length of the interval is

\[ 10.468-9.532=0.936 \]

but if the sample size is 40 we have

one.sample.t(10, shat=1, n=40, ndigit=3)
## A 95% confidence interval for the population mean is (9.68, 10.32)

\[ 10.32-9.68=0.64 \] and so this interval is shorter.

A sample size calculation starts with a decision on how large an interval we are willing to accept. Let’s call this length L. Usually one specifies the error E, which is

\[ E=L/2 \]

The error E plays the same role here that the power plays in the hypothesis testing case above.
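Setting the half-width of the interval equal to E and solving for n gives the approximate sample size formula

\[ n \approx \left( \frac{z_{\alpha/2}\,\sigma}{E} \right)^2 \]

rounded up to the next whole number; t.ps carries out a calculation of this kind for us.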

Notice that the calculation of the interval also involves shat, the sample standard deviation, which is an estimate of the population standard deviation \(\sigma\). Before the data is collected we do not have it, so we need some other way to come up with a value. Here are some possibilities:

  • Is there already an estimate of \(\sigma\) we can use, maybe from a previous or from a similar study?

  • If not, maybe we can do a pilot study (something that is very often a good idea anyway)

Example: We found the 90% confidence interval for the mean length of babies to be (48.97, 50.13), or 49.55 \(\pm\) 0.58, so the error of this estimate is 0.58. What sample size would be needed to find a 90% confidence interval with an error of 0.25?

We can use the sample standard deviation as a guess for the population standard deviation.

t.ps(sigma= sd(Length), E = 0.25, conf.level = 90) 
## [1] "Sample size required is  497"

Example We want to do a survey of the students of the Colegio. One question will be their GPA, and we want to find a 99% confidence interval with a length of 0.25. A pilot study of 25 students had a sample standard deviation of 0.45. How many students will we need in our survey?

length of interval = L = 0.25, so E = L/2 = 0.25/2 = 0.125

t.ps(sigma= 0.45, E = 0.125, conf.level = 99)
## [1] "Sample size required is  86"

But what if we did not do a pilot study and therefore do not know the standard deviation? Sometimes we can make an educated guess.

Remember our old rule of thumb:

\[ s = \text{Range}/4 \]

For GPA a likely range is 2 to 4, so

\[ \sigma \approx \text{Range}/4 = (4-2)/4 = 0.5 \]

and therefore

t.ps(sigma= 0.5, E = 0.125, conf.level = 99)
## [1] "Sample size required is  107"

Example We want to do a study of the age at which students graduate from the Colegio. We will find a 90% confidence interval with an error of 1 month. A pilot study showed that the standard deviation of the ages is 0.8 years. What sample size is needed?

t.ps(sigma= 0.8, E = 1/12, conf.level = 90)
## [1] "Sample size required is  250"

Example 10 years ago the mean income in some city was $34500. We are planning a study to see whether the mean income has increased. We will find a \(90\%\) confidence interval with an error of $1000. If the standard deviation is $7500, what sample size is needed?

t.ps(sigma=7500, E=1000, conf.level=90)
## [1] "Sample size required is  153"

If instead we plan on doing a hypothesis test at the 10% level to see whether the income has increased to $36000, what sample size is needed for the test to have a power of 90%?

t.ps(diff=36000-34500, 
     sigma=7500, 
     alpha=0.1, 
     power=90,
     alternative = "greater")
## Sample size required is  166