Choose the population mean and standard deviation. Choose the x points you want and get the probability. Choosing just one x point gives probabilities of the form P(X<x) or P(X>x).
For finding x in P(X<x)=p or P(X>x)=p, choose Yes in Want Inverse Probabilities?
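For reference, the same calculations can be done directly in R; the numbers below (mean 100, sd 15, x=110, p=0.9) are only illustrative choices, not the app's defaults:

  pnorm(110, mean = 100, sd = 15)        # P(X < 110)
  1 - pnorm(110, mean = 100, sd = 15)    # P(X > 110)
  qnorm(0.9, mean = 100, sd = 15)        # inverse problem: x with P(X < x) = 0.9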
Choose a distribution and consider how "non-normal" it is according to the graphs. Now run the movie and see how the distribution of sample means becomes more and more normal.
On the Probabilities tab, instead of the graphs, we compare the exact probabilities with those of the standard normal, and again one can see how they become more and more alike.
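A rough sketch of what the movie shows, using the exponential distribution as an example of a "non-normal" population (the sample sizes and the probability checked below are only illustrative choices):

  # sample means of n exponential observations, repeated 10000 times
  sim.means <- function(n) replicate(10000, mean(rexp(n, rate = 1)))
  par(mfrow = c(1, 3))
  for (n in c(2, 10, 50)) hist(sim.means(n), breaks = 50, main = paste("n =", n))
  # exact vs normal-approximation probability that the sample mean is below 0.9 (n = 50)
  n <- 50
  pgamma(0.9 * n, shape = n, rate = 1)    # exact: a sum of n Exp(1) variables is Gamma(n, 1)
  pnorm(0.9, mean = 1, sd = 1 / sqrt(n))  # CLT approximation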
As the app starts, the page on the right is empty; there is no data yet. In the panel on the left you can choose the population parameters you want.
Next move the slider to 1. Now on the Single Experiment tab you get one simulated dataset, the Summary Statistics and the confidence interval calculations. You can now run the movie and see a sequence of simulated datasets.
You can also play around and see the effects of
a) larger sample size n → shorter intervals
b) changing the population mean μ → changes the location of the interval but not its length
c) increasing the population standard deviation σ → increases the range of the data and the length of the interval
d) increasing the confidence level → increases the length of the interval.
On the Many Experiments tab: no matter how n, μ or σ are changed, the percentage of good intervals always matches the chosen confidence level (the sketch below illustrates this).
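A minimal sketch of what that tab is doing, with illustrative parameter values:

  # simulate many 95% t-intervals and check how many contain the true mean
  mu <- 10; sigma <- 3; n <- 25
  covered <- replicate(10000, {
    x  <- rnorm(n, mu, sigma)
    ci <- t.test(x, conf.level = 0.95)$conf.int
    ci[1] < mu & mu < ci[2]
  })
  mean(covered)   # should be close to 0.95 regardless of mu, sigma, n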
The app flips a coin 100 times and shows the results. By default it is a fair coin (p=0.5) and we are testing
Null Hypothesis H0: the coin is fair (p=0.5) vs Alternative Hypothesis Ha: coin is not fair (p≠0.5)
Next we can decide when we want to reject the theory of a fair coin by choosing a Rejection Region. In this experiment we would reject the theory of a fair coin if the number of heads is far from 50 (=100*0.5).
What do you think this should be? Set the sliders accordingly.
Now click the Run! button and repeat the experiment 20 times. Each time you can see the number of heads and whether (in red) or not (in blue) we would reject the theory.
Doing this one at a time is a bit slow, and we really should do this many times, so switch to the Many Runs tab. Here we see the results of 100000 runs of this experiment.
Now move the sliders for the Rejection Region to 45 and 55. The cases in blue are those where the 100 flips resulted in a number of heads between 45 and 55, and we would not reject the theory of a fair coin. In red we have all the cases with either fewer than 45 or more than 55 heads, and so here we would reject the theory of a fair coin. As we see, that happens about 27% of the time.
But we are still flipping a fair coin, so we should not reject the theory at all, doing so is an error. Soon we will call this the type I error. The 27% will be called the type I error probability α.
Committing an error in about 1 in 4 cases (~27%) does not sound like a good idea, so let's make this much smaller. Move the sliders to 40-60, and then we have α=3.5%, much better.
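These percentages can be checked with the binomial distribution, since the number of heads in 100 flips of a fair coin is Binomial(100, 0.5):

  pbinom(44, 100, 0.5) + (1 - pbinom(55, 100, 0.5))   # rejection region outside 45-55: about 0.27
  pbinom(39, 100, 0.5) + (1 - pbinom(60, 100, 0.5))   # rejection region outside 40-60: about 0.035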
But there is also a downside to this. Let's select a Slightly unfair (p=0.6) coin. Now the coin is NOT fair, and we should reject the theory. But we are doing so only 46% of the time; the other 54% of the runs wrongly make the theory look ok. This mistake is called the type II error. The 54% is called the type II error probability β. The percentage of runs that correctly reject the theory is called the power of the test.
Now if we go back to Rejection Region 45-55 the percentage of correctly rejected false theories goes up to 82.3%, much better.
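The power figures can be checked the same way, now with p = 0.6:

  pbinom(39, 100, 0.6) + (1 - pbinom(60, 100, 0.6))   # region 40-60: power about 0.46
  pbinom(44, 100, 0.6) + (1 - pbinom(55, 100, 0.6))   # region 45-55: power about 0.82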
Of course in real life we do not know whether the coin is fair or not, so how do we choose the Rejection Region? We do it by choosing a type I error probability α that seems acceptable to us. Often this is about 5%, and that leads to 41-59.
Once we have decided on α we can then do some math to see what β might be, but this will depend on how unfair the coin might be.
As the app starts the page on the right shows the chosen type I error probability α, the null and the alternative hypothesis. There is no data yet.
This illustrates one important fact about hypothesis testing: α, H0 and Ha do NOT depend on the data, they come from the problem/experiment we are working on.
Next move the slider to 1. Now on the Single Experiment tab you get one simulated dataset, the p-value of the corresponding test and the decision on the test (reject/ fail to reject H0). You can now run the movie and see a sequence of simulated datasets.
Switch to the Many Experiments tab.
Case I: μ=10, H0 is true
This shows the histogram of 1000 hypothesis tests just like the one on the Single Experiment tab. In each test if p<α (drawn in red) we reject H0, otherwise we fail to reject H0. The app shows the percentage of tests with p<α, which should be close to α!
Changing the sample size n or the population standard deviation σ does not change any of this.
Changing α changes the percentage of rejected tests so that it always matches α.
Case II: μ≠10, H0 is false
Move the slider to μ=11.0. Now the number of tests with p<α is much higher (63%), which is good because this means we would correctly reject this false H0 63% of the time. Move the slider to μ=12.0 and now almost all the tests have p<α.
Move the slider to μ=11.0 and see that (the sketch after this list checks these numbers)
• larger sample size n (=100) → reject more tests (91.3% vs 63%)
• increase population standard deviation σ (=9) → reject fewer tests (12.4% vs 63%)
• larger type I error probability α (=0.1) → reject more tests (74.4% vs 63%)
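These percentages can be checked with R's power.t.test. The baseline values n=50 and σ=3 are inferred from the quoted changes (only the changed values are stated above), so treat them as assumptions:

  power.t.test(n = 50,  delta = 1, sd = 3, sig.level = 0.05, type = "one.sample")   # about 0.63
  power.t.test(n = 100, delta = 1, sd = 3, sig.level = 0.05, type = "one.sample")   # about 0.91
  power.t.test(n = 50,  delta = 1, sd = 9, sig.level = 0.05, type = "one.sample")   # about 0.12
  power.t.test(n = 50,  delta = 1, sd = 3, sig.level = 0.10, type = "one.sample")   # about 0.75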
The app generates n observations from a Normal distribution with mean μ and standard deviation σ. Then it does the test
H0: μ=10.0 vs Ha: μ>10.0
To start μ=11.0, so H0 is false and should be rejected. Run the Movie and see what happens if we do this 100 times. About half the time we make the right decision, and so the power of this test is 50%.
Select the Show Power Curve button, and you get the theoretical power curve with the actual power, closely matching the simulation result.
Now you can change the situation by changing the true μ to 12.0, the standard deviation from 3 to 5, the sample size from 25 to 50 and the type I error probability α from 5% to 10%. Observe how each of these changes affects the power.
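The same power can be computed with R's power.t.test, assuming the test is the one-sample t-test (the text does not say so explicitly, so this is an assumption):

  power.t.test(n = 25, delta = 1, sd = 3, sig.level = 0.05,
               type = "one.sample", alternative = "one.sided")   # power about 0.48, i.e. roughly half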
• Effect of sample size: if we change n=100 to n=1000 we see that we now reject if X<469 or X>531. Notice that 40/100=0.4, whereas 469/1000=0.469, so relative to the sample size the rejection boundary is now much closer to 0.5.
• Effect of true π: move the slider for the "Population Success Probability π" away from 0.5. Remember we are NOT changing the test
H0: π=0.5 vs Ha: π≠0.5
so now the null hypothesis is false and we should reject it. As π moves further away from 0.5 this is what happens more often, for example if π=0.6 the null is rejected 37.9% of the time and if π=0.7 it is rejected 96.5% of the time.
This probability that the test rejects a false null hypothesis is called the power of the test.
• Effect of α when the null is false: set π=0.6 and select α=0.01. Now the power goes down from 37.9% to just 17.7%. So if we make it harder to reject the null (by using a smaller α), we also make it harder to reject a false null (the power of the test is lower).
• Effect of sample size when the null is false: set π=0.6 and select n=500, and we see that the power jumps from 37.9% to 98.6%. With a larger sample it is easier to reject a false null hypothesis (a simulation sketch follows below).
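A simulation sketch of this power calculation, using the exact binomial test; the app may use a different test for the proportion, so the percentages will not match it exactly:

  # estimated power of the exact binomial test of H0: pi = 0.5
  power.sim <- function(n, p, alpha = 0.05)
    mean(replicate(5000, binom.test(rbinom(1, n, p), n, p = 0.5)$p.value < alpha))
  power.sim(100, 0.6)   # moderate power
  power.sim(100, 0.7)   # high power
  power.sim(500, 0.6)   # larger n: much higher power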
Suppose μ=50 and we test H0: μ=50, so the null hypothesis is true.
• Effect of α: if we do the test at the α=5% level we reject the null if X̅ < 48.6 or X̅ > 51.4, but if we do the test at the α=1% level we reject the null if X̅ < 48.2 or X̅ > 51.8, so a smaller α makes it harder to reject the null.
• Effect of sample size: if we change n=50 to n=100 we see that we now reject if X̅ < 49 or X̅ > 51, so a larger n makes it easier to reject the null.
• Effect of population standard deviation σ: if we increase σ from 5 to 10 we now reject if X̅ < 47.2 or X̅ > 52.8, so a larger σ makes it harder to reject the null.
• Effect of true μ: move the slider for the "Population Mean μ". Remember we are NOT changing the test
H0: μ=50 vs Ha: μ≠50
so now the null hypothesis is false and we should reject it. As μ moves further away from 50 this is what happens more often, for example if μ=52 the null is rejected 80.1% of the time.
This probability that the test rejects a false null hypothesis is called the power of the test.
• Effect of α when the null is false: set μ=52 and select α=0.01. Now the power goes down from 80.1% to just 59%. So if we make it harder to reject the null (by using a smaller α), we also make it harder to reject a false null (the power of the test is lower).
• Effect of sample size when the null is false: set μ=52 and select n=100, and we see that the power jumps from 80.1% to 98%. With a larger sample it is easier to reject a false null hypothesis.
You can also choose one-sided alternatives and see how that changes things.
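The cutoffs and power values quoted above can be reproduced with the normal distribution, assuming the test treats σ as known (a z-test):

  # rejection cutoffs for H0: mu = 50 with sigma = 5, n = 50, alpha = 0.05
  mu0 <- 50; sigma <- 5; n <- 50; alpha <- 0.05
  se   <- sigma / sqrt(n)
  crit <- mu0 + c(-1, 1) * qnorm(1 - alpha / 2) * se
  crit                                                     # about 48.6 and 51.4
  # power when the true mean is 52
  pnorm(crit[1], mean = 52, sd = se) + (1 - pnorm(crit[2], mean = 52, sd = se))   # about 0.80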
This app randomly generates a problem for either finding a confidence interval or doing a hypothesis test, either for one mean or one proportion. If just one of these is wanted you can specify it as well.
You should do the problem first yourself, and then click on Show Complete Solution. If you need some help you can get the summary statistics and the formula.
Scatterplot panel: Move the slider around to see different cases of the scatterplot of correlated variables. Include a few outliers and see how that affects the "look" of the scatterplot and the sample correlation coefficient.
Pick your own Points tab: pick points by clicking inside the graph and see how the correlation coefficient changes
Histogram tab: we can study the effect of changing ρ and/or n on the sample correlation r.
Hypothesis test tab: set ρ=0.1 and observe that we need a sample size of about 500 to have a reasonable chance of rejecting the null hypothesis of no correlation.
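A simulation sketch of this power calculation (mvrnorm is from the MASS package, which ships with R; the sample sizes are illustrative):

  library(MASS)   # for mvrnorm
  # estimated power of the test of H0: rho = 0 when the true rho is 0.1
  power.cor <- function(n, rho, alpha = 0.05) {
    Sigma <- matrix(c(1, rho, rho, 1), 2, 2)
    mean(replicate(2000, {
      xy <- mvrnorm(n, mu = c(0, 0), Sigma = Sigma)
      cor.test(xy[, 1], xy[, 2])$p.value < alpha
    }))
  }
  power.cor(100, 0.1)   # low power
  power.cor(500, 0.1)   # considerably higher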
Run the movies to change one count at a time and observe how the expected cell counts, the chi-square statistic and the p-value change.
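For reference, the same quantities can be computed with chisq.test in R; the table of counts below is made up purely for illustration:

  counts <- matrix(c(30, 20, 15, 35), nrow = 2)   # made-up 2x2 table of counts
  out <- chisq.test(counts)
  out$expected    # expected cell counts
  out$statistic   # chi-square statistic
  out$p.value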
The app generates two datasets, one from a normal distribution with mean 10 and standard deviation 3 and the other from a normal distribution with mean μ2 and standard deviation σ2. Then it carries out the test H0: μ1 = μ2 vs Ha: μ1 ≠ μ2.
Single Experiment tab: this shows the result of one experiment with the chosen parameters. The two datasets are shown in the boxplot.
Case 1: μ2=10, so the null hypothesis is true. In this case the two boxplots should sit next to each other and the p-value should be "large".
Changing σ2 changes the "size" of the boxplot of Sample 2 but not the relative positions of the two boxplots, so the null hypothesis remains true and the p-value should remain large.
When the slider is moved so that μ2<10 or μ2>10 the box to the right moves either up or down, and the p-value becomes small (so the null is rejected)
Running the movie shows different cases with the same numbers, so you can get a feeling for the variation between experiments.
Many Experiments tab: this shows the result of 1000 experiments with the chosen parameters. The histogram shows the p-values of 1000 tests.
Case 1: μ2=10, so the null hypothesis is true. In this case the percentage of rejected null hypotheses should match the chosen type I error probability α.
Case 2: μ2≠10, so the null hypothesis is false. In this case the percentage of rejected null hypotheses is the power of the test. For example, if μ2=12.0, the power is about 92%.
Two interesting questions specific to this test are:
• Does it matter whether we have equal sample sizes?
Choosing n1=10 and n2=90 shows that this does not change the true type I error probability, but if we set μ2=12.0 the power is now 51%, much lower than the 92% before, even though the total sample size is still 100. So unequal sample sizes change the power of the test.
• Does it matter whether we have equal variances?
Changing σ2 does not change the true type I error probability as long as the sample sizes are the same, but it does change it if the sample sizes are different; for example, if σ2=5.0, n1=10 and n2=90 the true type I error probability is just 0.5%, not 5%.
The problem is that we have chosen the Equal Variances=Yes option, so the test assumes them to be the same, but they are not. We can fix that by selecting Equal Variances=No, but this requires equal sample sizes.
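A minimal sketch of one such experiment in R, with illustrative parameter values; var.equal switches between the pooled and the Welch version of the test:

  x <- rnorm(50, mean = 10, sd = 3)
  y <- rnorm(50, mean = 12, sd = 3)
  t.test(x, y, var.equal = TRUE)    # pooled t-test (Equal Variances = Yes)
  t.test(x, y, var.equal = FALSE)   # Welch test (Equal Variances = No)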
Just run the movie. As the means move farther apart, the boxplots do as well. In the beginning the overall variance is close to the within-group variances, but as the means move farther apart the overall variance grows relative to the within-group variances. So by comparing the within-group variances to the overall variance we can tell whether the means are the same or not.
We can choose to have all three groups different, or two the same and the third different. An interesting fact to observe: if all groups are different and n=20 the p-value is 0.04 when μ=0.6, but if only group C is different the p-value is already 0.01, so the test considers this "more different".
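A minimal one-way ANOVA sketch in R, with group means chosen purely for illustration:

  n   <- 20
  dat <- data.frame(
    y     = c(rnorm(n, mean = 0), rnorm(n, mean = 0), rnorm(n, mean = 0.6)),
    group = factor(rep(c("A", "B", "C"), each = n))
  )
  boxplot(y ~ group, data = dat)
  summary(aov(y ~ group, data = dat))   # F-test: variation between groups vs within groups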
Choose the type of data and the number of observations, then do the calculations yourself. Finally check your answers.
On the Data Graphs panel the app shows the histogram, boxplot and normal probability plot for any distribution you want. Names have to be the R names of the distribution. Depending on the distribution, pick either one or two parameters.
On the Theoretical Distribution tab the app shows just that.
Examples: name (par1) or name (par1,par2)
binom (n,p), geom (p), pois (lambda), uniform (A,B), exp (lambda), gamma (α,β), beta (α,β) , t (df), chisq (df), F(df1,df2) etc.
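These names follow R's own distribution functions (the d/p/q/r families); for example, for a gamma(α=2, β=1) population one could check the fit by hand with:

  x <- rgamma(1000, shape = 2, rate = 1)               # random sample from gamma(2, 1)
  hist(x, breaks = 50, freq = FALSE)
  curve(dgamma(x, shape = 2, rate = 1), add = TRUE)    # theoretical density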
Enter a function (in R syntax) and the desired range, then run the movie and see how well the Riemann sum approximates the integral.
Also check the effect of the choice of intermediate points.
Enter a function (in R syntax), the desired range and the approximation method (currently Riemann sums with different choices of intermediate points, the Trapezoidal rule and Simpson's rule), then run the movie and see how well they approximate the integral.
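A minimal sketch of two of these approximations in R, using an arbitrary function and range and comparing with R's own integrate():

  f <- function(x) x^2 * exp(-x)
  a <- 0; b <- 2; k <- 100
  x <- seq(a, b, length.out = k + 1)
  h <- (b - a) / k
  sum(f(x[-(k + 1)])) * h                      # left-point Riemann sum
  sum((f(x[-1]) + f(x[-(k + 1)])) / 2) * h     # trapezoidal rule
  integrate(f, a, b)$value                     # R's adaptive quadrature, for comparison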