Choose the population mean and standard deviation. Choose the x points you want and get the probability. Choosing just one x point gives probabilities of the form P(X<x) or P(X>x).
For finding x in P(X<x)=p or P(X>x)=p, choose Yes in Want Inverse Probabilities?
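For reference, the same calculations can be done directly in R; the numbers below (mean 100, sd 15, x=110, p=0.9) are only illustrative choices, not the app's defaults:

  pnorm(110, mean = 100, sd = 15)        # P(X < 110)
  1 - pnorm(110, mean = 100, sd = 15)    # P(X > 110)
  qnorm(0.9, mean = 100, sd = 15)        # inverse problem: x with P(X < x) = 0.9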
Choose a distribution and consider how "non-normal" it is according to the graphs. Now run the movie and see how the distribution of sample means becomes more and more normal.
On the Probabilities tab, instead of the graphs, we compare the exact probabilities with those of the standard normal, and again one can see how they become more and more alike.
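A rough sketch of what the movie shows, using the exponential distribution as an example of a "non-normal" population (the sample sizes and the probability checked below are only illustrative choices):

  # sample means of n exponential observations, repeated 10000 times
  sim.means <- function(n) replicate(10000, mean(rexp(n, rate = 1)))
  par(mfrow = c(1, 3))
  for (n in c(2, 10, 50)) hist(sim.means(n), breaks = 50, main = paste("n =", n))
  # exact vs normal-approximation probability that the sample mean is below 0.9 (n = 50)
  n <- 50
  pgamma(0.9 * n, shape = n, rate = 1)    # exact: a sum of n Exp(1) variables is Gamma(n, 1)
  pnorm(0.9, mean = 1, sd = 1 / sqrt(n))  # CLT approximation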
As the app starts, the page on the right is empty; there is no data yet. In the panel on the left you can choose the population parameters you want.
Next move the slider to 1. Now on the Single Experiment tab you get one simulated dataset, the Summary Statistics and the confidence interval calculations. You can now run the movie and see a sequence of simulated datasets.
You can also play around and see the effects of
a) larger sample size n → shorter intervals
b) changing the population mean μ → changes the location of the interval but not its length
c) increasing the population standard deviation σ → increases the range of the data and the length of the interval
d) increasing the confidence level → increases the length of the interval.
On the Many Experiments tab: no matter how n, μ or σ are changed, the percentage of good intervals always matches the chosen confidence level (the sketch below illustrates this).
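A minimal sketch of what that tab is doing, with illustrative parameter values:

  # simulate many 95% t-intervals and check how many contain the true mean
  mu <- 10; sigma <- 3; n <- 25
  covered <- replicate(10000, {
    x  <- rnorm(n, mu, sigma)
    ci <- t.test(x, conf.level = 0.95)$conf.int
    ci[1] < mu & mu < ci[2]
  })
  mean(covered)   # should be close to 0.95 regardless of mu, sigma, n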
The app flips a coin 100 times and shows the results. By default it is a fair coin (p=0.5) and we are testing
Null Hypothesis H0: the coin is fair (p=0.5) vs Alternative Hypothesis Ha: coin is not fair (p≠0.5)
Next we can decide when we want to reject the theory of a fair coin by choosing a Rejection Region. In this experiment we would reject the theory of a fair coin if the number of heads is far from 50 (=100*0.5).
What do you think this should be? Set the sliders accordingly.
Now click the Run! button and repeat the experiment 20 times. Each time you can see the number of heads and whether (in red) or not (in blue) we would reject the theory.
Doing this one at a time is a bit slow, and we really should do this many times, so switch to the Many Runs tab. Here we see the results of 100000 runs of this experiment.
Now move the sliders for the Rejection Region to 45 and 55. The cases in blue are those where the 100 flips resulted in a number of heads between 45 and 55, and we would not reject the theory of a fair coin. In red we have all the cases with either fewer than 45 or more than 55 heads, and so here we would reject the theory of a fair coin. As we see, that happens about 27% of the time.
But we are still flipping a fair coin, so we should not reject the theory at all, doing so is an error. Soon we will call this the type I error. The 27% will be called the type I error probability α.
Committing an error in about 1 in 4 cases (~27%) does not sound like a good idea, so let's make this much smaller. Move the sliders to 40-60, and then we have α=3.5%, much better.
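These percentages can be checked with the binomial distribution, since the number of heads in 100 flips of a fair coin is Binomial(100, 0.5):

  pbinom(44, 100, 0.5) + (1 - pbinom(55, 100, 0.5))   # rejection region outside 45-55: about 0.27
  pbinom(39, 100, 0.5) + (1 - pbinom(60, 100, 0.5))   # rejection region outside 40-60: about 0.035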
But there is also a downside to this. Let's select a Slightly unfair (p=0.6) coin. Now the coin is NOT fair, and we should reject the theory. But we are doing so only 46% of the time; the other 54% of the runs wrongly make the theory look ok. This mistake is called the type II error. The 54% is called the type II error probability β. The percentage of runs that correctly reject the theory is called the power of the test.
Now if we go back to Rejection Region 45-55 the percentage of correctly rejected false theories goes up to 82.3%, much better.
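The power figures can be checked the same way, now with p = 0.6:

  pbinom(39, 100, 0.6) + (1 - pbinom(60, 100, 0.6))   # region 40-60: power about 0.46
  pbinom(44, 100, 0.6) + (1 - pbinom(55, 100, 0.6))   # region 45-55: power about 0.82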
Of course in real life we do not know whether the coin is fair or not, so how do we choose the Rejection Region? We do it by choosing a type I error probability α that seems acceptable to us. Often this is about 5%, and that leads to 41-59.
Once we have decided on α we can then do some math to see what β might be, but this will depend on how unfair the coin might be.
As the app starts the page on the right shows the chosen type I error probability α, the null and the alternative hypothesis. There is no data yet.
This illustrates one important fact about hypothesis testing: α, H0 and Ha do NOT depend on the data, they come from the problem/experiment we are working on.
Next move the slider to 1. Now on the Single Experiment tab you get one simulated dataset, the p-value of the corresponding test and the decision on the test (reject/ fail to reject H0). You can now run the movie and see a sequence of simulated datasets.
Switch to the Many Experiments tab.
Case I: μ=10, H0 is true
This shows the histogram of 1000 hypothesis tests just like the one on the Single Experiment tab. In each test if p<α (drawn in red) we reject H0, otherwise we fail to reject H0. The app shows the percentage of tests with p<α, which should be close to α!
Changing the sample size n or the population standard deviation σ does not change any of this.
Changing α changes the percentage of rejected tests so that it always matches α.
Case II: μ≠10, H0 is false
Move the slider to μ=11.0. Now the number of tests with p<α is much higher (63%), which is good because this means we would correctly reject this false H0 63% of the time. Move the slider to μ=12.0 and now almost all the tests have p<α.
Move the slider to μ=11.0 and see that (the sketch after this list checks these numbers)
• larger sample size n (=100) → reject more tests (91.3% vs 63%)
• increase population standard deviation σ (=9) → reject fewer tests (12.4% vs 63%)
• larger type I error probability α (=0.1) → reject more tests (74.4% vs 63%)
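These percentages can be checked with R's power.t.test. The baseline values n=50 and σ=3 are inferred from the quoted changes (only the changed values are stated above), so treat them as assumptions:

  power.t.test(n = 50,  delta = 1, sd = 3, sig.level = 0.05, type = "one.sample")   # about 0.63
  power.t.test(n = 100, delta = 1, sd = 3, sig.level = 0.05, type = "one.sample")   # about 0.91
  power.t.test(n = 50,  delta = 1, sd = 9, sig.level = 0.05, type = "one.sample")   # about 0.12
  power.t.test(n = 50,  delta = 1, sd = 3, sig.level = 0.10, type = "one.sample")   # about 0.75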
The app generates n observations from a Normal distribution with mean μ and standard deviation σ. Then it does the test
H0: μ=10.0 vs Ha: μ>10.0
To start μ=11.0, so H0 is false and should be rejected. Run the Movie and see what happens if we do this 100 times. About half the time we make the right decision, and so the power of this test is 50%.
Select the Show Power Curve button, and you get the theoretical power curve with the actual power, closely matching the simulation result.
Now you can change the situation by changing the true μ to 12.0, the standard deviation from 3 to 5, the sample size from 25 to 50 and the type I error probability α from 5% to 10%. Observe how each of these changes affects the power.
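The same power can be computed with R's power.t.test, assuming the test is the one-sample t-test (the text does not say so explicitly, so this is an assumption):

  power.t.test(n = 25, delta = 1, sd = 3, sig.level = 0.05,
               type = "one.sample", alternative = "one.sided")   # power about 0.48, i.e. roughly half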
• Effect of sample size: if we change n=100 to n=1000 we see that we now reject if X<469 or X>531. Notice that 40/100=0.4, whereas 469/1000=0.469, so relative to the sample size the rejection boundary is now much closer to 0.5.
• Effect of true π: move the slider for the "Population Success Probability π" away from 0.5. Remember we are NOT changing the test
H0: π=0.5 vs Ha: π≠0.5
so now the null hypothesis is false and we should reject it. As π moves further away from 0.5 this is what happens more often, for example if π=0.6 the null is rejected 37.9% of the time and if π=0.7 it is rejected 96.5% of the time.
This probability that the test rejects a false null hypothesis is called the power of the test.
• Effect of α when the null is false: set π=0.6 and select α=0.01. Now the power goes down from 37.9% to just 17.7%. So if we make it harder to reject the null (by using a smaller α), we also make it harder to reject a false null (the power of the test is lower).
• Effect of sample size when the null is false: set π=0.6 and select n=500, and we see that the power jumps from 37.9% to 98.6%. With a larger sample it is easier to reject a false null hypothesis (a simulation sketch follows below).
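A simulation sketch of this power calculation, using the exact binomial test; the app may use a different test for the proportion, so the percentages will not match it exactly:

  # estimated power of the exact binomial test of H0: pi = 0.5
  power.sim <- function(n, p, alpha = 0.05)
    mean(replicate(5000, binom.test(rbinom(1, n, p), n, p = 0.5)$p.value < alpha))
  power.sim(100, 0.6)   # moderate power
  power.sim(100, 0.7)   # high power
  power.sim(500, 0.6)   # larger n: much higher power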
Suppose μ=50 and we test H0: μ=50, so the null hypothesis is true.
• Effect of α: if we do the test at the α=5% level we reject the null if X̅ < 48.6 or X̅ > 51.4, but if we do the test at the α=1% level we reject the null if X̅ < 48.2 or X̅ > 51.8, so a smaller α makes it harder to reject the null.
• Effect of sample size: if we change n=50 to n=100 we see that we now reject if X̅ < 49 or X̅ > 51, so a larger n makes it easier to reject the null.
• Effect of population standard deviation σ: if we increase σ from 5 to 10 we now reject if X̅ < 47.2 or X̅ > 52.8, so a larger σ makes it harder to reject the null.
• Effect of true μ: move the slider for the "Population Mean μ". Remember we are NOT changing the test
H0: μ=50 vs Ha: μ≠50
so now the null hypothesis is false and we should reject it. As μ moves further away from 50 this is what happens more often, for example if μ=52 the null is rejected 80.1% of the time.
This probability that the test rejects a false null hypothesis is called the power of the test.
• Effect of α when the null is false: set μ=52 and select α=0.01. Now the power goes down from 80.1% to just 59%. So if we make it harder to reject the null (by using a smaller α), we also make it harder to reject a false null (the power of the test is lower).
• Effect of sample size when the null is false: set μ=52 and select n=100, and we see that the power jumps from 80.1% to 98%. With a larger sample it is easier to reject a false null hypothesis.
You can also choose one-sided alternatives and see how that changes things.
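The cutoffs and power values quoted above can be reproduced with the normal distribution, assuming the test treats σ as known (a z-test):

  # rejection cutoffs for H0: mu = 50 with sigma = 5, n = 50, alpha = 0.05
  mu0 <- 50; sigma <- 5; n <- 50; alpha <- 0.05
  se   <- sigma / sqrt(n)
  crit <- mu0 + c(-1, 1) * qnorm(1 - alpha / 2) * se
  crit                                                     # about 48.6 and 51.4
  # power when the true mean is 52
  pnorm(crit[1], mean = 52, sd = se) + (1 - pnorm(crit[2], mean = 52, sd = se))   # about 0.80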
This app randomly generates a problem for either finding a confidence interval or doing a hypothesis test, either for one mean or one proportion. If just one of these is wanted you can specify it as well.
You should do the problem first yourself, and then click on Show Complete Solution. If you need some help you can get the summary statistics and the formula.
Scatterplot panel: Move the slider around to see different cases of the scatterplot of correlated variables. Include a few outliers and see how that affects the "look" of the scatterplot and the sample correlation coefficient.
Pick your own Points tab: pick points by clicking inside the graph and see how the correlation coefficient changes
Histogram tab: we can study the effect of changing ρ and/or n on the sample correlation r.
Hypothesis test tab: set ρ=0.1 and observe that we need a sample size of about 500 to have a reasonable chance of rejecting the null hypothesis of no correlation.
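A simulation sketch of this power calculation (mvrnorm is from the MASS package, which ships with R; the sample sizes are illustrative):

  library(MASS)   # for mvrnorm
  # estimated power of the test of H0: rho = 0 when the true rho is 0.1
  power.cor <- function(n, rho, alpha = 0.05) {
    Sigma <- matrix(c(1, rho, rho, 1), 2, 2)
    mean(replicate(2000, {
      xy <- mvrnorm(n, mu = c(0, 0), Sigma = Sigma)
      cor.test(xy[, 1], xy[, 2])$p.value < alpha
    }))
  }
  power.cor(100, 0.1)   # low power
  power.cor(500, 0.1)   # considerably higher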
Run the movies to change one count at a time and observe how the expected cell counts, the chi-square statistic and the p-value change.
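For reference, the same quantities can be computed with chisq.test in R; the table of counts below is made up purely for illustration:

  counts <- matrix(c(30, 20, 15, 35), nrow = 2)   # made-up 2x2 table of counts
  out <- chisq.test(counts)
  out$expected    # expected cell counts
  out$statistic   # chi-square statistic
  out$p.value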
The app generates two datasets, one from a normal distribution with mean 10 and standard deviation 3 and the other from a normal distribution with mean μ2 and standard deviation σ2. Then it carries out the test H0: μ1 = μ2 vs Ha: μ1 ≠ μ2.
Single Experiment tab: this shows the result of one experiment with the chosen parameters. The two datasets are shown in the boxplot.
Case 1: μ2=10, so the null hypothesis is true. In this case the two boxplots should sit next to each other and the p-value should be "large".
Changing σ2 changes the "size" of the boxplot of Sample 2 but not the relative positions of the two boxplots, so the null hypothesis remains true and the p-value should remain large.
When the slider is moved so that μ2<10 or μ2>10 the box to the right moves either up or down, and the p-value becomes small (so the null is rejected)
Running the movie shows different cases with the same numbers, so you can get a feeling for the variation between experiments.
Many Experiments tab: this shows the result of 1000 experiments with the chosen parameters. The histogram shows the p-values of 1000 tests.
Case 1: μ2=10, so the null hypothesis is true. In this case the percentage of rejected null hypotheses should match the chosen type I error probability α.
Case 2: μ2≠10, so the null hypothesis is false. In this case the percentage of rejected null hypotheses is the power of the test. For example, if μ2=12.0, the power is about 92%.
Two interesting questions specific to this test are:
• Does it matter whether we have equal sample sizes?
Choosing n1=10 and n2=90 shows that this does not change the true type I error probability, but if we set μ2=12.0 the power is now 51%, much lower than the 92% before, even though the total sample size is still 100. So unequal sample sizes change the power of the test.
• Does it matter whether we have equal variances?
Changing σ2 does not change the true type I error probability as long as the sample sizes are the same, but it does change it if the sample sizes are different; for example, if σ2=5.0, n1=10 and n2=90 the true type I error probability is just 0.5%, not 5%.
The problem is that we have chosen the Equal Variances=Yes option, so the test assumes them to be the same, but they are not. We can fix that by selecting Equal Variances=No, but this requires equal sample sizes.
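A minimal sketch of one such experiment in R, with illustrative parameter values; var.equal switches between the pooled and the Welch version of the test:

  x <- rnorm(50, mean = 10, sd = 3)
  y <- rnorm(50, mean = 12, sd = 3)
  t.test(x, y, var.equal = TRUE)    # pooled t-test (Equal Variances = Yes)
  t.test(x, y, var.equal = FALSE)   # Welch test (Equal Variances = No)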
Just run the movie. As the means move farther apart, the boxplots do as well. In the beginning the overall variance is close to the within-group variances, but as the means move farther apart the overall variance grows relative to the within-group variances. So by comparing the within-group variances to the overall variance we can tell whether the means are the same or not.
We can choose to have all three groups different, or two the same and the third different. An interesting fact to observe: if all groups are different and n=20 the p-value is 0.04 when μ=0.6, but if only group C is different the p-value is already 0.01, so the test considers this "more different".
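A minimal one-way ANOVA sketch in R, with group means chosen purely for illustration:

  n   <- 20
  dat <- data.frame(
    y     = c(rnorm(n, mean = 0), rnorm(n, mean = 0), rnorm(n, mean = 0.6)),
    group = factor(rep(c("A", "B", "C"), each = n))
  )
  boxplot(y ~ group, data = dat)
  summary(aov(y ~ group, data = dat))   # F-test: variation between groups vs within groups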
Choose the type of data and the number of observations, then do the calculations yourself. Finally check your answers.
On the Data Graphs panel the app shows the histogram, boxplot and normal probability plot for any distribution you want. Names have to be the R names of the distribution. Depending on the distribution, pick either one or two parameters.
On the Theoretical Distribution tab the app shows just that.
Examples: name (par1) or name (par1,par2)
binom (n,p), geom (p), pois (lambda), uniform (A,B), exp (lambda), gamma (α,β), beta (α,β) , t (df), chisq (df), F(df1,df2) etc.
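These names follow R's own distribution functions (the d/p/q/r families); for example, for a gamma(α=2, β=1) population one could check the fit by hand with:

  x <- rgamma(1000, shape = 2, rate = 1)               # random sample from gamma(2, 1)
  hist(x, breaks = 50, freq = FALSE)
  curve(dgamma(x, shape = 2, rate = 1), add = TRUE)    # theoretical density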
Enter a function (in R syntax) and the desired range, then run the movie and see how well the Riemann sum approximates the integral.
Also check the effect of the choice of intermediate points.
Enter a function (in R syntax), the desired range and the approximation method (currently Riemann sums with different choices of intermediate points, the Trapezoidal rule and Simpson's rule), then run the movie and see how well they approximate the integral.
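A minimal sketch of two of these approximations in R, using an arbitrary function and range and comparing with R's own integrate():

  f <- function(x) x^2 * exp(-x)
  a <- 0; b <- 2; k <- 100
  x <- seq(a, b, length.out = k + 1)
  h <- (b - a) / k
  sum(f(x[-(k + 1)])) * h                      # left-point Riemann sum
  sum((f(x[-1]) + f(x[-(k + 1)])) / 2) * h     # trapezoidal rule
  integrate(f, a, b)$value                     # R's adaptive quadrature, for comparison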