(due Wednesday April 11)

Confidence Intervals and the Central Limit Theorem

< What you must turn in for this assignment is the completed session window, together with the annotated graph you generate. Since by now you have considerable experience with Minitab, the directions are less detailed.>

< There are two separate, unrelated parts to this assignment: one part on confidence intervals, one on the central limit theorem.>

Minitab Assignment 4: Confidence Intervals and the Central Limit Theorem

Your Name

I. CONFIDENCE INTERVALS

Here we see the meaning of "confidence interval" in some examples.

To be specific, we will just consider 95.4% confidence intervals.

< We use 95.4% instead of, for example, 95%, because with 95% we would have the more awkward number 1.96 in place of 2 in the confidence interval formula below.>

Review.

Consider the following experiment. Choose a large integer n. Take a random sample of n measurements from some fixed population with mean mu and standard deviation sigma, and compute the sample average xbar of the numbers you get. In this case we say that the interval

( xbar - 2[sigma/(square root of n)] , xbar + 2[sigma/(square root of n)] )

is a 95.4% confidence interval for mu.

Here is the meaning of "95.4% confidence interval": if we repeat this sampling experiment many times (with the same sample size n), then in about 95.4% of those experiments, the interval we compute from the formula will contain the actual population mean mu.
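As a rough illustration of the formula above, here is a minimal Python sketch (Python is not part of the Minitab assignment). The function name confidence_interval and the numbers in the example call are made up for illustration only.

```python
import math

def confidence_interval(xbar, sigma, n):
    """Return the 95.4% interval (xbar - 2*sigma/sqrt(n), xbar + 2*sigma/sqrt(n))."""
    half_width = 2 * sigma / math.sqrt(n)
    return (xbar - half_width, xbar + half_width)

# Hypothetical values, chosen only so the arithmetic is easy to check by hand:
# sigma/sqrt(n) = 3/6 = 0.5, so the half-width is 1.
low, high = confidence_interval(xbar=10.0, sigma=3.0, n=36)
print(low, high)  # 9.0 11.0
```

Note that the half-width depends only on sigma and n, so every repetition of the experiment with the same n gives an interval of the same length, centered at that repetition's xbar.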

The purpose of this exercise is to see this in some examples.

Example 1: population distribution is Bernoulli (p=.5)

We will examine 200 sampling experiments, each with a sample of size 25, from a Bernoulli(.5) distribution. (Think of the Minitab Crew as flipping fair coins, and recording 0 or 1 depending on how the coin falls.)

First, for this distribution and sample size, the 95.4% confidence interval for mu is (xbar - .2 , xbar + .2). Type in a brief explanation of why this is the correct interval.

Now let us see how well this confidence interval works. Use

Calc> Random Data> Bernoulli

to generate 25 rows, in columns C1-C200, with probability of success .5. OK.

Now use

Stat > Basic Statistics > Display Descriptive Statistics

and for variables type in C1-C200. OK.

The statistics will appear in the session window. Following the Descriptive Statistics heading, on the same line, type (Bernoulli, p=.5); the heading will then read

Descriptive Statistics (Bernoulli, p=.5)

Erase all the rows, except for those for which the population mean (which is .5) does not fall into your confidence interval. This happens when your sample average xbar is below .3 or above .7 .

< For example, when I did this, I produced the following:

Descriptive Statistics (Bernoulli, p=.5)

Variable    N    Mean  Median  TrMean   StDev     SE
C11        25  0.7200  1.0000  0.7391  0.4583  0.091
C40        25  0.2800  0.0000  0.2609  0.4583  0.091
C46        25  0.7200  1.0000  0.7391  0.4583  0.091
C59        25  0.2800  0.0000  0.2609  0.4583  0.091
C69        25  0.2400  0.0000  0.2174  0.4359  0.087
C79        25  0.2400  0.0000  0.2174  0.4359  0.087
C84        25  0.2800  0.0000  0.2609  0.4583  0.091
C100       25  0.2800  0.0000  0.2609  0.4583  0.091
C106       25  0.2000  0.0000  0.1739  0.4082  0.081
C129       25  0.7200  1.0000  0.7391  0.4583  0.091
C155       25  0.7200  1.0000  0.7391  0.4583  0.091
C162       25  0.7200  1.0000  0.7391  0.4583  0.091 >

Count how many rows you have left; these are the columns whose 95.4% confidence interval fails to contain the population mean. Subtract that count from 200 to get the number of columns whose interval does contain it. Write these counts into the session window. Briefly comment on how this compares to what you expect.
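If you want to cross-check Example 1 outside Minitab, the experiment can be imitated with a minimal Python sketch such as the one below. It is only an illustration under stated assumptions: the seed is arbitrary, the names are hypothetical, and the counts it prints will differ from your Minitab session because the random numbers differ.

```python
import random

random.seed(0)  # arbitrary seed, used only so the run is repeatable

TRIALS = 200      # number of sampling experiments (columns C1-C200)
N = 25            # sample size (rows per column)
P = 0.5           # probability of success, which is also the population mean
HALF_WIDTH = 0.2  # half-width of the 95.4% interval stated in the text

def sample_mean():
    """Flip N fair coins, record each as 0 or 1, and return the sample average."""
    return sum(1 if random.random() < P else 0 for _ in range(N)) / N

means = [sample_mean() for _ in range(TRIALS)]

# The interval (xbar - .2, xbar + .2) misses mu = .5 exactly when
# xbar is below .3 or above .7.
misses = [m for m in means if m < P - HALF_WIDTH or m > P + HALF_WIDTH]
hits = TRIALS - len(misses)

print("intervals missing mu:   ", len(misses))
print("intervals containing mu:", hits, "of", TRIALS)
```

Because xbar can only take the values 0/25, 1/25, ..., 25/25, it never lands exactly on .3 or .7, so there is no ambiguity about the endpoints in this example.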

Example 2: Uniform distribution on [0,1]

Now we do the same exercise for the uniform distribution, changing what obviously must be changed for the different distribution.

For this, use Calc > Random Data > Uniform , etc. Leave the endpoints set at 0 and 1.

In this case the Minitab Crew, instead of flipping a coin, is picking numbers at random from the unit interval. When computing and typing in the 95.4% confidence interval, we use the fact that the standard deviation for the uniform distribution on [0,1] is approximately .29 .
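The same kind of check can be sketched in Python for the uniform case. This is an illustration only, not the assigned procedure: the helper name count_misses and the seed are hypothetical, and the sketch uses the approximate standard deviation .29 quoted above.

```python
import math
import random

def count_misses(draw, mu, sigma, n, trials):
    """Run `trials` sampling experiments of size n and count how many
    95.4% intervals (xbar +/- 2*sigma/sqrt(n)) fail to contain mu."""
    half_width = 2 * sigma / math.sqrt(n)
    misses = 0
    for _ in range(trials):
        xbar = sum(draw() for _ in range(n)) / n
        # The open interval contains mu only when |xbar - mu| < half_width.
        if abs(xbar - mu) >= half_width:
            misses += 1
    return misses

random.seed(1)  # arbitrary seed, used only so the run is repeatable

# random.random() draws from the uniform distribution on [0, 1),
# whose mean is .5 and whose standard deviation is approximately .29.
misses = count_misses(random.random, mu=0.5, sigma=0.29, n=25, trials=200)
print("intervals missing mu:", misses, "of 200")
```

Passing the sampling function in as an argument means the same helper could be reused for another population by changing only draw, mu, and sigma.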

II. CENTRAL LIMIT THEOREM

Delete the data in the data windows. Generate 800 rows of uniform [0,1] data, stored in columns C1-C5. Then use

Calc > Row statistics

choose the setting for average, set input variables C1-C5, and put the output in C6.

At this point, you have 800 repetitions of the following experiment: choose a random sample of 5 numbers from [0,1], and take the average. The 800 sample averages are stored in C6.

< The somewhat odd choice of numbers 5 and 800 is made because 6 x 800 = 4800, so our choice doesn't exceed the capacity of Student Minitab. >

Erase columns C1-C5, leaving just the column of sample averages, which is now column C1. Use

Calc > Standardize ,

input columns C1, store results in C2. Use

Graph > Histogram

and within the histogram box

-type in C2 for Graph variables

-use the annotation option to title your graph (e.g. Unif[0,1], sample average, n=5, 800 trials)

-use the annotation option to put on your name as a footnote

-choose the density (not frequency) option.

OK.

The resulting histogram is a density (probability) histogram built from 800 numbers: the standardized sample averages from 800 experiments, each with a size-5 sample.

The number of trials (800) is big enough that you should have a fairly good approximation of the probability distribution for the sample mean from a size-5 sample. For the uniform distribution, the Central Limit Theorem convergence is very rapid; even though the sample size is only 5, the distribution of the sample average already resembles the bell shaped curve. Comment on the resemblance your density histogram bears to the bell shaped curve of the standard normal distribution.
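One numerical way to judge the resemblance, sketched here in Python rather than Minitab, is to compare the fraction of standardized averages within 1 and within 2 of zero against the standard normal values of about 68.3% and 95.4%. The sketch assumes that standardizing means subtracting the column mean and dividing by the column standard deviation; the seed and variable names are arbitrary.

```python
import random
import statistics

random.seed(2)  # arbitrary seed, used only so the run is repeatable

TRIALS = 800  # number of repetitions (rows)
N = 5         # sample size (columns C1-C5)

# Each entry is the average of 5 numbers drawn uniformly from [0, 1).
averages = [sum(random.random() for _ in range(N)) / N for _ in range(TRIALS)]

# Standardize: subtract the mean, then divide by the standard deviation
# (assumed here to match what Calc > Standardize does).
mean = statistics.mean(averages)
sd = statistics.stdev(averages)
z = [(a - mean) / sd for a in averages]

within_1 = sum(1 for v in z if abs(v) < 1) / TRIALS
within_2 = sum(1 for v in z if abs(v) < 2) / TRIALS

print("fraction within 1 of zero:", within_1, "(standard normal: about 0.683)")
print("fraction within 2 of zero:", within_2, "(standard normal: about 0.954)")
```

These two fractions summarize only part of the shape, so they complement the histogram rather than replace it.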