## MINITAB Assignment 4 (due Wednesday April 11) Confidence Intervals and the Central Limit Theorem

< Text (such as this) which is enclosed in angle brackets, < >, is communication about what to do in the assignment, or commentary about it. I will also color such text green. Text which is neither bracketed nor green is text you should type or paste into the session window, perhaps with obvious changes, such as replacing Your Name with your name. This will make your Minitab work more meaningful both when you carry it out and when it is reviewed. Please try to understand the exercise.>

< What you must turn in for this assignment is the completed session window, together with the annotated graph you generate. Since by now you have considerable experience with Minitab, the directions are less detailed.>

< There are two separate, unrelated parts to this assignment: one part on confidence intervals, one on the central limit theorem.>

Minitab Assignment 4: Confidence Intervals and the Central Limit Theorem

I. CONFIDENCE INTERVALS

Here we see the meaning of "confidence interval" in some examples.
To be specific, we will just consider 95.4% confidence intervals.

< We use 95.4% instead of, for example, 95%, because with 95% we would have the more awkward number 1.96 in place of 2 in the confidence interval formula below.>

Review.
Consider the following experiment. Choose a large integer n. Take a random sample of n measurements from some fixed population with mean mu and standard deviation sigma, and compute the sample average xbar of the numbers you get. In this case we say that the interval
( xbar - 2[sigma/(square root of n)] , xbar + 2[sigma/(square root of n)] )
is a 95.4% confidence interval for mu.

Here is the meaning of "95.4% confidence interval": if we repeat this sampling experiment many times (with the same sample size n), then in about 95.4% of those experiments, the interval we compute from the formula will contain the actual population mean mu.

The purpose of this exercise is to see this in some examples.
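
Optional aside, not part of what you turn in: the interval formula from the Review can be sketched in a few lines of Python. This is only an illustration; the function names `confidence_interval_954` and `covers` and the numbers in the example (sigma = 3, n = 36, xbar = 10) are made up here and are unrelated to the examples in this assignment.

```python
import math


def confidence_interval_954(xbar, sigma, n):
    """Return (low, high) = xbar -/+ 2 * sigma / sqrt(n)."""
    half_width = 2 * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


def covers(interval, mu):
    """True if the population mean mu lies inside the interval."""
    low, high = interval
    return low <= mu <= high


# Made-up numbers, for illustration only.
interval = confidence_interval_954(xbar=10.0, sigma=3.0, n=36)
print(interval)               # (9.0, 11.0)
print(covers(interval, 9.5))  # True
```

Repeating the sampling many times and counting how often `covers` returns True is exactly the experiment described above.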

Example 1: population distribution is Bernoulli (p=.5)

We will examine 200 sampling experiments, each with a sample of size 25, from a Bernoulli(.5) distribution. (Think of the Minitab Crew as flipping fair coins, and recording 0 or 1 depending on how the coin falls.)

First, for this distribution and sample size, the 95.4% confidence interval for mu is (xbar -.2   ,  xbar + .2). Type in a brief explanation for why it is the correct interval.

Now let us see how well this confidence interval works. Use
Calc> Random Data> Bernoulli
to generate 25 rows, in columns C1-C200, with probability of success .5, OK.

Now use
Stat > Basic Statistics > Display Descriptive Statistics
and for variables type in C1-C200. OK.

The statistics will appear in the session window. Following the Descriptive Statistics heading, on the same line, type (Bernoulli, p=.5), so that the heading reads
Descriptive Statistics (Bernoulli, p=.5)
Erase every row except those for which the population mean (which is .5) falls outside your confidence interval. This happens when the sample average xbar is below .3 or above .7.

< For example, when I did this, I produced the following:

Descriptive Statistics (Bernoulli .5)

Variable    N    Mean     Median   TrMean   StDev    SE Mean
C11        25    0.7200   1.0000   0.7391   0.4583   0.091
C40        25    0.2800   0.0000   0.2609   0.4583   0.091
C46        25    0.7200   1.0000   0.7391   0.4583   0.091
C59        25    0.2800   0.0000   0.2609   0.4583   0.091
C69        25    0.2400   0.0000   0.2174   0.4359   0.087
C79        25    0.2400   0.0000   0.2174   0.4359   0.087
C84        25    0.2800   0.0000   0.2609   0.4583   0.091
C100       25    0.2800   0.0000   0.2609   0.4583   0.091
C106       25    0.2000   0.0000   0.1739   0.4082   0.081
C129       25    0.7200   1.0000   0.7391   0.4583   0.091
C155       25    0.7200   1.0000   0.7391   0.4583   0.091
C162       25    0.7200   1.0000   0.7391   0.4583   0.091     >

Count how many of the 200 columns produced a 95.4% confidence interval that does not contain the population mean (i.e., count how many rows you have left). Write this into the session window. Briefly comment on how this compares to what you expect.
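
Optional aside, not part of what you turn in: the same experiment can be imitated outside Minitab. The Python sketch below assumes the setup of Example 1 (200 columns, 25 rows, p = .5, interval xbar -/+ .2); the names `count_misses` and `exact_miss_probability` are invented for this illustration, and the random numbers will not match Minitab's.

```python
import math
import random

N_EXPERIMENTS = 200   # one experiment per column C1-C200
SAMPLE_SIZE = 25      # rows per column
P = 0.5               # probability of success
HALF_WIDTH = 0.2      # from the interval (xbar - .2, xbar + .2)


def count_misses(seed=None):
    """Count experiments whose interval fails to contain the mean P."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(N_EXPERIMENTS):
        successes = sum(1 for _ in range(SAMPLE_SIZE) if rng.random() < P)
        xbar = successes / SAMPLE_SIZE
        if abs(xbar - P) > HALF_WIDTH:
            misses += 1
    return misses


def exact_miss_probability():
    """Exact chance of a miss; dividing by 2**n is valid only for p = .5."""
    outcomes = sum(
        math.comb(SAMPLE_SIZE, k)
        for k in range(SAMPLE_SIZE + 1)
        if abs(k / SAMPLE_SIZE - P) > HALF_WIDTH
    )
    return outcomes / 2 ** SAMPLE_SIZE


if __name__ == "__main__":
    print("misses in 200 experiments:", count_misses())
    print("exact miss probability:", round(exact_miss_probability(), 4))
```

Because xbar can only take the values 0, .04, .08, ..., 1, the exact miss probability (about 4.3%, or roughly 8.7 columns out of 200) differs slightly from the nominal 4.6%. Individual runs will scatter around that figure.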

Example 2: Uniform distribution on [0,1]

Now we do the same exercise for the uniform distribution, changing what obviously must be changed for the different distribution.

For this, use Calc > Random Data > Uniform , etc. Leave the endpoints set at 0 and 1.

In this case the Minitab Crew, instead of flipping a coin, is picking numbers at random from the unit interval. When computing and typing in the 95.4% confidence interval, we use the fact that the standard deviation for the uniform distribution on [0,1] is approximately .29 .
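
Optional aside, not part of what you turn in: the value .29 is the rounded form of 1/(square root of 12), the exact standard deviation of the uniform distribution on [0,1]. The Python sketch below checks that rounding and repeats the counting experiment, assuming the same 200 experiments of size 25 as in Example 1; the name `uniform_misses` is invented for this illustration.

```python
import math
import random
import statistics

SIGMA_UNIFORM = 1 / math.sqrt(12)   # exact standard deviation of Uniform[0, 1]


def uniform_misses(n_experiments=200, sample_size=25, seed=None):
    """Count experiments whose interval fails to contain the mean .5."""
    rng = random.Random(seed)
    half_width = 2 * SIGMA_UNIFORM / math.sqrt(sample_size)
    misses = 0
    for _ in range(n_experiments):
        xbar = statistics.fmean(rng.random() for _ in range(sample_size))
        if abs(xbar - 0.5) > half_width:
            misses += 1
    return misses


print(round(SIGMA_UNIFORM, 2))  # 0.29
```

Calling `uniform_misses()` a few times gives a feel for how much the count varies from one run of 200 experiments to the next.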

II. CENTRAL LIMIT THEOREM

Delete the data in the data window. Generate 800 rows of uniform [0,1] data, stored in columns C1-C5. Then use
Calc > Row statistics
choose the setting for average, set input variables C1-C5, and put the output in C6.

At this point, you have 800 repetitions of the following experiment: choose a random sample of 5 numbers from [0,1], and take the average. The 800 sample averages are stored in C6.

< The somewhat odd choice of numbers 5 and 800 is made because the worksheet then holds 6 columns (C1-C6) of 800 rows, and 6 x 800 = 4800, so our choice doesn't exceed the capacity of Student Minitab. >

Erase columns C1-C5, leaving just the column of sample averages, which is now column C1. Use
Calc > Standardize ,
input columns C1, store results in C2. Use
Graph > Histogram
and within the histogram box
-type in C2 for Graph variables
-use the annotation option to title your graph (e.g. Unif[0,1], sample average, n=5, 800 trials)
-use the annotation option to put on your name as a footnote
-choose the density (not frequency) option.
OK.

The resulting histogram is a density (probability) histogram built from 800 numbers. The 800 numbers are the standardized sample averages from 800 experiments, each with a sample of size 5.
The number of trials (800) is big enough that you should have a fairly good approximation of the probability distribution for the sample mean from a size-5 sample. For the uniform distribution, the Central Limit Theorem convergence is very rapid; even though the sample size is only 5, the distribution of the sample average already resembles the bell-shaped curve. Comment on the resemblance your density histogram bears to the bell-shaped curve of the standard normal distribution.
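
Optional aside, not part of what you turn in: a numerical companion to the histogram. The Python sketch below imitates the steps of Part II (800 averages of 5 uniform numbers, then standardizing) and compares the fraction of standardized values within 1, 2 and 3 units of zero to the standard normal values. It assumes that standardizing means subtracting the mean and dividing by the sample standard deviation; the function names are invented for this illustration.

```python
import random
import statistics


def standardized_sample_means(n_trials=800, sample_size=5, seed=None):
    """Average sample_size uniform numbers, n_trials times, then standardize."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.random() for _ in range(sample_size))
        for _ in range(n_trials)
    ]
    center = statistics.fmean(means)
    spread = statistics.stdev(means)   # sample standard deviation (assumed)
    return [(m - center) / spread for m in means]


def fraction_within(values, k):
    """Fraction of values lying in the interval [-k, k]."""
    return sum(1 for v in values if abs(v) <= k) / len(values)


if __name__ == "__main__":
    z = standardized_sample_means()
    normal = statistics.NormalDist()   # standard normal: mean 0, sd 1
    for k in (1, 2, 3):
        expected = normal.cdf(k) - normal.cdf(-k)
        print(f"within {k}: simulated {fraction_within(z, k):.3f}, "
              f"normal {expected:.3f}")
```

Close agreement between the two columns of numbers is another way of seeing the resemblance to the bell-shaped curve; it does not replace looking at the histogram itself.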