(due Friday, March 17)

How sample averages approximate the mean

< In your Minitab reports:

1. You can delete text in your session window to get rid of mistakes and excess output. Please do this.

2. Please staple the report you hand in. >

< What you must turn in for this assignment is the completed session window, together with the annotated graphs you generate. (The work below was done at home with the Student Minitab 12, which came with the text.) >

Part I

The mean of a distribution is the number M such that the average of a random sample from that distribution tends to be close to M. As the sample size gets bigger, the sample average tends to be closer and closer to M.

In this exercise we use the Minitab random data generator to see examples of this. The generator produces samples of numbers which mimic the properties of a specified probability rule (distribution).

For now we will use only the Bernoulli(p) distribution. This distribution mimics a coin which comes up heads ("1") with probability p and tails ("0") with probability 1-p. The mean of the distribution is p.

< Get into Minitab, and type into the session window

Your name

Assignment #2: how sample averages approximate the mean >

We generate 25 samples, each of size 20, of Bernoulli(p) random data with p=.5, and compute the sample means, as below. (We can think of this as follows: Minitab takes a crew of 25 people, each crew member flips a fair coin 20 times, and Minitab writes down each member's results in a separate column.)

We apply Calc > Random Data > Bernoulli with p=.5, and generate 20 rows of data, stored in columns C1-C25.

We apply

Stat > Basic Statistics > Display Descriptive Statistics
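For readers who want to try the same experiment outside Minitab, the two menu steps above can be imitated in a few lines of Python. This is only a rough sketch, not a description of what Minitab does internally: the helper name bernoulli_sample, the fixed seed, and the use of Python's random module are all assumptions made for illustration.

```python
import random


def bernoulli_sample(n, p, rng):
    """Return n simulated coin flips: 1 with probability p, otherwise 0.

    Hypothetical helper, named here only for illustration.
    """
    return [1 if rng.random() < p else 0 for _ in range(n)]


# Fixed seed (an arbitrary choice) so that reruns give the same numbers.
rng = random.Random(2)

# 25 columns of 20 flips each, standing in for C1-C25.
columns = [bernoulli_sample(20, 0.5, rng) for _ in range(25)]

# One sample average per column, as in the "Mean" entries of the session window.
sample_means = [sum(col) / len(col) for col in columns]

for i, m in enumerate(sample_means, start=1):
    print(f"C{i}: mean = {m:.2f}")
```

As with Minitab, changing the seed changes the numbers, so your averages will not match anyone else's.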

< Now, to reduce clutter, erase the rows in your session window which don't contain the mean data (the point of this exercise is to look at those means, that is, those sample averages). Notice how the sample averages you see tend to be around .5, but there is considerable variation. Click here to see what resulted in my own session window; because of the randomness, your numbers will be different. >

< Next, in the data window, use the mouse to highlight the column titles C1,C2,...,C25 and then hit the backspace key. This should erase the data from those columns. (This Student Minitab will only handle 5000 data entries in the data window; you are erasing to clear the way for later operations.) >

< Now we repeat the procedure, but with 100 rows instead of 20 rows (that is, each of the 25 crew members flips 100 times, not just 20 times). You can click here to see what resulted in my own session window. >

Notice that the means we see, from this set of 25 samples, show less variation away from .5 than did the previous collection. This is because the sample size is bigger.

Finally, we erase the data in the data window columns, and then put Bernoulli(.5) sample data for sample size 1000 in the columns C1-C5, and list out their sample averages as before. We do this three times, which amounts to looking at 15 samples (we can't do them all at once because of the 5000-entry data limit).

Looking at these 15 samples, you should see the spread of the sample averages is considerably less than it was for the smaller sample sizes.

< You can click here to see what resulted in my own session window. >

To summarize: for a given probability distribution, the sample average approximates the mean. With a bigger sample the approximation is likely to be better, and with a very big sample the approximation is likely to be very good.
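The whole of Part I can also be imitated in one short Python simulation, as a way to check the summary without Minitab. This is a sketch under stated assumptions: the function name sample_average, the seed, and the choice to measure spread as the largest average minus the smallest are illustrative choices, not part of the assignment. The sample counts (25, 25 and 15) follow the steps above.

```python
import random


def sample_average(n, p, rng):
    """Average of n simulated Bernoulli(p) flips (hypothetical helper)."""
    heads = sum(1 for _ in range(n) if rng.random() < p)
    return heads / n


# Fixed seed (arbitrary) so that reruns give the same numbers.
rng = random.Random(17)

# (sample size, number of samples), matching the three stages of Part I.
plan = [(20, 25), (100, 25), (1000, 15)]

spreads = {}
for n, count in plan:
    averages = [sample_average(n, 0.5, rng) for _ in range(count)]
    # Spread is taken here as largest average minus smallest average.
    spreads[n] = max(averages) - min(averages)
    print(f"size {n:4d}: smallest = {min(averages):.3f}, "
          f"largest = {max(averages):.3f}, spread = {spreads[n]:.3f}")
```

The printed spread should shrink as the sample size grows, which is the pattern described in the summary.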

Part II

Erase all the column data in the data worksheet.

In the first three rows of C1, type three numbers of your choice. Then:

Calc > Calculator

Store result in: C2

Expression: 3 * C1 + 40

Type into your session window a verbal description of what happened.

Then do

Calc > Column Statistics

Pick Mean and input variable C1.

Then do

Calc > Column Statistics

Pick Mean and input variable C2.

< In my case, I entered the three numbers 3, 4, 5 in C1 and got the following:

Column Mean

Mean of C1 = 4.0000

Column Mean

Mean of C2 = 52.000 >
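The same arithmetic can be reproduced outside Minitab. The sketch below uses plain Python lists to stand in for the worksheet columns; the variable names c1, c2, mean_c1 and mean_c2 are assumptions chosen to mirror the Minitab column names.

```python
# Column C1 as entered in the example above.
c1 = [3, 4, 5]

# Same expression as in Calc > Calculator: 3 * C1 + 40, applied row by row.
c2 = [3 * x + 40 for x in c1]

# Column means, as reported by Calc > Column Statistics.
mean_c1 = sum(c1) / len(c1)
mean_c2 = sum(c2) / len(c2)

print("C2 =", c2)                  # C2 = [49, 52, 55]
print("Mean of C1 =", mean_c1)     # Mean of C1 = 4.0
print("Mean of C2 =", mean_c2)     # Mean of C2 = 52.0
```

Replacing the entries of c1 with your own three numbers lets you check your Minitab output.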

Now, type into your session window an explanation of how you could have predicted the mean of column 2 just from knowing the mean of column 1 and the pattern defining column 2.

Print out your session window, to be handed in. (Be sure your own name is part of the initial page output, and be sure you have cleaned out the stuff you don't need.)