(due Friday March 17)

Probability Histograms from Sample Data

< What you must turn in for this assignment is the completed session window, together with the annotated graphs you generate. >

Minitab Assignment 3

Your Name

In this assignment we generate probability histograms for random data to better understand the meaning of a probability distribution.

Part I: Probability Histogram

From a list of numbers, Minitab can create a probability histogram (density histogram). To do this, Minitab chooses appropriate bin widths, counts the fraction of the numbers falling in each bin, and draws over each bin a rectangle whose area equals that fraction.
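The rule "area equals fraction" can be checked by hand outside Minitab. The following is a minimal sketch in Python (our choice for illustration; it is not part of Minitab). The function name density_histogram, the sample data, and the choice of four equal bins on [0,1] are assumptions made for this example, and Minitab's own bin choices may differ.

```python
def density_histogram(values, low, high, bins):
    """Return (bars, width), where bars is a list of (left_edge, height).

    Assumes every value lies in [low, high]. Each rectangle has
    area = height * width = fraction of the values in that bin.
    """
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        # Clamp the index so that v == high falls in the last bin.
        i = min(int((v - low) / width), bins - 1)
        counts[i] += 1
    n = len(values)
    bars = [(low + i * width, counts[i] / (n * width)) for i in range(bins)]
    return bars, width


data = [0.1, 0.2, 0.25, 0.6, 0.9]  # made-up numbers for illustration
bars, width = density_histogram(data, 0.0, 1.0, 4)
for left, height in bars:
    area = height * width
    print(f"[{left:.2f}, {left + width:.2f}]  height {height:.2f}  area {area:.2f}")
```

Note that the heights themselves can exceed 1; it is the areas that are fractions and that add up to 1.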

To see an example of the probability histogram, we first have Minitab (as in Assignment 2) generate a sequence of 10 numbers from the Bernoulli(.5) distribution. (We can think of Minitab as having a crew that flips fair coins, recording 1 when the coin falls heads and 0 when the coin falls tails.) For this, we use

Calc > Random Data > Bernoulli ;

type in 10 for the number of rows;

and choose C1 for the column in which to save the data.

< Then you must also click something like "OK" to get Minitab to execute. I will omit this type of obvious command. >
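For readers who want to see what such a column of coin flips looks like without Minitab, here is a small Python sketch of the same idea. The helper name bernoulli_sample and the seed value are assumptions made for this illustration; this is not how Minitab generates its numbers internally.

```python
import random

rng = random.Random(3)  # fixed, arbitrary seed so the run is repeatable


def bernoulli_sample(n, p, rng):
    """Return n values, each 1 with probability p and 0 otherwise."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


c1 = bernoulli_sample(10, 0.5, rng)  # plays the role of column C1
print(c1)
print("fraction of 1s:", sum(c1) / len(c1))
```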

We generate the probability (density) histogram with the following commands:

we follow Graph > Histogram ;

from options we choose "density";

we click footnote and type in My Name;

we click title and type in Bernoulli(.5) Distribution, 10 trials.

< Now use the printer icon on the top bar menu to print out your histogram. Examine the histogram and understand its relation to the data in C1. For your later graphs, at home it is probably best to issue a print command after each graph is created; in the OWL you might want to print them in larger groups. >

Next we do the same thing as above, but using C2 with 50 rows and C3 with 5000 rows, and changing the titles of the graphs appropriately.

< Execute your commands. >

Comment on the histograms:

< Here type in your comment. You are likely to see the density histograms better reflect the Bernoulli(.5) distribution as the number of trials (rows) increases. >
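The effect of the number of trials can also be explored numerically. The sketch below (Python, with an arbitrary seed; both are assumptions for illustration) repeats the coin-flipping experiment for 10, 50 and 5000 rows. It assumes unit-width bins centered at 0 and 1, so that each bar height equals the fraction of 0s or 1s; Minitab's bins may be chosen differently.

```python
import random

rng = random.Random(2024)  # arbitrary seed for repeatability

fraction_of_ones = {}
for n in (10, 50, 5000):
    flips = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
    frac = sum(flips) / n
    fraction_of_ones[n] = frac
    # With unit-width bins, bar height equals the fraction in the bin.
    print(f"n={n:5d}  height at 0: {1 - frac:.3f}  height at 1: {frac:.3f}")
```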

< Now you are done with the data in C1-C3. It is probably safest to simply delete it: highlight the column headings C1-C3 in the data window (this will highlight the columns), then hit the delete key. Likewise, later you will have to delete or overwrite data to avoid exceeding Student Minitab's 5000 cell limit; I won't mention this again. >

Part II: Uniform [0,1]

We will look at examples in which we see the probability histograms of sample data from continuous distributions looking more like the graphs of the corresponding density functions as the sample size increases.

First, in this Part II, we look at the uniform distribution on the unit interval. The density function for this distribution has the following definition:

< Type in a definition for this density function. >

Minitab again can generate sample numbers representative of this distribution. (We can think of the Minitab Crew as repeatedly picking numbers from [0,1] and writing them down. The picks are completely random, no more likely to come from one location than another.)

We put into columns C1, C2, C3 sample data from the uniform distribution. We use 50, 500 and 4400 rows respectively for these columns. We do this just as we did in Part I, but choosing Uniform instead of Bernoulli.

< Execute your commands. >

Next we generate density histograms for these data.

< Execute your commands. Of course the title of the graphs should change. For example, let the first graph be

Uniform Distribution on [0,1], 50 Trials >

Comment on these density histograms:

< Type in your comments. It is likely that the tops of the rectangles will become more nearly level as the number of trials increases -- that is, the discrete distribution described by the probability histogram will be better approximated by the uniform distribution on the unit interval [0,1].

There will probably be two exceptional rectangles, the leftmost and rightmost, for many trials. Explain in your comments why these are only half as high as the others. >
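The leveling of the rectangle tops can be measured rather than judged by eye. The Python sketch below is an illustration under stated assumptions: it uses ten equal bins whose edges line up exactly with 0 and 1, which need not match the bins Minitab picks, and the helper name bar_heights and the seed are ours.

```python
import random

rng = random.Random(7)  # arbitrary seed for repeatability


def bar_heights(values, bins=10):
    """Density-histogram heights for values in [0, 1], equal-width bins."""
    width = 1.0 / bins
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    return [c / (len(values) * width) for c in counts]


spread = {}
for n in (50, 500, 4400):
    heights = bar_heights([rng.random() for _ in range(n)])
    spread[n] = max(heights) - min(heights)
    print(f"n={n:4d}  lowest bar {min(heights):.2f}  highest bar {max(heights):.2f}")
```

The gap between the lowest and highest bar is one simple way to quantify how flat the histogram is for each sample size.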

Part III: The Standard Normal Distribution

The standard normal distribution is the normal distribution with mean 0 and standard deviation 1; its density function is the standard bell-shaped curve. In this exercise, we do for the normal distribution what we just did with the uniform distribution (except we don't type in the definition of the density function for the normal distribution). The procedure is the same (even the numbers of trials for columns C1, C2, C3), with "normal" replacing "uniform".

< Execute your commands and give your comments. If Minitab is doing its job, then the probability histograms from the sample data should look pretty close to the region under a bell-shaped curve when the sample size is large. >

< Of course the titles of the graphs should change. For example, let the first graph be titled Normal Distribution, 50 Trials >
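To see how close "pretty close" is, the histogram heights can be set beside the values of the bell-shaped curve. The Python sketch below is an illustration only: the bin width 0.5, the range [-3, 3), the seed, and the helper name normal_pdf are our assumptions, and Minitab will choose its own bins.

```python
import math
import random

rng = random.Random(11)  # arbitrary seed for repeatability


def normal_pdf(x):
    """Density of the standard normal distribution at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


n = 4400
sample = [rng.gauss(0.0, 1.0) for _ in range(n)]

width = 0.5
rows = []  # (bin midpoint, histogram height) for bins covering [-3, 3)
for i in range(12):
    left = -3.0 + width * i
    count = sum(1 for x in sample if left <= x < left + width)
    # Dividing by n (all rows) keeps the heights on the density scale,
    # even though the few values outside [-3, 3) are not tabulated.
    rows.append((left + width / 2, count / (n * width)))

for mid, height in rows:
    print(f"midpoint {mid:5.2f}  histogram {height:.3f}  curve {normal_pdf(mid):.3f}")
```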

Part IV: The Binomial Distribution

For the binomial(n,p) distribution with n=10 and p=.5, we generate sample numbers in columns C1,C2,C3 respectively with 50, 500 and 4000 rows; and we generate probability histograms for these three data sets.

Execute your commands. This goes essentially as before. You follow

Calc > Random Data > Binomial

and then you can see how to set the parameters corresponding to n=10 and p=.5, and so on. Let your first graph title be Binomial (n=10,p=.5) Distribution, 50 trials.

Finally, repeat the exercise above for binomial(n,p) with n=10 and p=.1. Include in your comment a comparison to the previous binomial case, if you notice a difference.
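As a cross-check on the two binomial cases, the sample fractions can be tabulated beside the exact binomial probabilities. The Python sketch below is an illustration under our own assumptions: each binomial number is simulated as the count of successes in 10 Bernoulli trials, and the helper names binomial_draw and binomial_pmf and the seed are ours. It is not Minitab's method.

```python
import math
import random
from collections import Counter

rng = random.Random(5)  # arbitrary seed for repeatability


def binomial_draw(n, p, rng):
    """Count successes in n independent trials with success probability p."""
    return sum(1 for _ in range(n) if rng.random() < p)


def binomial_pmf(k, n, p):
    """Exact probability of k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


rows = 4000
for p in (0.5, 0.1):
    counts = Counter(binomial_draw(10, p, rng) for _ in range(rows))
    print(f"binomial(n=10, p={p}), {rows} rows")
    for k in range(11):
        exact = binomial_pmf(k, 10, p)
        print(f"  k={k:2d}  sample fraction {counts[k] / rows:.3f}  exact {exact:.3f}")
```

Because the values are whole numbers, a density histogram with unit-width bins has bar heights equal to these sample fractions.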

Go back over your session window to erase errors and other unneeded output, print it out, staple it to your correctly ordered and titled graphs, and hand this in.