Homework 14, Due Wednesday March 12, 2008.
===========================================

The purpose of this exercise is to verify "experimentally" the behavior of
Maximum Likelihood Estimation in a simple setting.

Generate once and for all a fixed vector of 100 (continuously distributed)
X values from your favorite distribution, and store it in a vector "xvec".
Next fix scalar "parameter" values a, b in such a way that a + b*xvec has
mean value near 0 and interquartile range roughly from -1 to 1. Then
generate 1000 batches of binary Y data values according to the "logistic
regression" model

     Y_i ~ Binom(1, plogis(a + b * X_i)) ,   i = 1,...,100

In your simulation, each batch of data (X_i, Y_i), i = 1,...,100, is to be
used to calculate MLEs a^ and b^, using either your own likelihood
maximization routine in R or the function "glm".

The exercise concludes with checking that the resulting sets of 1000 a^
values and 1000 b^ values have approximately the distribution predicted by
MLE theory. But 100 is not a large sample for this kind of problem: do not
be too surprised if there are noticeable discrepancies between the empirical
distributions based on the 1000 simulation iterations and the normal
distribution predicted by MLE theory.

Once you have your code running for this example, you might want to re-run
it with sample sizes 50 (where MLE theory will almost certainly fail) and
400 (where MLE theory should be clearly confirmed by your simulation).
YOU ARE NOT REQUIRED TO RE-RUN THE CODE AT THESE ALTERNATIVE SAMPLE SIZES,
BUT YOU MAY FIND IT USEFUL IN DECIDING HOW TO INTERPRET YOUR RESULTS FOR
SAMPLE SIZE 100.
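
For concreteness, here is a minimal sketch of one way the whole simulation
could be organized in R. The standard-normal choice for xvec and the
illustrative parameter values a = 0, b = 1.5 are assumptions, not part of
the assignment; any choices satisfying the mean/IQR requirement are equally
acceptable. The sketch uses "glm" for the likelihood maximization and
compares the 1000 fitted coefficients with the normal approximation whose
covariance is the inverse Fisher information evaluated at the true (a, b).

    set.seed(1)                      # optional, for reproducibility

    n <- 100                         # sample size per batch
    B <- 1000                        # number of simulated batches

    ## Fixed X values, generated once and for all (illustrative choice)
    xvec <- rnorm(n)

    ## Illustrative parameters: the IQR of a standard normal is about 1.35,
    ## so b near 2/1.35 makes the IQR of a + b*xvec roughly (-1, 1)
    a <- 0
    b <- 1.5
    summary(a + b * xvec)            # check: mean near 0, IQR roughly (-1, 1)

    ## Simulate B batches of Y and compute the MLEs via glm
    coef.mat <- matrix(NA, B, 2, dimnames = list(NULL, c("ahat", "bhat")))
    for (k in 1:B) {
      yvec <- rbinom(n, 1, plogis(a + b * xvec))
      fit  <- glm(yvec ~ xvec, family = binomial)
      coef.mat[k, ] <- coef(fit)
    }

    ## Large-sample covariance predicted by MLE theory: inverse Fisher
    ## information I = X' W X with W = diag(p_i * (1 - p_i))
    p    <- plogis(a + b * xvec)
    X    <- cbind(1, xvec)
    Info <- t(X) %*% (p * (1 - p) * X)
    V    <- solve(Info)

    ## Numerical comparison of simulation with theory
    colMeans(coef.mat)               # should be near (a, b)
    apply(coef.mat, 2, sd)           # compare with sqrt(diag(V))
    sqrt(diag(V))
    cov(coef.mat)                    # compare with V

    ## Graphical check of approximate normality
    par(mfrow = c(1, 2))
    hist(coef.mat[, "ahat"], prob = TRUE, main = "ahat", xlab = "")
    curve(dnorm(x, a, sqrt(V[1, 1])), add = TRUE)
    hist(coef.mat[, "bhat"], prob = TRUE, main = "bhat", xlab = "")
    curve(dnorm(x, b, sqrt(V[2, 2])), add = TRUE)

Changing n from 100 to 50 or 400 re-runs the same experiment at the
alternative sample sizes mentioned above.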