Homework 14, Due Wednesday March 12, 2008.
===========================================

The purpose of this exercise is to verify "experimentally" the behavior of
Maximum Likelihood Estimation in a simple setting.

Generate once and for all a fixed vector of 100 (continuously distributed)
X values from your favorite distribution, and store it in a vector "xvec".
Next fix scalar "parameter" values a, b in such a way that a + b*xvec has
mean value near 0 and interquartile range roughly from -1 to 1. Then
generate 1000 batches of binary Y data values according to the "logistic
regression" model

     Y_i ~ Binom(1, plogis(a + b * X_i)) ,   i = 1,...,100

In your simulation, each batch of data (X_i, Y_i), i = 1,...,100, is to be
used to calculate MLEs a^ and b^, using either your own likelihood
maximization routine in R or the function "glm".

The exercise concludes with checking that the resulting sets of 1000 a^
values and 1000 b^ values have approximately the distribution predicted by
MLE theory. But 100 is not a large sample for this kind of problem: do not
be too surprised if there are noticeable discrepancies between the empirical
distributions based on the 1000 simulation iterations and the normal
distribution predicted by MLE theory.

Once you have your code running for this example, you might want to re-run
it with sample sizes 50 (where MLE theory will almost certainly fail) and
400 (where MLE theory should be clearly confirmed by your simulation).
YOU ARE NOT REQUIRED TO RE-RUN THE CODE AT THESE ALTERNATIVE SAMPLE SIZES,
BUT YOU MAY FIND IT USEFUL IN DECIDING HOW TO INTERPRET YOUR RESULTS FOR
SAMPLE SIZE 100.
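
For concreteness, here is a minimal sketch of one way the whole simulation
could be organized in R. The standard-normal choice for xvec and the
illustrative parameter values a = 0, b = 1.5 are assumptions, not part of
the assignment; any choices satisfying the mean/IQR requirement are equally
acceptable. The sketch uses "glm" for the likelihood maximization and
compares the 1000 fitted coefficients with the normal approximation whose
covariance is the inverse Fisher information evaluated at the true (a, b).

    set.seed(1)                      # optional, for reproducibility

    n <- 100                         # sample size per batch
    B <- 1000                        # number of simulated batches

    ## Fixed X values, generated once and for all (illustrative choice)
    xvec <- rnorm(n)

    ## Illustrative parameters: the IQR of a standard normal is about 1.35,
    ## so b near 2/1.35 makes the IQR of a + b*xvec roughly (-1, 1)
    a <- 0
    b <- 1.5
    summary(a + b * xvec)            # check: mean near 0, IQR roughly (-1, 1)

    ## Simulate B batches of Y and compute the MLEs via glm
    coef.mat <- matrix(NA, B, 2, dimnames = list(NULL, c("ahat", "bhat")))
    for (k in 1:B) {
      yvec <- rbinom(n, 1, plogis(a + b * xvec))
      fit  <- glm(yvec ~ xvec, family = binomial)
      coef.mat[k, ] <- coef(fit)
    }

    ## Large-sample covariance predicted by MLE theory: inverse Fisher
    ## information I = X' W X with W = diag(p_i * (1 - p_i))
    p    <- plogis(a + b * xvec)
    X    <- cbind(1, xvec)
    Info <- t(X) %*% (p * (1 - p) * X)
    V    <- solve(Info)

    ## Numerical comparison of simulation with theory
    colMeans(coef.mat)               # should be near (a, b)
    apply(coef.mat, 2, sd)           # compare with sqrt(diag(V))
    sqrt(diag(V))
    cov(coef.mat)                    # compare with V

    ## Graphical check of approximate normality
    par(mfrow = c(1, 2))
    hist(coef.mat[, "ahat"], prob = TRUE, main = "ahat", xlab = "")
    curve(dnorm(x, a, sqrt(V[1, 1])), add = TRUE)
    hist(coef.mat[, "bhat"], prob = TRUE, main = "bhat", xlab = "")
    curve(dnorm(x, b, sqrt(V[2, 2])), add = TRUE)

Changing n from 100 to 50 or 400 re-runs the same experiment at the
alternative sample sizes mentioned above.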