Class Log on Misspecified Model MLE
===================================
10/16/17

Example: X_i ~ dlogis(x, 1, sqrt(3)/pi); estimate the parameters by Xbar and
S^2*(n-1)/n, i.e. by the Normal-distribution MLEs.

> Xdat = rlogis(500, 1, sqrt(3)/pi)     ### true mu = 1, sigsq = 1
> parm = c(mean(Xdat), var(Xdat)*(499/500))
> parm
[1] 1.0792431 0.9751762

## The estimators are consistent, but the `empirical' Fisher information
## matrix is obtained numerically from the Hessian of the optimization:

> misfit = optim(parm, function(thet,X) sum((X-thet[1])^2)/(2*thet[2]) +
        (length(X)/2)*log(thet[2]), X=Xdat,
        control=list(reltol=1e-10), hessian=T)
> misfit
$par
[1] 1.0792427 0.9751692

$value
[1] 243.7157

$counts
function gradient 
      59       NA 

$convergence
[1] 0

$message
NULL

$hessian
             [,1]         [,2]
[1,] 5.127315e+02 2.312959e-04
[2,] 2.312959e-04 2.628991e+02

### CONCENTRATE ON THIS `EMPIRICAL' INFORMATION MATRIX

> Amat = misfit$hess/500
> Amat
             [,1]         [,2]
[1,] 1.025463e+00 4.625917e-07
[2,] 4.625917e-07 5.257982e-01

> gNvec = cbind((Xdat-parm[1])/parm[2],
                (-1/parm[2] + (Xdat-parm[1])^2/parm[2]^2)/2)
> Bmat = 0.002*(t(gNvec) %*% gNvec)     ## 0.002 = 1/500
> Bmat
            [,1]        [,2]
[1,]  1.00000000 -0.08165924
[2,] -0.08165924  0.82493053

### NB: the exact expectations are given by

> integrate(function(x) (x-1)^2 * dlogis(x,1,sqrt(3)/pi), -Inf, Inf)
1 with absolute error < 6.9e-07
> integrate(function(x) (x-1)*((x-1)^2-1) * dlogis(x,1,sqrt(3)/pi), -Inf, Inf)
-3.082756e-12 with absolute error < 2e-07
> integrate(function(x) ((x-1)^2-1)^2 * dlogis(x,1,sqrt(3)/pi)/4, -Inf, Inf)
0.8 with absolute error < 1.5e-05

### NOW THINK OF HOW WE WOULD REPORT LARGE-SAMPLE VARIANCES OF ML ESTIMATORS.
### IF we thought the data came from N(mu, sigsq), we would report solve(Amat):

> solve(Amat)
              [,1]          [,2]
[1,]  9.751692e-01 -8.579437e-07
[2,] -8.579437e-07  1.901870e+00

### But the actual asymptotic variance (times 500) is Var(X_1) = 1 for Xbar,
### and for S^2 it is approximately

> integrate( function(x) ((x-1)^4-1)*dlogis(x,1,sqrt(3)/pi), -Inf, Inf)
3.2 with absolute error < 5.4e-05

## CONTRAST:

> integrate( function(x) ((x-1)^4-1)*dnorm(x,1,1),
        -Inf, Inf)
2 with absolute error < 5.6e-05

#### In this setting, the "robustified" variance (times 500) is

> solve(Amat) %*% Bmat %*% solve(Amat)
           [,1]       [,2]
[1,]  0.9509551 -0.1514511
[2,] -0.1514511  2.9838656

### MAIN POINT: THE LOWER-RIGHT ELEMENT IS MUCH CLOSER TO 3.2 !!!

### FURTHER STEPS WE COULD DO TO AMPLIFY THE EXERCISE:

(1) Simulate many batches of n logistic variates, estimate the parameters as
    though the data were normal, and verify that the actual asymptotic
    (large-n) variances are well predicted by the robustified "sandwich"
    formula.

(2) Re-do these steps in other examples of a true density f versus a working
    density g, to see the variety of examples where ML estimators from the
    wrong model can still be very bad, but with variances WELL estimated by
    the robustified formula.

HW11, due Monday 10/23, will do steps (1) and (2) in another example.
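### The sandwich calculation above can be packaged in one function. This is my
### own wrapper, not part of the class code: it uses the closed-form expected
### information for the N(mu, sigsq) working model in place of the numerical
### Hessian from optim (the two agree here up to numerical error).

```r
sandwich.norm = function(X) {
  n = length(X)
  mu = mean(X); sigsq = var(X)*(n-1)/n          # normal-model MLEs
  ## expected per-observation information under N(mu, sigsq):
  Ahat = diag(c(1/sigsq, 1/(2*sigsq^2)))
  ## per-observation score vectors, as in gNvec above:
  g = cbind((X-mu)/sigsq, (-1/sigsq + (X-mu)^2/sigsq^2)/2)
  Bhat = crossprod(g)/n
  list(naive  = solve(Ahat)/n,                      # inverse-information variance
       robust = solve(Ahat) %*% Bhat %*% solve(Ahat)/n)   # sandwich variance
}
```

Applied to Xdat, the [2,2] entry of $robust times 500 should reproduce the
2.98 seen above, versus 1.90 for $naive.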
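### A sketch of step (1) for this same logistic example (the batch count
### B = 1000 and the seed are my choices, not fixed by the exercise):

```r
set.seed(2017)
B = 1000; n = 500
## normal-model MLEs (Xbar, S^2*(n-1)/n) over B batches of logistic data:
est = t(replicate(B, {
  X = rlogis(n, 1, sqrt(3)/pi)
  c(mean(X), var(X)*(n-1)/n)
}))
n * apply(est, 2, var)    # empirical large-sample variances, times n
```

The first entry should come out near 1 (naive and sandwich formulas agree for
Xbar), while the second should be near the sandwich value 3.2, NOT the naive
inverse-information value 1.9.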