2009 LOG TO ILLUSTRATE IMPORTANCE SAMPLING ========================================== 9/30/09 > 1-pgamma(20,10) [1] 0.004995412 ### This is the probability for a sum of 10 iid Expon(1) ### r.v.'s to exceed 20, which we will simulate. > unix.time({ tmp = array(rexp(2e6), dim=c(2e5,10)) rsum = c(tmp %*% rep(1,10)) pfrac = mean(rsum > 20) cat("pfrac = ",pfrac,"\n") } ) pfrac = 0.005115 user system elapsed 0.590 0.027 0.633 ### 95% CI width = 1.96*sqrt(pfrac*(1-pfrac)/2e5) > 1.96*sqrt(pfrac*(1-pfrac)/2e5) [1] 0.0003126442 ### NOTE: count of positive values 2e5*pfrac = 1023 ### NEXT COMES THE IMPORTANCE-SAMPLING VERSION ... > unix.time({ tmp = array(rexp(2e6)*2, dim=c(2e5,10)) rsum = c(tmp %*% rep(1,10)) cond = (rsum > 20) count = sum(cond) phat = mean( (2^(10)*exp(-0.5*rsum)*cond)) cat("phat = ",mean(phat),"\n","count = ",count, "\n") } ) phat = 0.004988447 count = 91206 ### this is out of 2e5 trials user system elapsed 0.635 0.058 0.697 ### Now the variance to be compared with .0051 is: > mean(2^(20)*exp(-rsum)*cond)-pfrac^2 [1] 0.0001004916 ### and the 95% CI width is: > 1.96*sqrt(0.0001004916/2e5) [1] 4.393453e-05 ### This is a 10-fold improvement in precision #### for very little extra computation time ### Next calculate the theoretical variance of the variates being averaged in the importance-sampling simulation > 2^20*integrate(function(z) exp(-z)* dgamma(z,10,scale=2), 20,Inf)$val - pfrac^2 [1] 0.0001003014 ######## Compare with empirical estimate 0.0001004916 above. NOW TRY A LITTLE MORE EXPERIMENTATION: CAN WE IMPROVE FURTHER BY MAKING REJECTION STILL MORE PROBABLE ? > unix.time({ tmp = array(rexp(2e6)*3, dim=c(2e5,10)) rsum = c(tmp %*% rep(1,10)) cond = (rsum > 20) count = sum(cond) phat = mean( (3^(10)*exp(-0.66666667*rsum)*cond)) cat("phat = ",mean(phat),"\n","count = ",count, "\n") } ) phat = 0.004946334 count = 172429 ### now most replications give non-0 user system elapsed 0.715 0.083 1.497 ## New variance: > 3^20*integrate(function(z) exp(-(4/3)*z)* dgamma(z,10,scale=3), 20,Inf)$val - pfrac^2 [1] 0.0001993289 ### still good, but not better than the value ### found with the previous (Expon(1/2)) importance-sampling sampling