==============================================================
 DATA ANALYSIS LOG IN R USING LM, WITH POSTERIOR AND
 "PARAMETRIC-BOOTSTRAP" SAMPLING                       11/2/09
==============================================================

### Use the example from Exercise 14, with the bass data fitted
### according to the selected model (with no omitted data points):

##      logAvM  ~  Alk + Chlor

> auxfit
Call:
lm(formula = logAvM ~ Alk + Chlor, data = bass)

Coefficients:
(Intercept)          Alk        Chlor
   -0.20927     -0.01073     -0.01291

### Start by generating a new dataset sampled according to the
### SAME MODEL, WITH THE SAME MODEL MATRIX, using the fitted parameters.

> newY = auxfit$fit + rnorm(53, sd=summary(auxfit)$sig)
  plot(bass$Alk, newY, pch=5, xlab="Alk", ylab="logAvM",
       main="True and Pseudo Data")
  points(bass$Alk, bass$logAvM, pch=18)

### Can do this again, for comparison:

  newY2 = auxfit$fit + rnorm(53, sd=summary(auxfit)$sig)
  plot(bass$Alk, newY2, pch=5, xlab="Alk", ylab="logAvM",
       main="True and Pseudo Data")
  points(bass$Alk, bass$logAvM, pch=18)

### Not too bad !?  You should try these pictures for yourself:
### the black (filled) diamonds are the observed data points, and the
### hollow ones are the parametric-bootstrap simulated ones.
### They look rather similar in overall pattern.

#---------------------------------------------------------
### Another task is to do the whole thing Bayesianly, starting from a
### prior density for the parameters.  Note from the conjugate-prior
### handout that the 3 coefficients are given a normal prior, and
### 1/sigma^2 a Gamma prior.  The posterior density is then
### calculated, and if desired a "posterior-predictive" sample is
### generated.

> summary(bass$logAvM)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
 -3.2190  -1.3090  -0.7340  -0.9104  -0.2614   0.2852

> rho = rgamma(1, 1/2, 1/2)   ### one illustrative draw of rho from its prior

### beta coefficients are given a normal prior with mean
###    mu0 = c(-1,0,0) and variance matrix Sig0 = diag(c(4,1,1));
### a = lambda = 1/2 in the Gamma prior for rho = 1/sigma^2
###    (matching the rgamma call and the PostDens expression below).

## The bass data are summarized in the sufficient statistics:

> XtX   = solve(summary(auxfit)$cov.unsc)
  sumY2 = sum(bass$logAvM^2)
  sumXY = c(t(model.matrix(auxfit)) %*% bass$logAvM)
  Tv    = sumXY + c(-.25,0,0)      ## = sumXY + solve(Sig0) %*% mu0
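### (Added sketch, not part of the original session:)  as a quick
### sanity check, these sufficient statistics can be recomputed
### directly from the model matrix; expected outputs are indicated in
### comments rather than shown.

> X = model.matrix(auxfit)                    ## 53 x 3 design matrix
> range(XtX - t(X) %*% X)                     ## should be ~ c(0,0), up to rounding
> all.equal(sumXY, c(t(X) %*% bass$logAvM))   ## should be TRUE
> all.equal(Tv, sumXY + c(solve(diag(c(4,1,1))) %*% c(-1,0,0)))
                              ## should be TRUE:  Tv = X'Y + Sig0^{-1} mu0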
### Continue by creating a function giving the posterior density of
### rho = 1/sigma^2 based on this "bass" model fit "auxfit".
### See the BayesConjug.pdf handout, p. 3 middle, first two lines,
### for the expression in rho and the sufficient statistics which
### is proportional to the posterior conditional density of rho.

> PostDens = function(rho) {
     ### unnormalized posterior density of rho given the data in auxfit
     outd = numeric(length(rho))
     for(i in 1:length(rho)) {
        M = XtX/rho[i] + diag(c(.25,1,1))
        outd[i] = rho[i]^((53+1)/2-1) * exp(-(rho[i]/2)*(1+sumY2) +
                     0.5*c(t(Tv) %*% solve(M,Tv))) / sqrt(det(M))
     }
     outd
  }

### However, we still need to find the normalizing constant for this
### posterior rho-density via numerical integration:

> integrate(PostDens, 0, Inf)$val
[1] 2.813776
> c.norm = 1/.Last.value

#### Now create a posterior density plot for rho:

> plot(0,0, xlab="rho", ylab="density", xlim=c(0,8), ylim=c(0,.6),
       type="n", main="Posterior Density (rho)")
  curve((c.norm)*PostDens(x), add=T)
  abline(v=1/summary(auxfit)$sig^2, lty=3)

### Not a very precise density for rho, but it is approximately normal:

> m1 = integrate(function(rho) rho*PostDens(rho), 0, Inf)$val * c.norm
  m2 = integrate(function(rho) rho^2*PostDens(rho), 0, Inf)$val * c.norm
> c(m1, sqrt(m2-m1^2))
[1] 4.5286339 0.8449373
> curve(dnorm(x, m1, sqrt(m2-m1^2)), add=T, lty=5, col="blue")

## Fairly normal looking, but slightly skewed (heavier right tail):
## contrast the posterior mean 4.529 with the previous
## ML rho estimator 1/summary(auxfit)$sig^2 = 4.319.

## We can look at the beta coefficient estimators as follows.
## Recall that the density given directly in the handout is not for
## the coefficients bj themselves but for bj*rho.

> b0dens = function(b) {
     outd = numeric(length(b))
     for(i in 1:length(b)) {
        outd[i] = integrate(function(rho,.b) {
             tmp = numeric(length(rho))
             for(j in 1:length(rho)) {
                V = solve(XtX/rho[j] + diag(c(.25,1,1)))
                tmp[j] = PostDens(rho[j]) * dnorm(.b*rho[j],
                            c(V[1,] %*% Tv), sqrt(V[1,1])) * rho[j]
             }
             tmp  }, 0, Inf, .b=b[i])$val
     }
     c.norm*outd
  }

### We use the normal distribution for nu = rho*b given rho and Y:
### from that density we obtain the posterior density of c(1,0,0)'b,
### i.e., of the Intercept coefficient.  Because the density of the
### intercept coefficient itself is conditional on rho and the data,
### we must numerically integrate the (posterior) density of rho
### times the conditional density of b[1] given rho and the data, in
### order to integrate out rho.  (The final factor rho[j] is the
### Jacobian of the change of variable from b[1]*rho to b[1].)

> b0dens(-0.2)
[1] 4.33045      ## the density value at a particular point, which is
                 ## close to the ML fitted value -0.21

> plot(0,0, xlab="b0 coef", ylab="density", main="Posterior Density (b0)",
       type="n", xlim=c(-.5,.1), ylim=c(0,5))
  curve(b0dens(x), add=T)

### Looks fairly normal, but again not extremely precise ...
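### (Added check, not part of the original session:)  b0dens should
### integrate to about 1 over an interval covering essentially all of
### its mass (here assumed to be (-0.8, 0.4)), and a normal
### approximation can be overlaid from numerically computed moments,
### just as was done above for rho; outputs not shown.

> integrate(b0dens, -0.8, 0.4)$val           ## should be close to 1
> mb = integrate(function(b) b*b0dens(b), -0.8, 0.4)$val
  sb = sqrt(integrate(function(b) b^2*b0dens(b), -0.8, 0.4)$val - mb^2)
  curve(dnorm(x, mb, sb), add=T, lty=5, col="blue")  ## overlay on the b0 plot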
### Let's do one more posterior calculation: the posterior density
### for observation #38, previously treated as an "outlier".

> x38 = model.matrix(auxfit)[38,]
> x38
(Intercept)         Alk       Chlor
        1.0        53.0       152.4

> summary(bass$Chlor)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   0.70    4.60   12.80   23.12   24.70  152.40
> summary(bass$Alk)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1.20    6.60   19.60   37.53   66.50  128.00

### These summaries tell us that observation 38 had the very largest
### value of Chlor and an above-average value of Alk.

### The following posterior density for (the fitted value of) y38
### involves integrating out rho, noting that the variance for this
### fitted value is x38 %*% V %*% x38, where V is the posterior
### variance given in the handout for x38 %*% b.

> y38dens = function(y) {
     outd = numeric(length(y))
     for(i in 1:length(y)) {
        outd[i] = integrate(function(rho,.y) {
             tmp = numeric(length(rho))
             for(j in 1:length(rho)) {
                V = solve(XtX/rho[j] + diag(c(.25,1,1)))
                tmp[j] = PostDens(rho[j]) * dnorm(.y*rho[j],
                            c(x38 %*% V %*% Tv),
                            sqrt(x38 %*% V %*% x38)) * rho[j]
             }
             tmp  }, 0, Inf, .y=y[i])$val
     }
     c.norm*outd
  }

> bass$logAvM[38]
[1] -3.218876     ### the actual y value for obs 38
> y38dens(-3.2)
[1] 0.4292616     ### the posterior density at -3.2

> plot(0,0, xlab="fit38 value", ylab="density",
       main="Posterior Density (fit38)", type="n",
       xlim=c(-4,-1), ylim=c(0,1.5))
  curve(y38dens(x), add=T)
  abline(v=auxfit$fit[38], col="red")
> abline(v=bass$logAvM[38], col="blue")

> integrate(y38dens, -Inf, bass$logAvM[38])$val
[1] 0.06283094    ### posterior probability of a fitted value y38 <= -3.22

### So what previously seemed like an outlier looks much less like
### one in the Bayesian analysis.  This is a posterior "predictive"
### result, saying that the fitted value for the index-38
### observation, as a function of the parameters, has probability
### .063 of being < -3.22.

### A more typical posterior-predictive result would look at the
### distribution not only of the fitted y38 but of a newly generated
### observation y38*, with the same covariate vector x38.  For given
### rho and the other parameters, this simply adds variance 1/rho to
### the variance of the fitted term.

> y38new.dens = function(y) {
     outd = numeric(length(y))
     for(i in 1:length(y)) {
        outd[i] = integrate(function(rho,.y) {
             tmp = numeric(length(rho))
             for(j in 1:length(rho)) {
                V = solve(XtX/rho[j] + diag(c(.25,1,1)))
                tmp[j] = PostDens(rho[j]) * dnorm(.y,
                            c(x38 %*% V %*% Tv)/rho[j],
                            sqrt(1/rho[j] + (x38 %*% V %*% x38)/rho[j]^2))
             }
             tmp  }, 0, Inf, .y=y[i])$val
     }
     c.norm*outd
  }

> plot(0,0, xlab="y38 value", ylab="density",
       main="Posterior Predictive Dens (y38)", type="n",
       xlim=c(-5,0), ylim=c(0,1))
  curve(y38new.dens(x), add=T)
  abline(v=auxfit$fit[38], col="red")
  abline(v=bass$logAvM[38], col="blue")

### This density is somewhat more spread out: now the probability
### associated with y38* < the observed value is

> integrate(y38new.dens, -Inf, bass$logAvM[38])$val
[1] 0.2006840
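### (Added sketch, not part of the original session:)  overlaying the
### two obs-38 densities shows directly how much extra spread the
### added 1/rho variance term produces:

> curve(y38dens(x), -5, 0, ylim=c(0,1.5), xlab="y38", ylab="density",
        main="Fitted-Value vs Predictive Density (obs 38)")
  curve(y38new.dens(x), add=T, lty=2)
  abline(v=bass$logAvM[38], col="blue")
  legend("topleft", c("fitted-value posterior", "posterior predictive"),
         lty=1:2)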
###==================================================
###  NOW ADD A FINAL STEP COMBINING THE BAYES POSTERIOR
###  AND POSTERIOR-PREDICTIVE SIMULATION
###==================================================

### We return to the computing formulas on the 4th page of the
### BayesConjug.pdf Lecture Notes file to create a posterior-
### predictive Bayes sample, which can be compared either with the
### original "bass" dataset or with the parametric-bootstrap
### simulation done at the beginning of this Log.  The first step is
### to obtain a convenient cumulative distribution function and its
### inverse, using the smoothing-spline idea presented in class and
### in Sec5NotF09.pdf.

### Recall that the posterior density for rho was coded in PostDens
### (but must be multiplied by the normalizing constant c.norm) and
### has essentially all of its mass on the interval (0.1, 10):

> c.norm*c(PostDens(.1), PostDens(10))
[1] 1.537705e-34 1.308320e-06

> rhovals = seq(.1, 10, by=.1)
  Frhovals = numeric(100)
  for(i in 1:100) Frhovals[i] = integrate(PostDens, 0, i/10)$val
  Frhovals = Frhovals*c.norm
  auxspl = smooth.spline(Frhovals, rhovals, spar=.3)
  Frhinv = function(uvec) predict(auxspl, uvec)$y

> plot(0,0, xlab="", ylab="", xlim=c(1e-12, 1-1e-5), ylim=c(0.1,10),
       type="n")
> curve(Frhinv(x), add=T)     #### good except in the far tails

### Now we generate a posterior-predictive sample and plot it in the
### same way as the parametric-bootstrap samples above:

> library(MASS)   ### so that we can use mvrnorm to generate
                  ### multivariate-normal vectors

> rho.Bay = Frhinv(runif(1))
  auxmat  = solve(XtX + rho.Bay*diag(c(.25,1,1)))
  beta.Bay = mvrnorm(1, auxmat %*% Tv, (1/rho.Bay)*auxmat)
> c(sigsq=1/rho.Bay, beta=beta.Bay)
     sigsq  beta.(Intercept)     beta.Alk   beta.Chlor
0.22665317       -0.24867577  -0.01077671  -0.01259073

> c(summary(auxfit)$sig^2, auxfit$coef)    ## old MLE's for comparison
            (Intercept)         Alk       Chlor
0.23153217  -0.20927485 -0.01072919 -0.01291183

> newY.Bayes = c(model.matrix(auxfit) %*% beta.Bay) +
                   rnorm(53, 0, sqrt(1/rho.Bay))
  plot(bass$Alk, newY.Bayes, pch=5, xlab="Alk", ylab="logAvM",
       main="True and Posterior Predictive Data")
  points(bass$Alk, bass$logAvM, pch=18)

### In this plot, the single posterior-simulated (beta, rho) pair
### plays a dominant role; we must repeatedly simulate new posterior
### (beta, rho) values to get a real picture of the variety of
### posterior-predictive pictures !!
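### (Added sketch, not part of the original session:)  following up
### that last remark, the posterior draw and predictive simulation
### can be wrapped in a loop to display several posterior-predictive
### datasets at once:

> par(mfrow=c(2,2))
  for(b in 1:4) {
     rho.b  = Frhinv(runif(1))                       ## posterior draw of rho
     Vmat   = solve(XtX + rho.b*diag(c(.25,1,1)))
     beta.b = mvrnorm(1, Vmat %*% Tv, (1/rho.b)*Vmat)  ## posterior draw of beta
     Ynew   = c(model.matrix(auxfit) %*% beta.b) +
                  rnorm(53, 0, sqrt(1/rho.b))        ## predictive sample
     plot(bass$Alk, Ynew, pch=5, xlab="Alk", ylab="logAvM",
          main=paste("Posterior Predictive Sample", b))
     points(bass$Alk, bass$logAvM, pch=18)
  }
  par(mfrow=c(1,1))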