Homework Set 8 Solution.                     2/21/08
========================

(8) (a) The point is that, in using the "bivariate probability
    integral transform" method discussed in class, you can work
    either with F_X and F_{Y|X} or with F_Y and F_{X|Y}. If the
    objective is to do everything analytically, then F_Y and F_{X|Y}
    works better, so that will be the way I do it first (Method 1).
    The other way works too, but uses a little more in the way of
    Splus trickery because F_X, while simple, cannot be analytically
    inverted. That will be Method 2, and I use the two methods to
    check each other.

## First we exhibit the marginal and conditional df's.
> Fx = function(x) 1 - 1.5/x^2 + .5/x^3
  Fy = function(y) 1-1.5/y + .5/y^2
  Finvy = function(r) .75*(1 + sqrt(1-8*(1-r)/9))/(1-r)
  FYcondX = function(x,y) ifelse(y <= x,
    (y-1)/(2*x-1), 1 - x^2/(y*(2*x-1)))
  FXcondY = function(x,y) ifelse(x <= y,
    1.5*(1-1/x)/(1.5-1/y), 1 - .5*(y/x)^3/(1.5*y-1))
  FYinvcondX = function(r,x) ifelse(r <= FYcondX(x,x),
    (2*x-1)*r+1, x^2/((2*x-1)*(1-r)))
  FXinvcondY = function(r,y) ifelse(r <= FXcondY(y,y),
    1/(1-r*(1-2/(3*y))), y*(.5/((1-r)*(1.5*y-1)))^(1/3))

## NB we can test these inverses simply, eg by generating
##   arbitrary X values on (1,infty) and R on (0,1) and
##   then making sure that FYcondX(FYinvcondX(R,X),X)=R
> Xtmp = exp(rexp(100)); Rtmp <- runif(100)

> sum(abs(FYcondX(Xtmp, FYinvcondX(Rtmp,Xtmp))-Rtmp))
[1] 1.259409e-15    ### this is =0, OK
> sum(abs(FXcondY(FXinvcondY(Rtmp,Xtmp),Xtmp)-Rtmp))
[1] 5.972653e-15    ### OK again
> sum(abs(Fy(Finvy(Rtmp))-Rtmp))
[1] 2.664535e-15    ### OK again

METHOD 1.
> Ytmp = Finvy(runif(1000))
  Xtmp = FXinvcondY(runif(1000),Ytmp)
### We display the histogram of log(Xtmp) versus
###    the theoretical density exp(z)*fx(exp(z))
> hist(log(Xtmp), nclass=36, prob=T, main=
     "Scaled Histogram of log(X)")
> lxp = (1:400)/100
  lines(lxp, 3*exp(-2*lxp)-1.5*exp(-3*lxp), lty=3)
### Virtually perfect agreement
> hist(log(Ytmp), nclass=36, prob=T, main=
     "Scaled Histogram of log(Y)")
  lines((1:400)/100 -> lxp, 1.5*exp(-lxp)-exp(-2*lxp),
     lty=3)   ### Again very close agreement

METHOD 2.
> xpt = exp((0:400)/80)
> Xtmp2 = predict(smooth.spline(Fx(xpt), xpt,
      spar=1.e-6), runif(1000))$y
  Ytmp2 = FYinvcondX(runif(1000), Xtmp2)


(b) The region of interest is the set in the place with
x-coordinates ranging from -5 to 5, and with y ranging from
0 to min((10 +2x)/3, 5 - x). It is easy to see that the density
of X is not uniform but rather is proportional to the height
(max y-coordinate) of the triangle for each value x in [-5,5].
The constant of proportionality is the reciprocal of the
area of the triangle, ie fX(x) = .05*min((10+2x)/3, 5 - x).
The corresponding df is FX(x) = (.05/3)*(x+5)^2 for x<1,
and = 1 - .025*(5-x)^2  for 1<= x < 5.

The rejection method is easier to program. Note that since we
should reject about half of the random numbers, to get 1000 pairs it
should suffice to generate 2200. (To check this, note that
> pbinom(2000,2200,.5)  ###  = 1
More generally, we propose to generate 400 + 2.2*N pairs.

> XYfcn3 = function(N) {
    naux = ceiling(400+2.2*N)
    Xmat = cbind(-5+10*runif(naux), 4*runif(naux))
    inds = (1:naux)[(Xmat[,1] < 1 & Xmat[,2] < 10/3+ (2/3)*Xmat[,1]) |
         (Xmat[,1] >= 1 & Xmat[,2] < 5-Xmat[,1])]
    list(Xval=Xmat[inds[1:N],1], Yval=Xmat[inds[1:N],2])
  }
  tmp3 <- XYfcn3(1000)

> sum(((c(table((tmp3$Xval+1) %/% 2 , (tmp3$Yval %/% 2))) -
       c(ea))^2/c(ea))[c(ea)>0]) ### now chisq = 5.727582

### Various methods can be used to check the uniformity in
### the triangle, but please take off some credit (maybe .5
or 1 point) if the student does nothing beyond looking at
a scatterplot.