Solution to HW 24.
=================

> library(KernSmooth)   ## provides locpoly()
> bankN = read.table("~/public_html/s798c/Data/bank2.dat")[,4]
> dens1a = locpoly(bankN[1:100], degree=1, band=.3)
  dens1b = locpoly(bankN[1:100], degree=2, band=.3)
  dens2a = locpoly(bankN[101:200], degree=1, band=.3)
  dens2b = locpoly(bankN[101:200], degree=2, band=.3)
  plot(density(bankN[1:100], bw=bw.SJ(bankN[1:100])), lty=1,
     xlab="Lower Margin mm", ylab="Density", xlim=c(6.7, 13.3),
    main= "Density of Swiss Banknote Lower Margins")
  lines(density(bankN[101:200], bw=bw.SJ(bankN[101:200])), lty=2)
> lines(dens1a, lty=3)
  lines(dens1b, lty=4)
  lines(dens2a, lty=6)
  lines(dens2b, lty=8)
  legend(locator(), legend=c("Real.def","Real.lpdeg1", 
     "Real.lpdeg2", "Forg.def","Forg.lpdeg1", "Forg.lpdeg2"),
     lty = c(1,3,4, 2, 6, 8))
> c(BW1 = bw.SJ(bankN[1:100]), BW2 = bw.SJ(bankN[101:200]))
      BW1       BW2
0.2667486 0.3110543

### (2) In each of the Real and Forged subsets, the degree 1 local
###  polynomial gives density estimates very close to the
###  (degree 0) kernel density estimates, while the degree 2
###  estimates have a noticeably bumpier structure, much closer to
###  a kernel density estimate with bw=.2.
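
## A minimal sketch of how the comparison in (2) could be quantified,
## using simulated data as a stand-in (the sample x, the range rng and
## the use of bkde() for the degree 0 estimate are assumptions, not part
## of the solution above). Assuming locpoly fits to binned counts on an
## equally spaced grid, a local linear fit with a symmetric kernel
## should coincide with the local constant fit away from the
## boundaries, which is one way to understand why degree 1 stays so
## close to the kernel estimate.

```r
library(KernSmooth)   # provides bkde() and locpoly()

set.seed(1)
x   <- rnorm(100, mean = 9, sd = 0.6)   # simulated stand-in, NOT the banknote data
rng <- c(6.5, 11.5)                     # common evaluation range (assumed)

# degree 0: binned kernel density estimate on a 401-point grid
kd  <- bkde(x, bandwidth = 0.3, range.x = rng, gridsize = 401L)
# degree 1 and degree 2 local polynomial density estimates on the same grid
lp1 <- locpoly(x, degree = 1, bandwidth = 0.3, range.x = rng, gridsize = 401L)
lp2 <- locpoly(x, degree = 2, bandwidth = 0.3, range.x = rng, gridsize = 401L)

# largest pointwise gap from the kernel estimate, for each degree
gaps <- c(deg1 = max(abs(lp1$y - kd$y)), deg2 = max(abs(lp2$y - kd$y)))
print(gaps)
```

## Comparing the two entries of gaps gives a numerical counterpart to
## the visual impression from the plot.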

## (3) Here a "crude" estimate of P(X < Y), for X a real-note and Y a
## forged-note measurement, is
> mean(outer(bankN[1:100],bankN[101:200], function(x,y) x<y))
[1] 0.9386
## It may in fact be the best estimate: it is certainly the one that
## makes the fewest assumptions. If we want to benefit from some assumed
## smoothness in the density, we can instead estimate using the formula
##
##    integral F(x) g(x) dx,
##
## where F is the c.d.f. of the real-note measurement and g is the
## density of the forged-note measurement. First get a spline-smoothed
## version of F from the estimated density.
> dens1c = locpoly(bankN[1:100], degree=2, band=.3, range=c(6.5,11.5))
 dens2c = locpoly(bankN[101:200], degree=2, band=.3, range=c(6,14), 
         gridsize=501 )
 ## cumulative trapezoid rule on the default 401-point grid of dens1c
 Fvals = numeric(401)
 Fvals[2:401] = ((11.5-6.5)/400)*.5*cumsum(dens1c[[2]][1:400]+
          dens1c[[2]][2:401])
> summary(Fvals)
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.002311  0.221100  0.847400  0.641100  0.995400  1.003000
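
## The Fvals computation above is a cumulative trapezoid rule. A small
## sketch of the same idea on a density whose c.d.f. is known, so that
## the numerical error can be checked (the helper name cumtrapz and the
## standard normal test case are illustrative assumptions):

```r
# cumulative trapezoid rule: running integral of f over the grid x,
# starting from 0 at the first grid point
cumtrapz <- function(x, f) {
  n <- length(x)
  c(0, cumsum(diff(x) * (f[-n] + f[-1]) / 2))
}

x    <- seq(-5, 5, length.out = 401)   # same grid size as locpoly's default
Fhat <- cumtrapz(x, dnorm(x))          # approximate c.d.f. of N(0,1)

# worst-case discrepancy from the exact c.d.f.
max_err <- max(abs(Fhat - pnorm(x)))
print(max_err)
```

## With a density estimated from data rather than an exact one, the
## running integral can stray slightly below 0 or above 1, as the
## summary of Fvals shows.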

> tmpspl = smooth.spline(dens1c[[1]], Fvals, spar=1.e-6)
  ## clip to [0,1], since the raw Fvals range slightly outside it
  Fcdf = function(xvec) pmax(0,pmin(1,predict(tmpspl, xvec)$y))
  ## smooth the log-density so that gdens stays positive
  tmpspl2 = smooth.spline(dens2c[[1]],
         log(pmax(dens2c[[2]],1.e-5)), spar=1.e-6)
  gdens = function(yvec) exp(predict(tmpspl2, yvec)$y)
> summary(gdens(dens2c[[1]]))
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
7.785e-06 3.051e-03 5.688e-02 1.249e-01 2.418e-01 4.137e-01

> integrate(function(x) Fcdf(x)*gdens(x), 6.5, 13.5, abs.tol=1e-6)$value
[1] 0.9435191

### There is no real reason to expect that this `smoothed' estimator
###  is any better than the crude unbiased estimate .9386 above.
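
## The crude estimator is the proportion of (real, forged) pairs with
## x < y, which is the Mann-Whitney statistic divided by the number of
## pairs; each pair indicator has expectation P(X < Y), so the average
## is unbiased. A sketch on simulated normal samples, where the true
## value is known in closed form (the samples and their parameters are
## assumptions for illustration, not the banknote data):

```r
set.seed(2)
x <- rnorm(100, mean = 0, sd = 1)   # simulated "real" sample (assumption)
y <- rnorm(100, mean = 1, sd = 1)   # simulated "forged" sample (assumption)

# crude estimate, computed as in the solution
crude <- mean(outer(x, y, function(x, y) x < y))

# Mann-Whitney count of pairs with y > x, divided by the number of pairs
U  <- unname(wilcox.test(y, x)$statistic)
mw <- U / (length(x) * length(y))

# true P(X < Y) for independent N(0,1) and N(1,1) variables
truth <- pnorm((1 - 0) / sqrt(1^2 + 1^2))

print(c(crude = crude, mann_whitney = mw, truth = truth))
```

## Repeating this over many simulated samples, with the smoothed
## estimator computed alongside, would be one way to compare the two
## estimators empirically.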