Log Related to Score and Profile Likelihood Confidence Intervals
================================================== 3/3/03 ff ===

This log concerns the Score and Profile-likelihood confidence
intervals for association parameters in  2x2  tables. The parameters
theta ,  expressed in terms of the row-binomial success probabilities
p1, p2  (with corresponding row-totals n1, n2), are:

Risk Difference  =  p1-p2
Risk-Ratio       =  p1/p2  (also called relative risk)
Odds-Ratio       = (p1(1-p2))/((1-p1)p2)

We usually think of the sampling framework as being indepedent
binomial within rows, and this is the assumption made in the following
Splus code, but in fact the same asymptotic justifications of the
resulting Confidence Intervals apply when the sampling is multinomial,
with only the table-total  n  nonrandom. 

A particular example we use for illustration is the Sweidsh Aspirin/MI
study which had the 2x2 table data

> tabnums
     [,1] [,2] 
[1,]   28  656      Aspirin
[2,]   18  658      No aspirin
      MI   no MI

In this Log, we contrast 3 confidence intervals:
(i) The Wald interval, based directly on approximate-normal sampling 
distribution for the MLE of theta.
(ii) The Score interval, obtained by inverting the Score hypothesis
test for the parameter to have a specified point value. 
(iii) The Profile Likelihood based interval, obtained by inverting the
Wilks Generalized Likelihood Ratio test.

The Splus (version 3.4) functions for calculating these intervals,
along with the restricted MLE's for fixed values of theta, are:

NuisBinD
NuisBinRR
NuisBinOR

wlistings of which are given at the end of this Log. The Splus objects
themselves are available on MathNet at

/usr/local/StatData/stat770/.Data

In addition, the function ScorProfW to pull these CI calculations
together for any of the three parameter choices is available (and
listed below.

We now illustrate on the Aspirin/MI data above.

First, for the Difference parameter:

> round(NuisBinD(tabnums,.01),4)    ## Stat values for theta=.01
   rmle Scorstat LRstat   logLik 
 0.0385    0.442  0.195 -199.923

Now we list these outputs for enough values in order to see 
where the Test-based CI's lie. 95% Interval endpoints for Score
are the theta values for which the second column has values 1.96
and -1.96, and for Profile are theta values where the third column
crosses the level 3.84.

> for(i in 0:25) cat(round(NuisBinD(tabnums,-.01 + i*.002),4),
      "for theta=",-.01 + i*.002,"\n")
0.0305 2.4221 6.0742 -199.923 for theta= -0.01 
0.0311 2.2358 5.1432 -199.923 for theta= -0.008 
0.0317 2.0462 4.2825 -199.923 for theta= -0.006 
0.0323 1.8534 3.4947 -199.923 for theta= -0.004 
0.0331 1.6577 2.7823 -199.923 for theta= -0.002 
0.0338 1.4594 2.1474 -199.923 for theta= 0 
0.0346 1.2589 1.592 -199.923 for theta= 0.002 
0.0355 1.0564 1.1179 -199.923 for theta= 0.004 
0.0364 0.8524 0.7264 -199.923 for theta= 0.006 
0.0374 0.6474 0.4186 -199.923 for theta= 0.008 
0.0385 0.442 0.195 -199.923 for theta= 0.01 
0.0396 0.2365 0.0559 -199.923 for theta= 0.012 
0.0408 0.0315 0.001 -199.923 for theta= 0.014 
0.042 -0.1725 0.0298 -199.923 for theta= 0.016 
0.0432 -0.3751 0.1413 -199.923 for theta= 0.018 
0.0446 -0.5758 0.3341 -199.923 for theta= 0.02 
0.0459 -0.7745 0.6068 -199.923 for theta= 0.022 
0.0473 -0.9707 0.9574 -199.923 for theta= 0.024 
0.0488 -1.1643 1.3839 -199.923 for theta= 0.026 
0.0503 -1.355 1.8842 -199.923 for theta= 0.028 
0.0518 -1.5428 2.456 -199.923 for theta= 0.03 
0.0534 -1.7275 3.0969 -199.923 for theta= 0.032 
0.055 -1.909 3.8046 -199.923 for theta= 0.034 
0.0566 -2.0875 4.5768 -199.923 for theta= 0.036 
0.0583 -2.2628 5.411 -199.923 for theta= 0.038 
0.06 -2.435 6.305 -199.923 for theta= 0.04 

> for(i in 0:25) cat(round(NuisBinRR(tabnums, .85+i*.08),4),
     "for theta=",.85+i*.08,"\n")
0.0311 2.0255 4.0979 -199.923 for theta= 0.85 
0.0326 1.7107 2.9408 -199.923 for theta= 0.93 
0.034 1.4251 2.0483 -199.923 for theta= 1.01 
0.0353 1.1636 1.3679 -199.923 for theta= 1.09 
0.0365 0.9221 0.8594 -199.923 for theta= 1.17 
0.0376 0.6978 0.4917 -199.923 for theta= 1.25 
0.0386 0.4881 0.2401 -199.923 for theta= 1.33 
0.0395 0.2911 0.0852 -199.923 for theta= 1.41 
0.0404 0.1053 0.0111 -199.923 for theta= 1.49 
0.0413 -0.0707 0.005 -199.923 for theta= 1.57 
0.042 -0.2379 0.0563 -199.923 for theta= 1.65 
0.0428 -0.3974 0.1563 -199.923 for theta= 1.73 
0.0435 -0.5498 0.2976 -199.923 for theta= 1.81 
0.0441 -0.6958 0.4743 -199.923 for theta= 1.89 
0.0447 -0.8361 0.6812 -199.923 for theta= 1.97 
0.0453 -0.9711 0.9138 -199.923 for theta= 2.05 
0.0459 -1.1013 1.1686 -199.923 for theta= 2.13 
0.0464 -1.227 1.4423 -199.923 for theta= 2.21 
0.0469 -1.3487 1.7322 -199.923 for theta= 2.29 
0.0473 -1.4665 2.0359 -199.923 for theta= 2.37 
0.0478 -1.5809 2.3515 -199.923 for theta= 2.45 
0.0482 -1.692 2.6771 -199.923 for theta= 2.53 
0.0486 -1.8 3.0113 -199.923 for theta= 2.61 
0.049 -1.9052 3.3528 -199.923 for theta= 2.69 
0.0494 -2.0077 3.7002 -199.923 for theta= 2.77 
0.0497 -2.1077 4.0527 -199.923 for theta= 2.85 


> for(i in 0:25) cat(round(NuisBinOR(tabnums, .85+i*.08),4),
     "for theta=",.85+i*.08,"\n")
0.0312 2.007 4.0218 -199.923 for theta= 0.85 
0.0326 1.7025 2.9119 -199.923 for theta= 0.93 
0.034 1.4262 2.0516 -199.923 for theta= 1.01 
0.0352 1.1734 1.3914 -199.923 for theta= 1.09 
0.0364 0.94 0.8933 -199.923 for theta= 1.17 
0.0374 0.7231 0.5283 -199.923 for theta= 1.25 
0.0384 0.5203 0.2731 -199.923 for theta= 1.33 
0.0394 0.3299 0.1095 -199.923 for theta= 1.41 
0.0402 0.1501 0.0226 -199.923 for theta= 1.49 
0.041 -0.0202 0.0004 -199.923 for theta= 1.57 
0.0418 -0.1821 0.033 -199.923 for theta= 1.65 
0.0425 -0.3365 0.1122 -199.923 for theta= 1.73 
0.0432 -0.4841 0.2311 -199.923 for theta= 1.81 
0.0438 -0.6257 0.3841 -199.923 for theta= 1.89 
0.0445 -0.7617 0.5663 -199.923 for theta= 1.97 
0.045 -0.8926 0.7735 -199.923 for theta= 2.05 
0.0456 -1.019 1.0024 -199.923 for theta= 2.13 
0.0461 -1.141 1.2498 -199.923 for theta= 2.21 
0.0466 -1.2592 1.5132 -199.923 for theta= 2.29 
0.0471 -1.3737 1.7903 -199.923 for theta= 2.37 
0.0475 -1.4849 2.0794 -199.923 for theta= 2.45 
0.048 -1.5929 2.3785 -199.923 for theta= 2.53 
0.0484 -1.698 2.6864 -199.923 for theta= 2.61 
0.0488 -1.8004 3.0017 -199.923 for theta= 2.69 
0.0491 -1.9002 3.3234 -199.923 for theta= 2.77 
0.0495 -1.9975 3.6503 -199.923 for theta= 2.85 

=================================================================

To see the intervals themselves, for alpha=.05, .01:

> lapply(ScorProfW(tabnums,alpha=c(.05,.1), thetarange=c(-.015,.05), 
     npts=30)[2:4], function(lelt) round(lelt,5))
$WaldInts:
[1,] -0.00487 0.03349      95% interval in "Difference" case
[2,] -0.00179 0.03040      90%

$ScorInts:
[1,] -0.00510 0.03457      95%
[2,] -0.00187 0.03110      90%

$LRInts:
[1,] -0.00490 0.0341       95%
[2,] -0.00177 0.0308       90%


> lapply(ScorProfW(tabnums,alpha=c(.05,.1), thetarange=c(.5,4), 
     npts=30, type="RR")[2:4], function(lelt) round(lelt,5))
$WaldInts:
[1,] 0.85467 2.84850    95%   "RR" = risk ratio
[2,] 0.94151 2.58576    90%

$ScorInts:
[1,] 0.86598 2.73250    95%
[2,] 0.94780 2.49578    90%

$LRInts:
[1,] 0.86582 2.80218    95%
[2,] 0.94883 2.53688    90%

> lapply(ScorProfW(tabnums,alpha=c(.05,.1), thetarange=c(.5,4), 
     npts=30, type="OR")[2:4], function(lelt) round(lelt,5))
$WaldInts:
[1,] 0.85861 2.75267    95%   "OR" = odds ratio
[2,] 0.94291 2.50658    90%

$ScorInts:
[1,] 0.86182 2.81890    95%
[2,] 0.94608 2.56927    90%

$LRInts:
[1,] 0.86159 2.89626    95%
[2,] 0.94718 2.61491    90%


TRY THESE FUNCTIONS OUT IN OTHER EXAMPLES !
Thus, for a 2x2 table with entries   60  40
				     55  45
find 95% intervals:
> round(unlist(ScorProfW(matrix(c(60,55,40,45), ncol=2),alpha=.05,
     thetarange=c(-.1,.2), npts=20, type="D")[2:4]),4)
 WaldInts1 WaldInts2 ScorInts1 ScorInts2 LRInts1 LRInts2 for "Diff"
   -0.0868    0.1868   -0.0866    0.1848 -0.0868  0.1856
> round(unlist(ScorProfW(matrix(c(60,55,40,45), ncol=2),alpha=.05,
     thetarange=c(.5,1.5), npts=20, type="RR")[2:4]),4)
 WaldInts1 WaldInts2 ScorInts1 ScorInts2 LRInts1 LRInts2 
       0.7    2.1518    0.8587    1.3909   0.857  1.3923
>  round(unlist(ScorProfW(matrix(c(60,55,40,45), ncol=2),alpha=.05,
     thetarange=c(.5,1.5), npts=20, type="OR")[2:4]),4)
 WaldInts1 WaldInts2 ScorInts1 ScorInts2 LRInts1 LRInts2 
    0.8591    1.3852    0.7059    2.0258  0.7003  2.5777

Only in large-sample settings are the Wald-Intervals to be trusted!

======================= Listings follow ==========================

> NuisBinD 
function(tabdat, thet, tol1 = 0.1, tol2 = 0.001)
{
### Coding of restricted ML estimation, along with score, Wald, 
###   and profile-likelihood confidence interval for parameter
###   theta which is risk-difference if type=suffix=D, risk-ratio 
###   if type=RR, and odds-ratio if type=OR.  
### 2x2 table entries are in  tabdat.
### In present set-up, we assume that sampling for types 1 or 2 is
###   row-binomial , and for type 3 is multinomial.
   mle0 <- sum(tabdat[, 1])/sum(tabdat)
   optlst <- nlminb(mle0, function(x, tabl, theta)
 - (tabl[1, 1] * log(x) + tabl[1, 2] * log(1 - x) + tabl[2, 1] * 
      log(x - theta) + tabl[2, 2] * log(1 + theta - x)), lower = 
      pmax(mle0 - tol1, thet) + tol2, upper = pmin(mle0 + tol1, 1) - 
      tol2, tabl = tabdat, theta = thet)
   rmle <- optlst$param
   Scorstat <- (tabdat[1, 1]/sum(tabdat[1,  ]) - tabdat[2, 1]/sum(tabdat[2,
      ]) - thet)/sqrt((rmle * (1 - rmle))/sum(tabdat[1,  ]) + ((rmle - 
      thet) * (1 + thet - rmle))/sum(tabdat[2,  ]))
   mlepr <- c(tabdat[1, 1]/sum(tabdat[1,  ]), tabdat[2, 1]/sum(tabdat[2,  
      ]))
   loglik <- sum(tabdat[, 1] * log(mlepr)) + sum(tabdat[, 2] * log(1 - 
      mlepr))
   LRstat <- 2 * (loglik + optlst$obj)
   c(rmle = rmle, Scorstat = Scorstat, LRstat = LRstat, logLik = loglik)
}

> NuisBinRR
function(tabdat, thet, tol1 = 0.1, tol2 = 0.001)
{
### Coding of restricted ML estimation, along with score, Wald, 
###   and profile-likelihood confidence interval for parameter
###   theta which is risk-difference if type=suffix=D, risk-ratio 
###   if type=RR, and odds-ratio if type=OR.  
### 2x2 table entries are in  tabdat.
### In present set-up, we assume that sampling for types 1 or 2 is
###   row-binomial , and for type 3 is multinomial.
   mle0 <- sum(tabdat[, 1])/sum(tabdat)
   optlst <- nlminb(mle0, function(x, tabl, theta)
 - (tabl[1, 1] * log(x) + tabl[1, 2] * log(1 - x) + tabl[2, 1] * 
      log(x/theta) + tabl[2, 2] * log(1 - x/theta)), lower = pmax(
      mle0 - tol1, 0) + tol2, upper = pmin(mle0 + tol1, thet) - tol2, 
      tabl = tabdat, theta = thet)
   rmle <- optlst$param
   Scorstat <- (tabdat[1, 1]/sum(tabdat[1,  ]) - thet * (tabdat[2, 
      1]/sum(tabdat[2,  ])))/sqrt(rmle * ((1 - rmle)/sum(tabdat[1,  
      ]) + (thet - rmle)/sum(tabdat[2,  ])))
   mlepr <- c(tabdat[1, 1]/sum(tabdat[1,  ]), tabdat[2, 1]/sum(tabdat[2,  
      ]))
   loglik <- sum(tabdat[, 1] * log(mlepr)) + sum(tabdat[, 2] * log(1 - 
      mlepr))
   LRstat <- 2 * (loglik + optlst$obj)
   c(rmle = rmle, Scorstat = Scorstat, LRstat = LRstat, logLik = loglik)
}

> NuisBinOR
function(tabdat, thet, tol2 = 0.001)
{
### Coding of restricted ML estimation, along with score, Wald, 
###   and profile-likelihood confidence interval for parameter
###   theta which is risk-difference if type=suffix=D, risk-ratio 
###   if type=RR, and odds-ratio if type=OR.  
### 2x2 table entries are in  tabdat.
### In present set-up, we assume that sampling for types 1 or 2 is
###   row-binomial , and for type 3 is multinomial.
   mle0 <- sum(tabdat[, 1])/sum(tabdat)
   optlst <- nlminb(mle0, function(x, tabl, theta)
 - (tabl[1, 1] * log(x) + tabl[1, 2] * log(1 - x) + tabl[2, 1] * 
      log(x/(theta * (1 - x) + x)) + tabl[2, 2] * log((theta * 
      (1 - x))/(theta * (1 - x) + x))), lower = tol2, upper = 1 - 
      tol2, tabl = tabdat, theta = thet)
   rmle <- optlst$param
   rmle2 <- rmle/(rmle + thet * (1 - rmle))
   mlepr <- c(tabdat[1, 1]/sum(tabdat[1,  ]), tabdat[2, 1]/
      sum(tabdat[2, ]))
   Scorstat <- (tabdat[1, 1] * tabdat[2, 2] - thet * tabdat[1, 
      2] * tabdat[2, 1])/(sum(tabdat[1,  ]) * sum(tabdat[2,  ]) * 
      sqrt((rmle * (1 - rmle) * (thet * rmle2 + 1 - rmle2)^2)/
      sum(tabdat[1,  ]) + (rmle2 * (1 - rmle2) * (rmle + thet * 
      (1 - rmle))^2)/sum(tabdat[2,  ])))
   loglik <- sum(tabdat[, 1] * log(mlepr)) + sum(tabdat[, 2] * 
      log(1 - mlepr))
   LRstat <- 2 * (loglik + optlst$obj)
   c(rmle = rmle, Scorstat = Scorstat, LRstat = LRstat, logLik = loglik)
}


> ScorProfW
function(tabdat, alpha, thetarange, type = "D", npts = 50, tol = 0.001)
{
## function to call appropriate function NuisBinxx 
## to find confidence intervals using Wald, Score, and LR
## (profile) methods. Note that we need an input interval
## containing a suitable range of theta values to use in
## interpolating statistics as a function of theta.
## NB. alpha may be a vector
   num <- npts + 1
   zpts <- qnorm(1 - alpha/2)
   chisqpts <- qchisq(1 - alpha, 1)
   xt <- seq(min(thetarange), max(thetarange), length = num)
   SLRmat <- array(0, dim = c(num, 2))
   Statfcn <- get(paste("NuisBin", type, sep = ""))
   for(i in 1:num)
      SLRmat[i,  ] <- Statfcn(tabdat, xt[i])[2:3]
   WaldInts <- if(type == "D") (tabdat[1, 1]/sum(tabdat[1,  ])) - (tabdat[
         2, 1]/sum(tabdat[2,  ])) + outer(zpts, c(-1, 1)) * sqrt(
         (tabdat[1, 1] * tabdat[1, 2])/sum(tabdat[1,  ])^3 + (
         tabdat[2, 1] * tabdat[2, 2])/sum(tabdat[2,  ])^3) else 
      {
      if(type == "OR")
         ((tabdat[1, 1] * sum(tabdat[2,  ]))/(tabdat[2, 1] * sum(
            tabdat[1,  ]))) * exp(outer(zpts, c(-1, 1)) * 
            sqrt(tabdat[1, 2]/(sum(tabdat[1,  ]) * tabdat[1,
            1]) + tabdat[2, 2]/(sum(tabdat[2,  ]) * tabdat[
            2, 1])))
      else ((tabdat[1, 1] * tabdat[2, 2])/(tabdat[2, 1] * tabdat[1, 2
            ])) * exp(outer(zpts, c(-1, 1)) * sqrt(sum(1/
            tabdat)))
   }
   LRints <- replace(WaldInts, T, 0)
   tmpspl1 <- smooth.spline(SLRmat[, 1], xt, spar = 1e-06, all.knots = T)
   minpt <- (1:num)[SLRmat[, 2] == min(SLRmat[, 2])]
   tmpspl2A <- smooth.spline(SLRmat[1:minpt, 2], xt[1:minpt], spar = 1e-06,
      all.knots = T)
   tmpspl2B <- smooth.spline(SLRmat[minpt:num, 2], xt[minpt:num], spar = 
      1e-06, all.knots = T)
   ScorInts <- cbind(predict.smooth.spline(tmpspl1, zpts)$y, 
      predict.smooth.spline(tmpspl1,  - zpts)$y)
   LRInts <- cbind(predict.smooth.spline(tmpspl2A, chisqpts)$y, 
      predict.smooth.spline(tmpspl2B, chisqpts)$y)
   list(minpt = minpt, WaldInts = WaldInts, ScorInts = ScorInts, LRInts = 
      LRInts, thetavec = xt, Scorvec = SLRmat[, 1], LRvec = SLRmat[, 
      2])
}