SOLUTION TO Homework 11, Assigned 10/16/17, due 10/25, 6pm
==============================================   14 points

Consider the dataset of 123 (uncensored) failure-time observations obtained by downloading

  exa1 = read.csv("https://raw.githubusercontent.com/CodeOwl94/ross-reliability/master/EXA1.csv",
                  header=T)

and retaining only the uncensored equipment failure times

  failtim = exa1$time[exa1$fail=="F"]

Common distributions used to describe failure-time data are the Gamma and the logNormal,
both 2-parameter distributions.

(a) Fit lognormal(mu,sig) and Gamma(a,b) densities to these data by the method of maximum
    likelihood.  We refer to the 2-dimensional parameter for these models as theta.

> summary(failtim)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   3.00    6.50   12.00   19.35   22.50  114.00

### Lognormal fit: minimize the negative log-likelihood of the log-transformed data
### (dropping additive constants), with th = (mu, sigsq) and the closed-form MLEs
### used as starting values.

> fitLN = optim( c(mean(log(failtim)), var(log(failtim))*122/123),
     function(th,times) (length(times)/2)*log(th[2]) + sum((times-th[1])^2)/(2*th[2]),
     times = log(failtim), control=list(reltol=1e-10), hessian=T)
> c(mean(log(failtim)), var(log(failtim))*122/123)
[1] 2.5667004 0.7675048
> fitLN
$par
[1] 2.5667004 0.7675048

$value
[1] 45.22645

$counts
function gradient
      65       NA

$convergence
[1] 0

$message
NULL

$hessian
         [,1]    [,2]
[1,] 160.2596   0.000
[2,]   0.0000 104.404

### Gamma fit: minimize the negative log-likelihood directly, with moment-type starting
### values for (shape, rate) derived from the lognormal fit.

> fitGA = optim( (fitLN$par[1]/fitLN$par[2])*c(fitLN$par[1],1),
     function(th,times) -sum(log(dgamma(times,th[1],th[2]))),
     times = failtim, control=list(reltol=1e-10), hessian=T)
> fitGA
$par
[1] 1.4062916 0.0726761

$value
[1] 483.4367

$counts
function gradient
      97       NA

$convergence
[1] 0

$message
NULL

$hessian
           [,1]       [,2]
[1,]   125.3579  -1692.548
[2,] -1692.5477  32761.351

(b) For each of the distributions fitted in (a), display the (empirical, or observed)
    Fisher Information matrices A, and compare these with the alternative-form matrices
    B given by (1/n) sum_{i=1}^n v_i v_i^{tr}, where v_i is the gradient of the
    log-density evaluated at the i'th data-value X_i and the MLE parameter value.

### The empirical per-observation Fisher Information matrices (the matrices A) are the
### hessians found above, divided by n = 123.

> fitLN$hessian/123
         [,1]      [,2]
[1,] 1.302923 0.0000000
[2,] 0.000000 0.8488134
> fitGA$hessian/123
          [,1]      [,2]
[1,]   1.01917 -13.76055
[2,] -13.76055 266.35245

### The corresponding matrices B for the two models are as follows.

> xvec = (log(failtim)-fitLN$par[1])/fitLN$par[2]
  gradLN = cbind(xvec, 0.5*(-1/fitLN$par[2] + xvec^2) )
  Bmat.LN = (1/123)*(t(gradLN) %*% gradLN)
> Bmat.LN
         xvec
xvec 1.302923 0.2177550
     0.217755 0.5446037
> gradGA = cbind( log(fitGA$par[2]*failtim) - digamma(fitGA$par[1]),
                  fitGA$par[1]/fitGA$par[2] - failtim )
  Bmat.GA = (1/123)*(t(gradGA) %*% gradGA)
> Bmat.GA
            [,1]      [,2]
[1,]   0.7675048 -15.14836
[2,] -15.1483627 375.75583

### In both comparisons the A and B matrices are clearly unequal.  If the model were
### correctly specified we would expect A and B to agree approximately for large n,
### so the discrepancy is evidence that neither parametric model is correctly specified.
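### Optional cross-check of the part (a) MLEs (not part of the assigned solution): a
### minimal sketch assuming the MASS package is available.  Note that fitdistr reports
### sdlog, the square root of the sigsq component used above, and the starting values
### supplied for the Gamma fit below are rough guesses of ours.

  library(MASS)
  fitLN.chk = fitdistr(failtim, "lognormal")            # closed-form MLEs: meanlog, sdlog
  c(fitLN.chk$estimate["meanlog"],
    fitLN.chk$estimate["sdlog"]^2)                      # compare with fitLN$par
  fitGA.chk = fitdistr(failtim, dgamma,                 # numerical MLE for (shape, rate)
                       start=list(shape=1.4, rate=0.07))
  fitGA.chk$estimate                                    # compare with fitGA$par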
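### A similar optional sketch (assuming the numDeriv package is available) checks the
### analytic score vectors used to build Bmat.GA against numerical gradients of the
### Gamma log-density at the MLE; the same idea applies to gradLN.

  library(numDeriv)
  num.gradGA = t(sapply(failtim, function(x)
                   grad(function(th) log(dgamma(x, th[1], th[2])), fitGA$par)))
  range(num.gradGA - gradGA)                            # should be numerically negligible
  (1/123)*(t(num.gradGA) %*% num.gradGA)                # should reproduce Bmat.GA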
### (c) For the Lognormal model we found point estimators

> fitLN$par
[1] 2.5667004 0.7675048

### for which we would estimate the variance-covariance matrix by the inverse observed
### Fisher Information

> solve(fitLN$hess)     ### if we thought the LN model were properly specified
            [,1]        [,2]
[1,] 0.006239876 0.000000000
[2,] 0.000000000 0.009578173

### and otherwise (in the misspecified case) by the "robustified" sandwich version

> solve(fitLN$hess) %*% (123*Bmat.LN) %*% solve(fitLN$hess)
            [,1]        [,2]
[1,] 0.006239876 0.001600781
[2,] 0.001600781 0.006145412

### These are very different in the [2,2] element, the estimated variance of sigsq-hat.

## For the Gamma model, we have point estimators

> fitGA$par
[1] 1.4062916 0.0726761

## and, if the model were properly specified, would report the variance matrix

> solve(fitGA$hess)
            [,1]        [,2]
[1,] 0.026374164 0.001362567
[2,] 0.001362567 0.000100918

### and in the robustified (misspecified-model) setting the variance matrix

> solve(fitGA$hess) %*% (123*Bmat.GA) %*% solve(fitGA$hess)
           [,1]         [,2]
[1,] 0.01755671 0.0013292802
[2,] 0.00132928 0.0001335505

### which is quite close to the inverse observed Fisher Information for the Gamma model.
### But this is less an indicator of good fit than happenstance in this problem.

(d) Overplot the histogram for the log-transformed data with the density curves for your
    fitted models (a) (in blue) and (b) (in red), and label your picture appropriately
    with x and y axis labels, a title, and a "legend" showing which line is associated
    with which model.

> hist(log(failtim), prob=T, xlab="log failtim", ylab="density",
     main=paste("Hist of log failtim Data Overplotted \n",
                "with 2 fitted transformed Densities"), ylim=c(0,0.55), xlim=c(1,6))
  curve(dnorm(x, fitLN$par[1], sqrt(fitLN$par[2])), add=T, col="blue")
  curve(exp(x)*dgamma(exp(x), fitGA$par[1], fitGA$par[2]), add=T, col="red", lty=3)
  legend(locator(1), legend=c("Fitted LogNormal", "Fitted Gamma"),
         lty=c(1,3), col=c("blue","red"))

## Saved as "FailTimPic.pdf".  Note that fitLN$par[2] is the variance sigsq, so the
## normal density curve must use sd = sqrt(fitLN$par[2]).

### (e) Both models look as though they fit quite poorly for these data!  If I had to
### choose, I would choose based on the closeness of the histogram to the overplotted
### density, in which case the lognormal looks somewhat better, and a chi-squared test
### based on several cells such as [0,1.5), [1.5,2.5), [2.5,4), [4,6] will show this.

> diff(c(0, sum(log(failtim)<=1.5), sum(log(failtim)<=2.5), sum(log(failtim)<=4),
        sum(log(failtim)<=6)))
[1] 14 50 50  9
> chisq.test(diff(c(0, sum(log(failtim)<=1.5), sum(log(failtim)<=2.5),
                    sum(log(failtim)<=4), sum(log(failtim)<=6))),
       p = diff(c(0, pnorm(c(1.5,2.5,4,Inf), 2.5667004, sqrt(0.7675048)))))

        Chi-squared test for given probabilities

data:  diff(c(0, sum(log(failtim) <= 1.5), sum(log(failtim) <= 2.5), sum(log(failtim) <= 4), sum(log(failtim) <= 6)))
X-squared = 3.3756, df = 3, p-value = 0.3373

> chisq.test(diff(c(0, sum(log(failtim)<=1.5), sum(log(failtim)<=2.5),
                    sum(log(failtim)<=4), sum(log(failtim)<=6))),
       p = diff(c(0, pgamma(exp(c(1.5,2.5,4,10)), 1.4062916, 0.0726761))))

        Chi-squared test for given probabilities

data:  diff(c(0, sum(log(failtim) <= 1.5), sum(log(failtim) <= 2.5), sum(log(failtim) <= 4), sum(log(failtim) <= 6)))
X-squared = 15.464, df = 3, p-value = 0.00146

### NOTE that neither of these chi-squared p-values is exactly correct: the degrees of
### freedom should be reduced by the number of estimated parameters, and even then the
### chi-squared(cells - 1 - parameters) reference distribution is valid only if the
### estimated parameters are the MLEs based on the counts in the intervals alone.
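### One way to get an honest p-value despite the degrees-of-freedom problem noted above
### is a parametric bootstrap: simulate data from the fitted lognormal model, re-estimate
### the parameters, and recompute the same chi-squared statistic each time.  The sketch
### below (not run as part of the solution) keeps the same four cells; the number of
### replicates, 2000, is an arbitrary choice of ours.

  brks = c(-Inf, 1.5, 2.5, 4, Inf)
  chisq.stat = function(lx, mu, sigsq) {
     obs  = table(cut(lx, brks))                            # observed cell counts
     expd = length(lx)*diff(pnorm(brks, mu, sqrt(sigsq)))   # expected counts under LN fit
     sum((obs - expd)^2/expd) }
  obs.stat  = chisq.stat(log(failtim), fitLN$par[1], fitLN$par[2])
  boot.stat = replicate(2000, {
     lx = rnorm(123, fitLN$par[1], sqrt(fitLN$par[2]))      # simulate from the fitted LN
     chisq.stat(lx, mean(lx), var(lx)*122/123) })           # re-estimate MLEs each time
  mean(boot.stat >= obs.stat)                               # bootstrap p-value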