R Script on Life Tables -- First, from given (simulated) dataset) and Second, from Theoretical Distribution withoput and with Right-Censoring ============================================================================= Eric Slud 9/7/21 Start by Generating Gamma Survival Times with Uniform Censoring and then Rounding Both to Multiples of 0.25 (think of time units as years, and the time reported to the nearest quarter) # First generate underlying times, then convert to T_i, Delta_i set.seed(3535) Xi = rgamma(100, 5, 2) > summary(Xi) Min. 1st Qu. Median Mean 3rd Qu. Max. 0.5148 1.6432 2.4231 2.4977 3.1026 6.2471 > Ci = runif(100, 0.5, 5.5) > Ti = round(4*pmin(Xi,Ci))/4 > Di = as.numeric(Xi <= Ci) ## (Ti, Di) pairs are in the usual (Event-time, Status) format # for right-censored survival data # Now create a life-table > Times = unique(sort(Ti)) > length(Times) ## 18 values, 0.5 to 4.75 inclusive [1] 18 > AtRsk = Dths = numeric(length(Times)) for(i in 1:length(Times)) { AtRsk[i] = sum(Ti >= Times[i] ) Dths[i] = sum(Di[ Ti==Times[i] ]) } > cbind(Times, AtRsk, Dths, Cens = AtRsk - c(AtRsk[-1],0)-Dths) Times AtRsk Dths Cens [1,] 0.50 100 2 3 [2,] 0.75 95 2 4 [3,] 1.00 89 6 9 [4,] 1.25 74 4 2 [5,] 1.50 68 6 1 [6,] 1.75 61 6 5 [7,] 2.00 50 7 3 [8,] 2.25 40 4 6 [9,] 2.50 30 6 3 [10,] 2.75 21 4 0 [11,] 3.00 17 2 1 [12,] 3.25 14 2 3 [13,] 3.50 9 3 0 [14,] 3.75 6 1 0 [15,] 4.00 5 0 1 [16,] 4.25 4 2 0 [17,] 4.50 2 1 0 [18,] 4.75 1 1 0 ### Here are slicker R commands for generating the columns without a for-loop ## (They would be faster than for-loops in a very large dataset) tmp=table(Ti) cum.evt = as.vector(cumsum(c(0,tmp))) Times = as.numeric(names(tmp)) AtRsk = 100 - cum.evt[1:length(tmp)] Dths = diff(c(0,cumsum(Di[order(Ti)])[cum.evt])) cbind(Times, AtRsk, Dths, Cens = AtRsk - c(AtRsk[-1],0)-Dths) Times AtRsk Dths Cens [1,] 0.50 100 2 3 [2,] 0.75 95 2 4 [3,] 1.00 89 6 9 [4,] 1.25 74 4 2 [5,] 1.50 68 6 1 [6,] 1.75 61 6 5 [7,] 2.00 50 7 3 [8,] 2.25 40 4 6 [9,] 2.50 30 6 3 [10,] 2.75 21 4 0 [11,] 3.00 17 2 1 [12,] 3.25 14 2 3 [13,] 3.50 9 3 0 [14,] 3.75 6 1 0 [15,] 4.00 5 0 1 [16,] 4.25 4 2 0 [17,] 4.50 2 1 0 [18,] 4.75 1 1 0 #=============================================================== Now do theoretical life-table with l_0 = 100,000 based first on Gamma(5,2) distribution with no right-censoring and then with Unif(.5,5.5) Type II right-censoring ## No-censoring case first: times = (0:25)/4 atrsk = round(1e5*(1- pgamma(times,5,2))) dths = -diff(c(1e5,atrsk)) cbind(times,atrsk,dths) times atrsk dths [1,] 0.00 100000 0 [2,] 0.25 99983 17 [3,] 0.50 99634 349 [4,] 0.75 98142 1492 [5,] 1.00 94735 3407 [6,] 1.25 89118 5617 [7,] 1.50 81526 7592 [8,] 1.75 72544 8982 [9,] 2.00 62884 9660 [10,] 2.25 53210 9674 [11,] 2.50 44049 9161 [12,] 2.75 35752 8297 [13,] 3.00 28506 7246 [14,] 3.25 22367 6139 [15,] 3.50 17299 5068 [16,] 3.75 13206 4093 [17,] 4.00 9963 3243 [18,] 4.25 7436 2527 [19,] 4.50 5496 1940 [20,] 4.75 4026 1470 [21,] 5.00 2925 1101 [22,] 5.25 2109 816 [23,] 5.50 1510 599 [24,] 5.75 1075 435 [25,] 6.00 760 315 [26,] 6.25 535 225 ## life-table would continue with 535-225 = 310 left at risk ## or else we could right-censor those 310 after 6.25 ## and there would be no one left at risk at time 6.50. ## Unif(0.5, 5.5) independent right-censoring ## Will cover the calculus part of these next calculations on Friday 9/10 atrsk2 = (1- pgamma(times[1:23],5,2))* (0.2*(5.5-pmax(0.5,times[1:23]))) dth2 = dth3 = c(atrsk2[1:2]-atrsk2[2:3], rep(0,21)) for (i in 3:22) dth2[i] = integrate(function(x) dgamma(x,5,2)*(0.2*(5.5-x)), (i-1)/4, i/4)\$val ### compare these values with the formula developed in class: dth3[3:22] = (5.5/5)*(pgamma((3:22)/4,5,2)-pgamma((2:21)/4,5,2)) - 0.2*(5/2)*(pgamma((3:22)/4,6,2)-pgamma((2:21)/4,6,2)) > summary(dth2-dth3) Min. 1st Qu. Median Mean 3rd Qu. Max. -1.527e-16 -5.573e-17 0.000e+00 9.899e-19 4.467e-17 1.908e-16 ### so far, these atrsk2 and dth2 and dth3 quantities are probabilities ## now to make the life table we multiply by "radix"=1e5 and round. ## so the life-table is: tmp = cbind(time=times[1:23], round(1e5*cbind(atrsk=atrsk2, dth=dth3)), cens=0) tmp[,4] = tmp[1:23,2]-c(tmp[2:23,2],0)-tmp[,3] tmp time atrsk dth cens [1,] 0.00 100000 17 0 [2,] 0.25 99983 349 0 [3,] 0.50 99634 1448 4951 [4,] 0.75 93235 3143 4831 [5,] 1.00 85261 4906 4605 [6,] 1.25 75750 6256 4273 [7,] 1.50 65221 6957 3856 [8,] 1.75 54408 7003 3386 [9,] 2.00 44019 6531 2901 [10,] 2.25 34587 5729 2428 [11,] 2.50 26430 4775 1992 [12,] 2.75 19663 3809 1601 [13,] 3.00 14253 2920 1268 [14,] 3.25 10065 2158 987 [15,] 3.50 6920 1539 759 [16,] 3.75 4622 1057 576 [17,] 4.00 2989 698 432 [18,] 4.25 1859 439 321 [19,] 4.50 1099 259 236 [20,] 4.75 604 139 172 [21,] 5.00 293 62 126 [22,] 5.25 105 16 89 [23,] 5.50 0 0 0