Cox modelling example using "cancer" May Clinic Lung Cancer Data ====================================================== 4/25/08 > library(survival) attach(cancer) #### 228 patients in many "institutions" names(cancer) [1] "inst" "time" "status" "age" "sex" "ph.ecog" [7] "ph.karno" "pat.karno" "meal.cal" "wt.loss" inst time status age sex ph.ecog ph.karno pat.karno meal.cal wt.loss 1 3 306 2 74 1 1 90 100 1175 NA 2 3 455 2 68 1 0 90 90 1225 15 3 3 1010 1 56 1 0 90 90 NA 15 > table(inst,sex) sex inst 1 2 ### Many different "institutions" 1 24 12 ### Far more males than females in most of them 2 2 3 3 11 8 4 3 1 5 5 4 6 10 4 7 5 3 10 1 3 11 10 8 12 15 8 13 13 7 15 4 2 16 8 8 21 10 3 22 12 5 26 1 5 32 2 5 33 1 1 > detach() ### try Cox modelling using age ,sex and performance scores ### as covariates: > mod1 = coxph(Surv(time,status) ~ age + sex + ph.karno, data=cancer) > summary(mod1)\$coef coef exp(coef) se(coef) z p age 0.01237540 1.0124523 0.00940461 1.315887 0.190 sex -0.49717029 0.6082494 0.16771275 -2.964416 0.003 ph.karno -0.01332184 0.9867665 0.00588006 -2.265596 0.023 > table(cancer\$sex) 1 2 138 90 > mod2 = coxph(Surv(time,status) ~ age + strata(sex) + ph.karno, data=cancer) > summary(mod2)\$coef coef exp(coef) se(coef) z p age 0.01145752 1.011523 0.009365702 1.223349 0.220 ph.karno -0.01413648 0.985963 0.005964414 -2.370137 0.018 > mod3A = coxph(Surv(time,status) ~ age + ph.karno, data=cancer[cancer\$sex==1,]) mod3B = coxph(Surv(time,status) ~ age + ph.karno, data=cancer[cancer\$sex==2,]) > summary(mod3A)\$coef coef exp(coef) se(coef) z p age 0.01701869 1.0171643 0.011476600 1.482903 0.14 ph.karno -0.00798655 0.9920453 0.006792772 -1.175743 0.24 > summary(mod3B)\$coef coef exp(coef) se(coef) z p age -0.01074669 0.9893109 0.01749253 -0.6143586 0.5400 ph.karno -0.04129614 0.9595449 0.01384292 -2.9831965 0.0029 ### LOOK FOR DIFFERENCE BETWEEN ph.karno COEFFS WITHIN THE ### TWO SEX GROUPS (a) with baseline hazards different ### and (b) baseline hazard the same. > hazM = basehaz(mod3A) hazF = basehaz(mod3B) > pdf(file="SexHazPlot.pdf") > plot(hazM\$time, hazM\$haz, xlab="Time", ylab="Cum Hazard", pch=5, main=paste( "Baseline hazards, by Sex, in CANCER data \n", "Hollow diamonds Male, solid Female",sep="")) points(hazF\$time, hazF\$haz, pch=18) ### Far different! ## but probably not significantly different apart from multiple dev.off() ### saves pdf picture > mod4 = coxph(Surv(time,status) ~ sex + ph.karno + sex:ph.karno, data=cancer) ### First test: coeff difference based on indep samples > (mod3A\$coef[2] - mod3B\$coef[2])/sqrt(mod3A\$var[2,2]+ mod3B\$var[2,2]) ph.karno 2.160193 ### 2-sided p-value .031 ### Second test: if there were a sex difference using SAME ### baseline hazard, then the interaction term would ### be significant, but it is not (quite)! > summary(mod4)\$coef coef exp(coef) se(coef) z p sex 1.53181187 4.6265520 1.10383630 1.3877165 0.170 ph.karno 0.01577181 1.0158968 0.01784156 0.8839931 0.380 sex:ph.karno -0.02524945 0.9750667 0.01367025 -1.8470365 0.065