library(survival)
library(parfm)
library(tibble)
library(dplyr)Frailty models for clustered data
The Diabetic Retinopathy Study was conducted in the US by the National Eye Institute to assess the effect of laser photocoagulation in delaying onset of severe visual loss (“blindness”) in patients with diabetic retinopathy. One eye of each patient was randomly selected for photocoagulation and the other was observed without treatment. The patients were followed over several years for the occurrence of blindness (event = blindness) in their left and right eyes. Censoring is caused by death, dropout, or end of the study. We consider only a subset of the original data set containing \(197\) high risk patients. See Huster et al, 1989, Biometrics, for a discussion and further references.
There are two lines for each patient in the data set: one line per eye. The variables are:
- id: patient number
- trteye: treated eye (1=right, 2=left)
- ageonset: age at diagnosis of diabetes
- typediab: type of diabetes (1= juvenile age at onset < 20 years , 2=adult)
- time: follow-up time in months
- status: status for eye (0=censored, 1=blindness)
- trt: treatment of eye (1=control eye, 2=treated eye)
Before you start, you will have to read the data. Use the R-command:
eye=read.table("https://www.med.uio.no/imb/english/research/centres/ocbe/courses/imb9335/retinopathy.txt", header=T) %>% as_tibble()
eye# A tibble: 394 × 7
id trteye ageonset typediab status time trt
<int> <int> <int> <int> <int> <dbl> <int>
1 5 2 28 2 0 46.2 1
2 14 1 12 1 1 31.3 1
3 16 1 9 1 0 42.3 1
4 25 2 9 1 0 20.6 1
5 29 2 13 1 1 0.3 1
6 46 1 12 1 1 54.3 1
7 49 1 8 1 1 10.8 1
8 56 1 12 1 0 23.2 1
9 61 1 16 1 0 1.47 1
10 71 1 21 2 1 13.8 1
# ℹ 384 more rows
We start out by only using treatment as a covariate.
- We first fit a model with Weibull baseline and no frailty by the command:
nofrail.model = parfm(Surv(time,status)~factor(trt),data=eye)
nofrail.model
Frailty distribution: none
Baseline hazard distribution: Weibull
Loglikelihood: -836.379
ESTIMATE SE p-val
rho 0.810 0.058
lambda 0.032 0.008
factor(trt)2 -0.790 0.169 <.001 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
ci.parfm(nofrail.model) low up
factor(trt)2 0.326 0.632
Perform the command, and make sure that you understand what the output tells you.
- We then fit a model with Weibull baseline and gamma frailty:
frail.model = parfm(Surv(time,status)~factor(trt),cluster="id", frailty="gamma", data=eye)
frail.model
Frailty distribution: gamma
Baseline hazard distribution: Weibull
Loglikelihood: -827.744
ESTIMATE SE p-val
theta 1.040 0.329
rho 0.948 0.075
lambda 0.028 0.007
factor(trt)2 -0.960 0.182 <.001 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Kendall's Tau: 0.342
Perform the command, and compare with the result for the model without frailty. Is there a significant frailty effect?
LR=2*(attributes(frail.model)$loglik - attributes(nofrail.model)$loglik)
LR[1] 17.26943
(1-pchisq(LR, 1))/2 # YES![1] 1.621814e-05
- Significant risk variation between the clusters (individuals here)!
- Fit a model with gamma frailty and treatment and age at onset as covariates. Is there a significant effect of age at onset? What about type of diabetes?
frail.model.ageonset = parfm(Surv(time,status)~factor(trt) + ageonset, cluster="id", frailty="gamma", data=eye)
frail.model.ageonset # NO! (coef close to 0, p-value large)
Frailty distribution: gamma
Baseline hazard distribution: Weibull
Loglikelihood: -827.469
ESTIMATE SE p-val
theta 1.034 0.328
rho 0.948 0.075
lambda 0.025 0.008
factor(trt)2 -0.966 0.182 <.001 ***
ageonset 0.006 0.008 0.456
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Kendall's Tau: 0.341
frail.model.typediab = parfm(Surv(time,status)~factor(trt) + factor(typediab), cluster="id", frailty="gamma", data=eye)
frail.model.typediab # NO! (coef close to 0, p-value large)
Frailty distribution: gamma
Baseline hazard distribution: Weibull
Loglikelihood: -827.73
ESTIMATE SE p-val
theta 1.037 0.329
rho 0.947 0.075
lambda 0.028 0.008
factor(trt)2 -0.961 0.182 <.001 ***
factor(typediab)2 0.040 0.232 0.864
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Kendall's Tau: 0.341
- Fit a Cox frailty model (with gamma frailty) using treatment as the only covariate (see lectures for R commands). Compare the results with those you obtained in question b.
frail.cox = coxph(Surv(time,status) ~ factor(trt) + frailty(id), data=eye)
frail.coxCall:
coxph(formula = Surv(time, status) ~ factor(trt) + frailty(id),
data = eye)
coef se(coef) se2 Chisq DF p
factor(trt)2 -0.910 0.174 0.171 27.295 1.0 1.7e-07
frailty(id) 114.448 84.6 0.017
Iterations: 6 outer, 30 Newton-Raphson
Variance of random effect= 0.854 I-likelihood = -850.9
Degrees of freedom for terms= 1.0 84.6
Likelihood ratio test=201 on 85.6 df, p=3e-11
n= 394, number of events= 155
# cox with no frailty
cox = coxph(Surv(time, status) ~ factor(trt), data = eye)
LRcox = 2*(frail.cox$history$frailty$c.loglik - cox$loglik[2])
LRcox # log-likelihood ratio[1] 11.88032
1 - pchisq(LRcox,df=1)[1] 0.0005673029
- Finally we want to illustrate how stratified Cox regression can be used to analyse paired survival data like the ones we have in the present situation. We may fit a stratified Cox model with a separate baseline for each patient (per eye pair pretty much) by the commands:
eyefit.strat=coxph(Surv(time,status)~trt+strata(id),data=eye)
summary(eyefit.strat)Call:
coxph(formula = Surv(time, status) ~ trt + strata(id), data = eye)
n= 394, number of events= 155
coef exp(coef) se(coef) z Pr(>|z|)
trt -0.9623 0.3820 0.2016 -4.773 1.82e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(coef) exp(-coef) lower .95 upper .95
trt 0.382 2.618 0.2573 0.5672
Concordance= 0.748 (se = 0.058 )
Likelihood ratio test= 25.49 on 1 df, p=4e-07
Wald test = 22.78 on 1 df, p=2e-06
Score (logrank) test = 24.59 on 1 df, p=7e-07
Explain the idea behind such an approach, and discuss how it compares to the approach in question d).
We got same approximately results! Why? See equation (slide no. 24): \(a_{ij}(t|Z_i) = Z_i a_0(t) e^{\beta x_{ij}}\)
- Cox model uses different baseline hazards per stratum (\(Z_i=1\))
- Parametric Frailty models are better in most practical applications (\(a_0(t)\) same for all individuals, \(Z_i\) varies and is estimated)