Several authors have discussed Bayesian inference for censored survival data where the integrated baseline hazard function is to be estimated non-parametrically [Kalbfleisch1978, Kalbfleisch and Prentice1980, Clayton1991, Clayton1994]. clayton:94 formulates the Cox model using the counting process notation introduced by andersen:gill:82 and discusses estimation of the baseline hazard and regression parameters using MCMC methods. Although his approach may appear somewhat contrived, it forms the basis for extensions to random effect (frailty) models, time-dependent covariates, smoothed hazards, multiple events and so on. We show below how to implement this formulation of the Cox model in BUGS.
For subjects i = 1,...,n, we observe processes
which count the number of failures which have occurred up to time t. The corresponding intensity process
is given by
where d
is the increment of
over the small time interval
, and
represents the available data just before time t. If subject i is observed to fail during this time interval, d
will take the value 1; otherwise d
= 0. Hence
corresponds to the probability of subject i failing in the interval
. As dt
(assuming time to be continuous) then this probability becomes the instantaneous hazard at time t for subject i. This is assumed to have the proportional hazards form
where
is an observed process taking the value 1 or 0 according to whether or not subject i is observed at time t and
is the familiar Cox regression model. Thus we have observed data
and unknown parameters
and
du, the latter to be estimated non-parametrically.
The joint posterior distribution for the above model is defined by
For BUGS, we need to specify the form of the likelihood
and prior distributions for
and
. Under non-informative censoring, the likelihood of the data is proportional to
This is essentially as if the counting process increments d
in the time interval
are independent Poisson random variables with means
dt:
We may write
where d
dt is the increment or jump in the integrated baseline hazard function occurring during the time interval
. Since the conjugate prior for the Poisson mean is the gamma distribution, it would be convenient if
were a process in which the increments d
are distributed according to gamma distributions. We assume the conjugate independent increments prior suggested by kalbfleisch:78, namely
Here, d
can be thought of as a prior guess at the unknown hazard function, with c representing the degree of confidence in this guess. Small values of c correspond to weak prior beliefs. In the example below, we set d
dt where r is a guess at the failure rate per unit time, and dt is the size of the time interval.
The above formulation is appropriate when genuine prior information exists concerning the underlying hazard function. Alternatively, if we wish to reproduce a Cox analysis but with, say, additional hierarchical structure, we may use the multinomial-Poisson trick described in the BUGS manual. This is equivalent to assuming independent increments in the cumulative hazard function as failure times, whose logarithms are given `non-informative' priors. This formulation is also shown below.
The fixed effect regression coefficients
are assigned a vague prior
The graph for the Cox model is shown in Figure 20.
Figure 20:
Graphical model for leuk example
A widely used example in survival analysis is freireich:etal:63's data comparing time to remission of leukaemia in patients receiving a new drug (6-MP) with control patients. The data are reproduced below.
indicates censoring
The BUGS code to analyse this data using the Cox proportional hazards model is as follows
Model specification
model leuk;
const
N = 42, # number of patients
T = 17, # number of unique failure times
eps = 0.000001; # used to guard against numerical
# imprecision in step function
var
obs.t[N], # observed failure or censoring time for each patient
t[T+1], # unique failure times + maximum censoring time
dN[N,T], # counting process increment
Y[N,T], # 1=subject observed; 0=not observed
Idt[N,T], # intensity process
Z[N], # covariate
beta, # regression coefficient
dL0[T], # increment in unknown hazard function
beta0[T], # log(increment in unknown hazard function)
dL0.star[T], # prior guess at hazard function
c, # degree of confidence in prior guess for dL0
mu[T], # location parameter for Gamma (= c * dL0.star)
r, # prior guess at failure rate
fail[N], # failure = 1; censored = 0
S.treat[T], # survivor function for treatment group
S.placebo[T]; # survivor function for placebo group
data obs.t, fail, Z in "leuk.dat", t in "failtime.dat";
inits in "leuk.in";
{
# Set up data
for(i in 1:N) {
for(j in 1:T) {
# risk set = 1 if obs.t >= t
Y[i,j] <- step(obs.t[i] - t[j] + eps);
# counting process jump = 1 if obs.t in [ t[j], t[j+1] )
# i.e. if t[j] <= obs.t < t[j+1]
dN[i,j] <- Y[i,j]*step(t[j+1] - obs.t[i] - eps)*fail[i];
}
}
# Model
for(j in 1:T) {
# beta0[j] ~ dnorm(0,0.001); # include this when using Poisson trick
for(i in 1:N) {
dN[i,j] ~ dpois(Idt[i,j]); # Likelihood
Idt[i,j] <- Y[i,j]*exp(beta*Z[i])*dL0[j]; # Intensity
# Try Poisson trick - independent log-normal hazard increments
# - enables dL0, c, r, mu to be dropped from model
# Idt[i,j] <- Y[i,j]*exp(beta0[j]+beta*Z[i]); # Intensity
}
dL0[j] ~ dgamma(mu[j], c);
mu[j] <- dL0.star[j] * c; # prior mean hazard
# Survivor function = exp(-Integral{l0(u)du})^exp(beta*z)
S.treat[j] <- pow(exp(-sum(dL0[1:j])), exp(beta * -0.5));
S.placebo[j] <- pow(exp(-sum(dL0[1:j])), exp(beta * 0.5));
}
c <- 0.001; r <- 0.1;
for (j in 1:T) {
dL0.star[j] <- r * (t[j+1]-t[j])
}
beta ~ dnorm(0.0,0.000001);
}
Here, obs.t[i] is the follow-up time for patient i (i = 1,...,42), with fail[i] indicating whether this corresponds to a failure or a censored observation.
Data
Part of the data file leuk.dat is shown below: column 1 refers to obs.t, column 2 is fail and column 3 is Z
1 1 0.5 1 1 0.5 2 1 0.5 . . . . . . 17 1 0.5 22 1 0.5 23 1 0.5 6 1 -0.5 6 1 -0.5 6 1 -0.5 6 0 -0.5 7 1 -0.5 . . . . . . 32 0 -0.5 34 0 -0.5 35 0 -0.5
A separate data file (failtime.dat) contains the 17 distinct failure times t[j] (j = 1,...,17) plus t[18]
, the maximum follow-up time. These values define the time intervals
used for calculating the counting process increments d
. Note that the upper end of the interval is defined to be strictly
, so the counting process for a subject who fails at exactly time
will not be incremented until the following time interval. Consequently, if
represents a failure time rather than a censoring time, an arbitrary value
must be chosen for the final value of the vector t[]. This ensures that the counting process for this subject is incremented during the final time interval.
Time-dependent covariates could be included by making Z[i,j] the value for the ith individual at the jth failure time.
We note the use of the step function to create the counting process increments dN[i,j] = d
and the risk set process Y[i,j] =
. This function takes the value 1 if its argument is
0, and 0 otherwise. Thus step(time[i] - t[j] + eps) returns the value 1 for Y[i,j] if the follow-up time time[i] for patient i
the current failure time t[j], and 0 thereafter. A similar trick is used in the construction of dN[i,j].
Note that the variable dN[i,j] appears twice on the left-hand side of a statement. Under the section labelled # TRANSFORMATIONS, dN[i,j] is a deterministic node, whilst under the # MODEL section, it features as a random node. This construction is permitted in BUGS to allow creation of data nodes within the program session, rather than having to carry out all data transformations before setting up the .dat file (see Section 4.9 of the BUGS manual for more details).
We also calculate the conditional survivor function given covariates z. This is given by
where
d
d
.
Analysis
1000 iterations took 41 seconds after a 500 iteration burn-in. The posterior mean (standard error) of the regression coefficient
was 1.55 (0.43). This compares with the standard partial likelihood estimate (obtained using the SAS PHREG procedure) of 1.59 (0.43). whitehead:80 estimated
to be 1.51 (0.42) using the Poisson model formulation in GLIM: using
the analogous Poisson trick in BUGS gave an estimate
of 1.54 (.41). The estimated survival probabilities are shown in Figure 21.
Figure 21:
Estimated mean survivor function and 95% credible interval (shaded region) for treatment and placebo groups in the leuk example
Tied failure times
The Poisson likelihood formulation of the Cox model outlined above assumes that time is truly continuous, in the sense that no two failures can occur simultaneously. Hence the analysis described here is not strictly correct, since we have allowed multiple failures to occur at any one time. This analysis actually corresponds to peto:72's treatment of ties. An alternative approach is to randomly perturb the failure times prior to analysis to ensure distinct failure times for each subject. However, this is somewhat conservative since it assumes the null hypothesis to be true. A better method is to simulate new perturbations of the failure times at each iteration of the Gibbs sampler, conditional upon the data and the current values of other model parameters. Unfortunately, the declarative structure of the BUGS language does not permit this model at present, although future versions of the program may include it.