breslow:day:80 analyse a set of data from a case-control study relating endometrial cancer with exposure to estrogens. 183 pairs of cases and controls were studied, and the full data is shown below.
We denote
estrogen exposure as
for the ith case-control pair, where
j=1 for a case and j=2 for a control. The
conditional likelihood for the log (odds ratio)
is then given by
We shall illustrate three methods of fitting this model. It is convenient
to denote the fixed disease status as a variable
.
First, Breslow and Day point out that for case-control studies with a single control per case, we may obtain this likelihood by using unconditional logistic regression for each case-control pair. That is
Second, the BUGS manual section Conditional likelihoods in case-control studies discusses fitting this likelihood directly by assuming the model
Finally, the BUGS manual shows how the multinomial-Poisson
transformation can be used. In general, this will be more efficient than using the multinomial-logistic parameterisation above, since it avoids the time-consuming evaluation of
. However, in the present example this summation is only over J=2 elements, whilst the multinomial-Poisson parameterisation involves estimation of an additional intercept parameter for each of the 183 strata. Consequently the latter is less efficient than the multinomial-logistic in this case.
We note that all these formulations may be easily extended to include additional subject-specific covariates, and that the second and third methods can handle arbitrary numbers of controls per case. In addition, the Bayesian approach allows the incorporation of hierarchical structure, measurement error, missing data and so on.
The graph for the conditional likelihood model is shown in Figure 22.
All these techniques are illustrated in the code given below, which includes a transformation of the original summary statistics into full data. In this example, all but the second conditional-likelihood approach are commented out.
Endo: model specification in BUGS
model endo;
const
I = 183, # number of matched sets
J = 2; # number of people per set
var
n10,n01,n11,n00, # collapses form of data
Y[I,J], # observed disease status
p[I,J], # probability of disease status
e[I,J], # exp ( beta ' x )
est[I,J] , # estrogen use
mu[I,J], # Poisson means
beta0[I], # baseline rates in each stratum
beta; # covariate coefficient
data n10,n01,n11,n00 in "endo.dat";
inits in "endo.in";
{
# transform collapsed data into full
for (i in 1:I){ Y[i,1] <- 1; Y[i,2] <- 0;}
# loop around strata with case exposed, control not exposed (n10)
for (i in 1:n10){ est[i,1] <- 1; est[i,2] <- 0;}
# loop around strata with case not exposed, control exposed (n01)
for (i in (n10+1):(n10+n01)){ est[i,1] <- 0; est[i,2] <- 1;}
# loop around strata with case exposed, control exposed (n11)
for (i in (n10+n01+1):(n10+n01+n11)){ est[i,1] <- 1; est[i,2] <- 1;}
# loop around strata with case not exposed, control not exposed (n00)
for (i in (n10+n01+n11+1):I){ est[i,1] <- 0; est[i,2] <- 0;}
# PRIORS
for (i in 1:I) { beta0[i] ~ dnorm(0,1.0E-6) } beta ~ dnorm(0,1.0E-6) ;
# LIKELIHOOD
for (i in 1:I) { # loop around strata
# METHOD 1 - logistic regression
# Y[i,1] ~ dbin( p[i,1], 1);
# logit(p[i,1]) <- beta * (est[i,1] - est[i,2]);
# METHOD 2 - conditional likelihoods
Y[i,] ~ dmulti( p[i,],1);
for (j in 1:2){
p[i,j] <- e[i,j] / sum(e[i,]);
log( e[i,j] ) <- beta * est[i,j] ;
}
# METHOD 3 fit standard Poisson regressions relative to baseline
# for (j in 1:2) {
# Y[i,j] ~ dpois(mu[i,j]);
# log(mu[i,j]) <- beta0[i] + beta*est[i,j];
# }
}
Figure 22:
Graphical model for endo example using the conditional-likelihood approach.
A BUGS run of 500 burn-in and 1000 iterations, took the times shown below and gave the following (very similar) results.