spiegelhalter:stovin:83 presented data on repeated biopsies of transplanted hearts, in which a total of 414 biopsies had been taken at 157 sessions. Each biopsy was graded on evidence of rejection using a 4-category scale of none (O), minimal (M), mild (+) and moderate-severe (++). Part of the data is shown below.
The sampling procedure may not detect the area of maximum
rejection, which is considered
the true underlying state at the time of
the session and denoted
-- the underlying probability
distribution of the four true states is denoted by the vector p. It is then assumed that each of the
observed biopsies are conditionally independent given this true
state with the restriction that there are no
`false positives': i.e. one cannot observe a biopsy worse
than the true state. We then have the sampling model
where
denotes the multinomial response
at session i where
biopsies have been taken, and
is the probability that a true state
generates a biopsy in state k.
The no-false-positive restriction means that
. spiegelhalter:stovin:83
estimated the parameters
and p using the EM algorithm, with
some smoothing to avoid zero estimates.
The appropriate graph is shown in Figure 2, where the role of the true state
is simply to pick the appropriate row from the 4 x 4 error matrix e. Here the probability vectors
(j = 1,...,4) and p are assumed to have uniform priors on the unit simplex, which correspond to Dirichlet priors with all parameters being 1.
The BUGS code for this model is given below. No initial value file is provided, since the
forward sampling procedure will find a configuration of
starting values that is compatible with the expressed constraints.
It has been necessary, however, to introduce dummy arrays for
the separate rows of the error matrix since they have different
lengths. We also note the apparent ``cycle'' in the graph created by the expression nbiops[i] <- sum(biopsies[i,]). This will lead BUGS to generate the warning message Possible directed cycle or undirected link in model during compilation. Such ``cycles'' are permitted provided that they are only data transformation statements, since this does not affect the essential probability model.
Biops: model specification in BUGS
model biops;
const
ns=157; # number of sessions
var
biopsies[ns,4], # grades observed in ith session (multinomial)
nbiops[ns], # total number of biopsies in ith session
true[ns], # true state in ith session
error[4,4], # error matrix in taking biopsies
error2[2], # non-zero elements of error[2,]
error3[3], # non-zero elements of error[3,]
p[4], # underlying incidence of true states
prior2[2], # prior parameters for error2
prior3[3], # prior parameters for error3
prior4[4]; # prior on p and error[4,] (fixed in data-file)
data biopsies in "biops.dat";
{
# TRANSFORMATIONS
for (i in 1:ns){
nbiops[i] <- sum(biopsies[i,]);
}
# MODEL
for (i in 1:ns){
true[i] ~ dcat(p[]);
biopsies[i,] ~ dmulti(error[true[i],],nbiops[i]); # multinomial
}
# force no false positives
error[1,1] <- 1; error[1,2] <- 0; error[1,3] <- 0; error[1,4] <- 0;
error[2,3] <- 0; error[2,4] <- 0; error[3,4] <- 0;
# priors for parameters
prior2[1] <- 1; prior2[2] <- 1;
prior3[1] <- 1; prior3[2] <- 1; prior3[3] <- 1;
prior4[1] <- 1; prior4[2] <- 1; prior4[3] <- 1; prior4[4] <- 1;
error2[] ~ ddirch(prior2[]); for (j in 1:2) {error[2,j] <- error2[j]}
error3[] ~ ddirch(prior3[]); for (j in 1:3) {error[3,j] <- error3[j]}
error[4,] ~ ddirch(prior4[]);
p[] ~ ddirch(prior4[]); # prior for p
}
Figure 2:
Graphical model for biops example
Analysis
A simple BUGS run took 21 seconds for 1000 iterations and gave the following results which are similar to those obtained by the EM algorithm.