Whittemore and Keller (1988) use an approximate maximum likelihood approach to analyse the data shown below on reported respiratory illness versus exposure to nitrogen dioxide (NO$_2$) in 103 children. Stephens and Dellaportas (1992) later use Bayesian methods to analyse the same data.
A discrete covariate $Z_j$ ($j = 1, 2, 3$) representing NO$_2$ concentration in the child's bedroom, classified into 3 categories, is used as a surrogate for true exposure. The nature of the measurement error relationship associated with this covariate is known precisely via a calibration study, and is given by

$$X_j = \alpha + \beta Z_j + \epsilon_j$$

where $\alpha = 4.48$, $\beta = 0.76$ and $\epsilon_j$ is a random element having normal distribution with zero mean and variance $\sigma^2 = 81.14$. Note that this is a Berkson (1950) model of measurement error, in which the true values of the covariate are expressed as a function of the observed values. Hence the measurement error is independent of the latter, but is correlated with the true underlying covariate values. In the present example, the observed covariate $Z_j$ takes values 10, 30 or 50 for $j = 1$, 2, or 3 respectively (i.e. the mid-point of each category), whilst $X_j$ is interpreted as the ``true average value'' of NO$_2$ in group $j$. The response variable is binary, reflecting presence/absence of respiratory illness, and a logistic regression model is assumed. That is

$$y_j \sim \mathrm{Binomial}(p_j, n_j)$$
$$\mathrm{logit}(p_j) = \theta_1 + \theta_2 X_j$$

where $y_j$ is the number of children with reported illness out of the $n_j$ children in the $j$th exposure group, and $p_j$ is the probability of respiratory illness for children in that group. The regression coefficients $\theta_1$ and $\theta_2$ are given vague independent normal priors. The graphical model is shown in Figure 5.
Figure 5: Graphical model for air example
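The Berkson relationship described above fixes the prior distribution of the true exposure in each group once the observed category mid-point is known. The following Python sketch (not part of the original BUGS example; the function name `berkson_prior` is hypothetical) simply evaluates the prior mean, standard deviation and precision implied by the constants quoted in the text.

```python
import math

# Constants quoted in the text for the calibration (measurement error) model.
ALPHA = 4.48     # intercept
BETA = 0.76      # slope
SIGMA2 = 81.14   # error variance

# Observed category mid-points for exposure groups j = 1, 2, 3.
Z = [10.0, 30.0, 50.0]


def berkson_prior(z, alpha=ALPHA, beta=BETA, sigma2=SIGMA2):
    """Return (mean, sd, precision) of the true exposure given observed z.

    Hypothetical helper for illustration only. Under a Berkson model the
    true value is the regression on the observed value plus independent noise.
    """
    mean = alpha + beta * z
    sd = math.sqrt(sigma2)
    precision = 1.0 / sigma2  # BUGS dnorm takes a precision, not a variance
    return mean, sd, precision


prior = [berkson_prior(z) for z in Z]
for j, (mean, sd, prec) in enumerate(prior, start=1):
    print(f"group {j}: prior mean {mean:.2f}, sd {sd:.2f}, precision {prec:.5f}")
```

With the constants above, the prior means work out to 12.08, 27.28 and 42.48 for the three groups, each with a standard deviation of roughly 9.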
Model specification for air example
model air;
const
   alpha = 4.48,     # intercept of measurement error model
   beta = 0.76,      # slope of measurement error model
   sigma2 = 81.14,   # error variance of measurement error model
   J = 3;            # number of exposure levels for covariate
var
   theta[2],X[J],Z[J],mu[J],p[J],y[J],n[J],tau;
data y, n, Z in "air.dat";
inits in "air.in";
{
   # vague independent normal priors (second argument is a precision)
   theta[1] ~ dnorm(0.0,1.0E-3);
   theta[2] ~ dnorm(0.0,1.0E-3);
   tau <- 1/sigma2;   # precision of measurement error model
   for (j in 1:J) {
      mu[j] <- alpha + beta*Z[j];
      X[j] ~ dnorm(mu[j],tau);                 # Berkson measurement error model
      logit(p[j]) <- theta[1] + theta[2]*X[j];
      y[j] ~ dbin(p[j],n[j]);                  # logistic regression on true exposure
   }
}
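To make explicit what the model specification above defines, the following Python sketch evaluates the joint log density of the parameters, the latent exposures and the data, which equals the log posterior up to an additive constant. It is an illustration only, not how BUGS itself samples: the helper names are hypothetical, and the counts in the usage lines are made-up placeholders, not the contents of "air.dat".

```python
import math

ALPHA, BETA, SIGMA2 = 4.48, 0.76, 81.14   # constants from the model specification
Z = [10.0, 30.0, 50.0]                    # category mid-points quoted in the text


def log_dnorm(x, mean, prec):
    """Log normal density in the BUGS (mean, precision) parameterisation."""
    return 0.5 * math.log(prec / (2.0 * math.pi)) - 0.5 * prec * (x - mean) ** 2


def log_dbin(y, eta, n):
    """Log binomial pmf for y successes in n trials, with eta = logit(p)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    # log(1 + exp(eta)), written so that large |eta| cannot overflow
    softplus = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
    return log_choose + y * eta - n * softplus


def log_posterior(theta, x, y, n):
    """Unnormalised log posterior of (theta, x) given counts y out of n."""
    tau = 1.0 / SIGMA2
    total = sum(log_dnorm(t, 0.0, 1.0e-3) for t in theta)   # priors on theta
    for j in range(len(Z)):
        mu_j = ALPHA + BETA * Z[j]
        total += log_dnorm(x[j], mu_j, tau)                  # measurement error model
        total += log_dbin(y[j], theta[0] + theta[1] * x[j], n[j])  # logistic likelihood
    return total


# Usage with HYPOTHETICAL placeholder counts (not the real data set).
y_demo = [2, 3, 4]
n_demo = [10, 10, 10]
x_at_prior_mean = [ALPHA + BETA * z for z in Z]
value = log_posterior([0.0, 0.0], x_at_prior_mean, y_demo, n_demo)
print(value)
```

Writing the binomial term in terms of the linear predictor rather than the probability avoids taking the log of a probability that has rounded to 0 or 1.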
Analysis
After a 500-iteration burn-in, a further 2000 iterations took 8 seconds and produced the following output
These results should be compared with the plots shown by Stephens and Dellaportas (1992). The posterior mean for $\theta_2$ is also similar to that obtained by Whittemore and Keller (1988), although their maximum likelihood analysis yielded considerably smaller standard errors. In addition, note that the posterior mean estimates for $X_1$ and $X_2$ (the ``true average exposure'' to NO$_2$ in the low and medium groups) are close to the ``prior'' values of 10 and 30 selected by Whittemore and Keller. However, the value of $X_3$ is somewhat lower than its ``prior value'' of 50, largely because the posterior estimate is ``pulled in'' by the need to fulfil the linear logistic model assumption.
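One way to see where this ``pulling in'' comes from is to write down the full conditional distribution of the true exposure in group $j$, which follows directly from the model specification. This is a sketch for illustration rather than part of the original analysis; here $X_j$ denotes the true exposure, $Z_j$ the observed category mid-point, $\mu_j = \alpha + \beta Z_j$ the prior mean, $\tau = 1/\sigma^2$ the precision, and $y_j$ out of $n_j$ the observed illness counts.

```latex
p(X_j \mid \theta, y_j)
  \;\propto\;
  \underbrace{\exp\!\left\{ -\tfrac{\tau}{2}\,(X_j - \mu_j)^2 \right\}}_{\text{measurement error prior}}
  \;\times\;
  \underbrace{p_j^{\,y_j}\,(1 - p_j)^{\,n_j - y_j}}_{\text{binomial likelihood}},
\qquad
\operatorname{logit}(p_j) = \theta_1 + \theta_2 X_j .
```

The first factor centres $X_j$ on its prior mean, while the second favours values of $X_j$ for which the fitted probability on the linear logistic scale matches the observed proportion in the group. When the two disagree, the posterior for $X_j$ settles between them.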