
Evidence propagation

Lauritzen and Spiegelhalter (1988) introduce a fictitious "expert system" representing the diagnosis of a patient who presents at a chest clinic with dyspnoea (shortness of breath), having just come back from a trip to Asia. A graphical model for the underlying process is shown in Figure 23, where each variable is binary. The BUGS code is shown below, and the conditional probabilities used are those given in Lauritzen and Spiegelhalter (1988).

Figure 23: Graphical model for asia example

Asia: model specification in BUGS

model Asia;
var
   asia,smoking,tuberculosis,lung.cancer,bronchitis,either,xray,dyspnoea,
   p.asia[2],p.smoking[2],p.tuberculosis[2,2],p.bronchitis[2,2],
   p.lung.cancer[2,2],p.xray[2,2],p.dyspnoea[2,2,2];
data in "asia.dat";
{
   smoking      ~ dcat(p.smoking[]);
   tuberculosis ~ dcat(p.tuberculosis[asia,]);
   lung.cancer  ~ dcat(p.lung.cancer[smoking,]);
   bronchitis   ~ dcat(p.bronchitis[smoking,]);
   either      <- max(tuberculosis,lung.cancer);
   xray         ~ dcat(p.xray[either,]);
   dyspnoea     ~ dcat(p.dyspnoea[either,bronchitis,])
}

Note the use of max to implement the logical or: with each variable coded 1 (absent) or 2 (present), either equals 2 exactly when tuberculosis or lung.cancer equals 2. All initial values are computed by forward sampling, so no initial-value file is necessary. The dcat distribution samples values on the domain (1,2), with probabilities given by the relevant row of the conditional probability table. The S-Plus format has been used for the data file because the conditional probability tables have different dimensions and would otherwise require 4 separate data files in rectangular format.
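As a rough illustration of the forward-sampling step and of max acting as a logical or, the following Python sketch (not BUGS code) draws one set of values from the network. The function names dcat and forward_sample, the dictionary layout of the tables, and the coding 1 = absent, 2 = present are assumptions made for this example; the probabilities are those of the asia data file, read so that each row is indexed by the parent state.

```python
import random

def dcat(probs, rng):
    """Draw a category in 1..len(probs) with the given probabilities (1-based)."""
    u = rng.random()
    cumulative = 0.0
    for k, p in enumerate(probs, start=1):
        cumulative += p
        if u < cumulative:
            return k
    return len(probs)  # guard against floating-point rounding

# Conditional probability tables keyed by parent state (1 = absent, 2 = present).
P_SMOKING = [0.50, 0.50]
P_TUBERCULOSIS = {1: [0.99, 0.01], 2: [0.95, 0.05]}
P_LUNG_CANCER = {1: [0.99, 0.01], 2: [0.90, 0.10]}
P_BRONCHITIS = {1: [0.70, 0.30], 2: [0.40, 0.60]}
P_XRAY = {1: [0.95, 0.05], 2: [0.02, 0.98]}
P_DYSPNOEA = {(1, 1): [0.9, 0.1], (1, 2): [0.2, 0.8],
              (2, 1): [0.3, 0.7], (2, 2): [0.1, 0.9]}

def forward_sample(asia, rng):
    """Sample every node from its parents, in the order used in the model."""
    smoking = dcat(P_SMOKING, rng)
    tuberculosis = dcat(P_TUBERCULOSIS[asia], rng)
    lung_cancer = dcat(P_LUNG_CANCER[smoking], rng)
    bronchitis = dcat(P_BRONCHITIS[smoking], rng)
    either = max(tuberculosis, lung_cancer)  # 2 iff at least one parent is 2
    xray = dcat(P_XRAY[either], rng)
    dyspnoea = dcat(P_DYSPNOEA[(either, bronchitis)], rng)
    return {"smoking": smoking, "tuberculosis": tuberculosis,
            "lung_cancer": lung_cancer, "bronchitis": bronchitis,
            "either": either, "xray": xray, "dyspnoea": dyspnoea}

sample = forward_sample(asia=2, rng=random.Random(0))
```

This only shows how a starting state can be generated from the parents downwards; it does not condition on the observed dyspnoea, which is what the sampler itself is for.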

Data in S-Plus format for asia example

list(asia = 2, dyspnoea = 2,
     p.asia         = c(0.99, 0.01),
     p.tuberculosis = c(0.99, 0.01,
                        0.95, 0.05),
     p.bronchitis   = c(0.70, 0.30,
                        0.40, 0.60),
     p.smoking      = c(0.50, 0.50),
     p.lung.cancer  = c(0.99, 0.01,
                        0.90, 0.10),
     p.xray         = c(0.95, 0.05,
                        0.02, 0.98),
     p.dyspnoea     = c(0.9, 0.1,
                        0.2, 0.8,
                        0.3, 0.7,
                        0.1, 0.9)
)
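The flat vectors above have to be matched to the dimensions declared in the model, for example p.dyspnoea[2,2,2]. The Python sketch below assumes that the last index varies fastest, which is the only reading under which every slice p[parent,] sums to 1. The helper name reshape_row_major is hypothetical and is not part of BUGS or S-Plus.

```python
def reshape_row_major(flat, dims):
    """Nest a flat list into lists of shape dims, last index varying fastest."""
    if len(dims) == 1:
        if len(flat) != dims[0]:
            raise ValueError("length does not match dimensions")
        return list(flat)
    step = len(flat) // dims[0]
    return [reshape_row_major(flat[i * step:(i + 1) * step], dims[1:])
            for i in range(dims[0])]

# Values copied from the data file; dimensions taken from the var declaration.
p_tuberculosis = reshape_row_major([0.99, 0.01, 0.95, 0.05], (2, 2))
p_dyspnoea = reshape_row_major([0.9, 0.1, 0.2, 0.8, 0.3, 0.7, 0.1, 0.9],
                               (2, 2, 2))

# Python lists are 0-based, so the BUGS entry p.dyspnoea[either,bronchitis,]
# corresponds to p_dyspnoea[either - 1][bronchitis - 1].
row_either_only = p_dyspnoea[2 - 1][1 - 1]
```

Under this reading, p.tuberculosis[2,] is (0.95, 0.05), the distribution of tuberculosis for a patient who has visited Asia.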

The observed features (asia and dyspnoea) are given the value 2 in the data file. 100000 iterations (31 seconds) gave the following posterior probabilities (the exact values are given in brackets): smoking .625 (.626), tuberculosis .089 (.088), lung cancer .099 (.100), bronchitis .810 (.812), either .183 (.182), x-ray .220 (.220). Note that these probabilities are obtained by subtracting 1 from the posterior means of the variables smoking, tuberculosis, etc., which are actually defined on the domain (1,2).
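Because the network is small, the exact posterior can be checked by brute-force enumeration of the unobserved variables. The Python sketch below is one way to do this and is not the method used by BUGS; the names posterior_means and prob_yes are assumptions for illustration, and the tables are read with rows indexed by the parent state. It also shows why subtracting 1 works: for a variable V on (1,2), E[V] = 1*P(V=1) + 2*P(V=2) = 1 + P(V=2).

```python
from itertools import product

# Conditional probability tables keyed by parent state (1 = absent, 2 = present).
P_SMOKING = [0.50, 0.50]
P_TUBERCULOSIS = {1: [0.99, 0.01], 2: [0.95, 0.05]}
P_LUNG_CANCER = {1: [0.99, 0.01], 2: [0.90, 0.10]}
P_BRONCHITIS = {1: [0.70, 0.30], 2: [0.40, 0.60]}
P_XRAY = {1: [0.95, 0.05], 2: [0.02, 0.98]}
P_DYSPNOEA = {(1, 1): [0.9, 0.1], (1, 2): [0.2, 0.8],
              (2, 1): [0.3, 0.7], (2, 2): [0.1, 0.9]}

NAMES = ["smoking", "tuberculosis", "lung_cancer", "bronchitis", "either", "xray"]

def posterior_means(asia=2, dyspnoea=2):
    """Posterior mean of each unobserved node given asia and dyspnoea."""
    total = 0.0
    sums = dict.fromkeys(NAMES, 0.0)
    for s, t, l, b, x in product((1, 2), repeat=5):
        e = max(t, l)  # deterministic logical-or node
        # Joint probability of this configuration and the evidence.
        # p.asia is omitted: asia is observed, so it is a constant factor.
        w = (P_SMOKING[s - 1] * P_TUBERCULOSIS[asia][t - 1]
             * P_LUNG_CANCER[s][l - 1] * P_BRONCHITIS[s][b - 1]
             * P_XRAY[e][x - 1] * P_DYSPNOEA[(e, b)][dyspnoea - 1])
        total += w
        for name, value in zip(NAMES, (s, t, l, b, e, x)):
            sums[name] += w * value
    return {name: sums[name] / total for name in NAMES}

means = posterior_means()
# Posterior probability of state 2 is the posterior mean minus 1.
prob_yes = {name: mean - 1.0 for name, mean in means.items()}

for name in NAMES:
    print(f"{name:13s} {prob_yes[name]:.3f}")
```

With the tables as given, the enumerated probabilities should agree with the bracketed values above to within about 0.001.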



Daniel Farewell
Mon Sep 13 16:39:37 BST 1999