Twelve of the OpenBUGS examples have been successfully validated using the Cook-Gelman-Rubin (CGR) testing method (Cook, S., Gelman, A., and Rubin, D. (2006). “Validation of Software for Bayesian Models Using Posterior Quantiles “, Journal of Computational and Graphical Statistics, 15, 675-690)
Method
The CGR method is applied to the model and prior distribution from an OpenBUGS example, but does not use the actual data from an example. The CGR approach is implemented by writing independent code, (we use R), to simulate the model parameters from their prior distribution. Conditional on the model parameters, a data set is simulated, also using R. The simulated data have sample size, missing data, etc. matching the real data. The two-step simulation (parameters, then data given parameters) is mathematically equivalent to simulating the data from its marginal distribution, and then simulating a single set of parameters from the distribution of the parameters given the data.
We can also create many simulations from the posterior distribution of the parameters given the simulated data set, which is what OpenBUGS does. Thus, the parameters simulated from the R code should be a single random sample from the simulated posterior distribution created by OpenBUGS if everything (OpenBUGS and our R code) is working correctly.
Using a single simulated random parameter to test for consistency with the OpenBUGS-derived posterior distribution yields a very weakly powered test for correctness of the code. Thus the sampling process is repeated many times (currently 40). The percentile of each R-simulated parameter is computed from the corresponding OpenBUGS posterior distribution. Percentiles (i.e., p-values) to follow the uniform (0,1) distribution, so our testing problem becomes one of testing whether the 40 observed percentiles follow a uniform (0,1) distribution. CGR recommend applying the inverse normal CDF to the percentiles to convert them to normal random variables, and then squaring each normal random variable and summing them. The resulting sum of squares should follow a chi square (df=40) distribution if everything is correct. CGR found that this test statistic was sensitive to many common computing mistakes.
The parameters in most examples are multivariate. We apply the CGR approach to each parameter individually. (This creates a substantial multiplicity problem when interpreting the tests). Once the R code to simulate the parameters and data has been created, the CGR approach is implemented using the R package BayesValidate (created by S. Cook).
Results for OpenBUGS 3.2.3
The p-value for each parameter from each example model/prior in is p-values. An overall graphical summary of the p-values from all of the examples was created by once again back-transforming the p-values by a inverse normal CDF to achieve approximate normality, to yield a normal QQ plot. Note 1) the QQ plot may be noisy as some of the p-values are correlated (same example, different parameters), and 2) there is a slight tendency for the observed (transformed) p-values to be too small, although the trend is much too slight to be noticed in any one example. We think this trend is due to unaccounted for noise in the percentile estimates from OpenBUGS, which are based on a finite number of simulations, rather than the exact percentiles assumed by the theory. Increasing the OpenBUGS burnin and subsequent sample sizes reduces the slight trend toward lower p-values.
Code for CGR testing
The R code and example files to execute the CGR testing are stored in the OpenBUGS SourceForge repository under the validation sub-directory. The code executing the OpenBUGS code is cgr.R. The results are stored in a .Rdata set and are post-processed by interactive execution of the code in cgrPost.R. These files should be executable with only a few changes to directory paths set at the beginning of the files. The code in cgr.R uses the R package BayesValidate, and the R function batchBUGS, which is stored in the validation subdirectory (batchBUGS.R). The default version of OpenBUGS in the batchBUGS parameter list should be updated for testing a new version.
The examples with supporting files are in the repository (cgr subdirectory of validation) in a directory called CookExamples. All of the OpenBUGS example files are included. The examples currently set up for CGR testing have several additional files added with matching example-name prefixes. Instructions to add CGR testing for another OpenBUGS example are in Instructions for a new example. Note especially that the diffuse prior distributions for many of the examples must be changed to more informative prior distributions to create simulated data sets similar to the real data sets.
(By Neal Thomas, Pfizer)