The Society organises an annual international conference each summer in the United Kingdom. The 2014 Conference took place in Sheffield from 1-4 September – for full details visit the conference blog.
The Conference used three of Sheffield’s illustrious venues in the heart of the city with the main event taking place at City Hall, the welcome reception in the Winter Garden and the conference dinner at Cutlers’ Hall.
“The RSS Conference provides a unique opportunity in the UK for statisticians and users of statistics of all ages and professional backgrounds to gather and exchange knowledge and experiences, whether in the formal conference sessions or in the many opportunities for networking at refreshment breaks or at evening social events.
A strength of the conference is the breadth and variety of its programme of talks and workshops, with sessions appealing to both theoretical and applied statisticians, those working in the areas of official, medical and environmental statistics (amongst many others), people working with data more generally, or indeed those with a general interest in the topic.” (Taken from the RSS website)
Several senior statisticians from the MRC Biostatistics Unit participated in the Contributed Sessions on Medical/bioinformatics, Meta-analysis and Bayesian methods. The Conference attracted over 400 participants ranging from senior academic statisticians through to new graduates and postgraduate students, with strong representation from the public sector as well as statisticians working in industry or as independent consultants.
The 2015 International Conference is scheduled to take place at the University of Exeter from 7-10 September 2015.
For further details visit the RSS2014 Conference webpage.
[toggle tag="h2" title="Contributed sessions – Medical/bioinformatics"]
Topics: Bioinformatics, Genomics & Biostatistics
Approaches for parametric time-to-event analysis with informative entry times, with application to an HCV study of time to cirrhosis from infection
Brian Tom, Vernon Farewell, Sheila Bird
MRC Biostatistics Unit, Cambridge, UK
In this talk, we examine maximum and pseudo score approaches for the analysis of prevalence data arising from a referral cohort where entry into the cohort is dependent on a subject’s residual fraction of time remaining to the event of interest, and inference on the incident population is required. Such data are believed to occur in hepatitis C virus (HCV) studies conducted in tertiary care settings, where HCV patients are more likely to be referred to specialist clinics at later stages of disease. The conventional truncation likelihood approach, which simply conditions on the time of entry into the cohort, does not work here as the referral time and time to the event are correlated. Ignoring this referral bias has led to higher rates of progression to cirrhosis being reported in studies in specialist clinics compared to those in community-based settings. As cirrhosis linked to HCV infection is a major epidemic of the 21st century, it is extremely important to get an accurate picture of the present and future disease burden facing affected regions in order to inform public health decisions and actions.
Topics: Medical, clinical trials and epidemiology
Accounting for nonignorable missingness: A simulation study comparing method performance under model misspecification
Finbarr Leacy1,2, Ian White1, Sian Floyd3, Tom Yates4
1MRC Biostatistics Unit, Cambridge, UK,
2University of Cambridge, Cambridge, UK,
3Department of Infectious Disease Epidemiology, Faculty of Epidemiology and Population Health, London School of Hygiene & Tropical Medicine, London, UK,
4Research Department of Infection and Population Health, University College London, London, UK
Objectives: This work seeks to compare the performance of maximum likelihood, Bayesian full probability modelling and multiple imputation approaches for handling nonignorable missingness in a single covariate under misspecification of the missingness and/or covariate models, assuming that the analysis model is a generalised linear model. It further aims to provide a framework for structured sensitivity analysis in this context.
Methods: We present results from a simulation study comparing the performance of the following methods: EM by the Method of Weights, Bayesian full probability modelling, multiple imputation with delta-adjustment and multiple imputation with re-weighting. We consider settings in which the covariate and missingness models are both correctly specified as well as settings in which at least one of these models is incorrectly specified. We assume that the analysis model is always correctly specified. We illustrate the methods using data from the Zambia South Africa Tuberculosis and AIDS Reduction trial, exploring the impact on inference of nonignorable missingness in the HIV test result variable across a range of sensitivity analyses.
Results: Misspecification of the covariate or missingness models can be associated with significant bias in estimates of the analysis model coefficients. Sensitivity analyses based on misspecified models can produce inferences that are inconsistent with those obtained under the true model.
Conclusions: As the true missingness mechanism cannot be determined on the basis of the observed data alone, sensitivity analyses that consider possible misspecification of the covariate and/or missingness models should form a central component of all practical analyses of incomplete data.
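One of the methods compared above, multiple imputation with delta-adjustment, can be illustrated with a minimal sketch. The data, imputation model and delta values below are invented for illustration and are not those of the ZAMSTAR analysis; the idea is simply that imputed values are shifted by a sensitivity parameter delta to represent departures from ignorable missingness:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated data: outcome y, continuous covariate x with MNAR missingness
n = 500
x = rng.normal(0, 1, n)
y = 1.0 + 0.5 * x + rng.normal(0, 1, n)
miss = rng.random(n) < 1 / (1 + np.exp(-(x - 0.5)))  # missingness depends on x itself
x_obs = x.copy()
x_obs[miss] = np.nan

def mi_delta(delta, m=20):
    """Multiple imputation with delta-adjustment for a continuous covariate."""
    cc = ~np.isnan(x_obs)
    # Imputation model fitted to complete cases: x regressed on y
    beta = np.polyfit(y[cc], x_obs[cc], 1)
    sigma = np.std(x_obs[cc] - np.polyval(beta, y[cc]))
    ests = []
    for _ in range(m):
        xi = x_obs.copy()
        # Shift imputed values by delta to represent nonignorable missingness
        xi[~cc] = np.polyval(beta, y[~cc]) + delta + rng.normal(0, sigma, (~cc).sum())
        # Analysis model: y ~ x; store the slope estimate from this imputed dataset
        ests.append(np.polyfit(xi, y, 1)[0])
    return np.mean(ests)  # pooled point estimate (point part of Rubin's rules only)

# A sensitivity analysis scans delta over a plausible range
for d in (-0.5, 0.0, 0.5):
    print(d, round(mi_delta(d), 3))
```

In a full analysis the within- and between-imputation variances would also be pooled via Rubin's rules; only the point estimate is shown here to keep the sketch short.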
[toggle tag="h2" title="Contributed sessions – Meta-analysis"]
Topic: Medical, clinical trials and epidemiology
Bayesian meta-analysis without MCMC
Kirsty Rhodes1, Rebecca Turner1, Dan Jackson1, Julian Higgins2,3
1MRC Biostatistics Unit, Cambridge, UK, 2University of Bristol, Bristol, UK, 3University of York, York, UK
Background: Many meta-analyses combine results from only a small number of studies, a situation in which between-study variance is imprecisely estimated when standard methods are applied. Bayesian meta-analysis allows incorporation of external evidence on heterogeneity, providing the potential for more robust inference on the effect size of interest.
Methods: We propose two methods for performing Bayesian meta-analysis, using data augmentation and importance sampling techniques. Both methods are implemented in standard statistical software and provide much less complex alternatives to Markov chain Monte Carlo (MCMC) approaches. In a simulation study, we compare the performance of the proposed methods.
Results: An importance sampling approach produces almost identical results to standard MCMC approaches, and results obtained through data augmentation are very similar. We compare the methods formally and also apply them to real datasets. For example, a meta-analysis combining four studies evaluating the effectiveness of fluoride for lower limb pain is considered. In a conventional random-effects meta-analysis, the between-study variance τ² is high at 1.78, but very imprecisely estimated (95% CI 0.39 to 52.2). The estimated summary odds ratio is 4.14 (95% CI 0.92 to 18.4). When incorporating an informative inverse-gamma prior for τ² using importance sampling, the heterogeneity estimate reduces to 0.54, with 95% credible interval 0.04 to 5.33. The summary odds ratio changes to 3.46 (95% CI 1.17 to 14.3).
Conclusion: The proposed methods facilitate Bayesian meta-analysis in a way that is accessible to the applied researchers who commonly carry out meta-analyses.
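The importance sampling idea can be sketched in a few lines by using the prior on τ² as the proposal distribution, so that the importance weights reduce to the marginal likelihood of each draw (with the overall effect μ integrated out under a flat prior). The study data and prior parameters below are invented for illustration and are not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical study-level estimates (e.g. log odds ratios) and their variances
y = np.array([0.5, 1.2, 0.1, 0.9])
v = np.array([0.20, 0.35, 0.15, 0.30])

# Informative inverse-gamma(a, b) prior on tau^2 (values are illustrative)
a, b = 2.0, 0.5
M = 20000
tau2 = 1.0 / rng.gamma(shape=a, scale=1.0 / b, size=M)  # draws from the prior

def log_marginal_lik(t2):
    """Log marginal likelihood p(y | tau^2), mu integrated out under a flat prior."""
    w = 1.0 / (v + t2)                  # precision contributed by each study
    mu_hat = np.sum(w * y) / np.sum(w)  # conditional posterior mean of mu
    return (0.5 * np.sum(np.log(w))
            - 0.5 * np.log(np.sum(w))
            - 0.5 * np.sum(w * (y - mu_hat) ** 2))

# Importance weights: target ∝ prior × likelihood, proposal = prior,
# so the weight of each draw is its (normalised) marginal likelihood
logw = np.array([log_marginal_lik(t2) for t2 in tau2])
wts = np.exp(logw - logw.max())
wts /= wts.sum()

# Posterior summaries as weighted averages over the prior draws
tau2_post_mean = np.sum(wts * tau2)
mu_hats = np.array([np.sum(y / (v + t2)) / np.sum(1.0 / (v + t2)) for t2 in tau2])
mu_post_mean = np.sum(wts * mu_hats)
print(tau2_post_mean, mu_post_mean)
```

No MCMC is involved: everything is independent sampling followed by weighted averaging, which is why the approach is easy to implement in standard software.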
Topic: Medical, clinical trials and epidemiology
Incorporating external information on between-study heterogeneity in network meta-analysis
Rebecca Turner, Kirsty Rhodes, Dan Jackson, Ian White
MRC Biostatistics Unit, Cambridge, UK
Objectives: In a network meta-analysis comparing multiple treatments, between-study heterogeneity variances are often very imprecisely estimated because data are sparse, and so standard errors can be highly unstable. External evidence obtained from modelling a large database of meta-analyses can provide informative prior distributions for heterogeneity, tailored to particular settings. Our objectives are to explore and compare approaches for specifying informative priors for multiple heterogeneity variances in network meta-analysis.
Methods: In the simplest model assuming equal heterogeneity variances, it is straightforward to construct an informative prior for the common variance. Models allowing heterogeneity variances to be unequal are more realistic; however, care must be taken to ensure that implied variance-covariance matrices remain positive semi-definite, as discussed by Lu and Ades (2009). We consider several strategies for specifying informative priors for multiple heterogeneity variances: proportional relationships among the variances; exchangeability across similar treatment comparisons; or separate priors.
Results: Appropriate prior distributions are obtained through modelling empirical data from the Cochrane Database of Systematic Reviews. The models are applied to a network meta-analysis comparing eight treatments for localised prostate cancer. For example, the odds ratio comparing surgery against standard care is estimated as 0.78 (95% CrI 0.35, 1.64) when using vague priors. This changes to 0.79 (95% CrI 0.62, 1.01) when specifying informative priors assuming proportional heterogeneity variances, or to 0.78 (95% CrI 0.54, 1.10) when assuming exchangeability of variances across similar treatment comparisons.
Conclusions: It is possible to incorporate relevant prior information on heterogeneity into network meta-analyses, without making unrealistic assumptions. This may improve precision for estimating treatment differences.
[toggle tag="h2" title="Contributed sessions – Bayesian Methods"]
Topics: Statistical methods and theory
Sample size and classification error for Bayesian change-point models with unlabelled sub-groups and incomplete follow-up
Simon White1, Graciela Muniz-Terrera2, Fiona Matthews1
1MRC Biostatistics Unit, Cambridge, UK,
2MRC Unit for Lifelong Health and Ageing, London, UK
Summary: Many medical (and ecological) processes involve a change of shape, whereby one trajectory changes into another at a specific time point. There has been little investigation of the study designs needed to fit these models.
We consider the class of fixed effect change-point models with an underlying shape comprising two joined linear segments, also known as broken-stick models. We extend the model to include two sub-groups that exhibit a different shift at the change-point (a change and a no-change class) and a missingness model leading to individuals with incomplete follow-up.
Through a simulation study we consider the relationship of sample size to the estimates of the underlying shape, the existence of a change-point, and the classification error of sub-group labels.
In summary, estimation of a fixed change-point was acceptable with relatively small numbers of individuals (150) observed after the change-point, given initial sample sizes of two to five hundred.
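A minimal version of such a simulation can be sketched as follows. The broken-stick parameters, sample size and simple grid-search estimator below are illustrative choices, not the authors' Bayesian design, and the model omits the latent sub-groups and missingness components for brevity:

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate broken-stick trajectories: the slope changes at the true change-point
n = 100
times = np.arange(0.0, 10.5, 0.5)
cp_true, b0, b1, b2 = 5.0, 10.0, -0.2, -0.8  # b2 = additional slope after the change
T = np.tile(times, n)
ids = np.repeat(np.arange(n), len(times))
y = (b0 + rng.normal(0, 0.5, n)[ids]            # random intercept per individual
     + b1 * T
     + b2 * np.clip(T - cp_true, 0, None)       # hinge term active after cp_true
     + rng.normal(0, 0.5, T.size))               # measurement noise

def rss(cp):
    """Residual sum of squares of the pooled broken-stick fit for a candidate change-point."""
    X = np.column_stack([np.ones_like(T), T, np.clip(T - cp, 0, None)])
    _, res, *_ = np.linalg.lstsq(X, y, rcond=None)
    return res[0]

# Profile the change-point over a grid and pick the best-fitting value
grid = np.linspace(2.0, 8.0, 121)
cp_hat = grid[np.argmin([rss(c) for c in grid])]
print(cp_hat)
```

Wrapping this in an outer loop over sample sizes and repeated simulations gives the kind of design curve the abstract describes: how estimation accuracy degrades as fewer individuals are observed after the change-point.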
[toggle tag="h2" title="Poster presentation"]
An approach for summarising the association of multiple correlated features
Marina Evangelou1, John Todd1, Chris Wallace1,2
1University of Cambridge, Cambridge Institute for Medical Research, Cambridge, UK
2MRC Biostatistics Unit, Cambridge, UK
Abstract: Summarizing the association of multiple correlated features with a single response variable is a commonly faced challenge in the area of statistical genomics. Several methods have been proposed for combining the association of multiple independent features into a single statistic. For example, Fisher’s product method is powerful, but the null distribution of its statistic does not hold for correlated features. Permutation procedures are usually employed to find the null distribution of the chosen statistic; these can be computationally intensive and require access to the raw data, which are not always available.
We have adapted an alternative method for finding the null distribution of the chosen statistic. The correlation structure of the tested features is taken as the covariance matrix of a multivariate Normal distribution. Z-scores for these features are then drawn from this distribution, P-values are subsequently calculated, and the chosen statistic is recomputed using these simulated P-values.
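The recipe above can be sketched in a few lines using Fisher's product statistic. The correlation matrix and observed P-values below are invented for illustration (in the genetic application the correlation would come from linkage disequilibrium between SNPs):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical correlation matrix for 5 correlated features
R = 0.6 ** np.abs(np.subtract.outer(np.arange(5), np.arange(5)))

# Observed (made-up) P-values for the 5 features
p_obs = np.array([0.01, 0.03, 0.20, 0.04, 0.50])
fisher_obs = -2.0 * np.sum(np.log(p_obs))

# Null distribution: draw Z-scores from MVN(0, R), convert to two-sided
# P-values, and recompute Fisher's statistic for each draw
M = 50000
z = rng.multivariate_normal(np.zeros(5), R, size=M)
p_sim = 2.0 * stats.norm.sf(np.abs(z))
fisher_sim = -2.0 * np.sum(np.log(p_sim), axis=1)

# Empirical combined P-value that accounts for the correlation
p_combined = (np.sum(fisher_sim >= fisher_obs) + 1) / (M + 1)
print(p_combined)
```

Unlike a permutation procedure, this requires only the correlation matrix and the per-feature P-values, not the raw data.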
We have explored this approach in the setting of a genome-wide association study, where the association of a gene is found by combining the association of correlated SNPs located near the gene with the phenotype of interest. We demonstrate a very high correspondence between the results found through permutation and our proposed approach (Spearman correlation ρ ≥ 0.90). Alternative areas of application include gene expression experiments where the interest is in the association of modules of correlated genes with the phenotype. We compare the simulation approach to established alternatives based on generating a univariate summary for each module, usually through principal components.