The increasing sophistication of high-throughput molecular technologies offers exciting possibilities for systems-level analysis of biological systems. Yet such novel and diverse data are frequently accompanied by significant statistical and computational challenges.
The Workshop on Statistical Systems Biology, organised by the Centre for Research in Statistical Methodology at the University of Warwick and held from 9 to 11 December 2014, aimed to bring together researchers working at the interface of systems biology, mathematical modelling and statistics, in order to make better use of emerging data through state-of-the-art statistical methodologies.
Sach Mukherjee, Programme Leader at the MRC Biostatistics Unit, was among the keynote speakers, who also included:
- Nigel Burroughs, Mathematics Institute and Systems Biology Centre, Warwick.
- Ramon Grima, Reader in Stochastic Systems Biology, Edinburgh.
- Mustafa Khammash, Control Theory and Systems Biology, ETH Zurich.
- Walter Kolch, Director of Systems Biology Ireland and Director of Conway Institute, University College Dublin.
- John Lygeros, Head of the Automatic Control Laboratory, ETH Zurich.
- David Rand, Director of the Systems Biology Centre, Warwick.
- Darren Wilkinson, School of Mathematics and Statistics, Newcastle.
Steven Hill, an Investigator Statistician within Sach Mukherjee’s group at the MRC Biostatistics Unit, also gave a contributed talk.
Sach’s talk focused in particular on empirical assessment of causal network inference and his team’s work on the DREAM challenge*. Steven’s talk focused on statistical and computational methods for causal network inference and showed results from applying these methods to infer protein signalling networks in breast cancer (work carried out in collaboration with biologists at Oregon Health and Science University).
Title: Towards empirical assessment of causal inference
Dr. Sach Mukherjee – MRC Biostatistics Unit
Abstract: Sophisticated computational and statistical methods are routinely used to make inferences about the edge structure of molecular networks. Molecular networks are often intended to encode causal relationships between variables and then the object of inference is in effect a causal graph. It is well known from the causal inference literature that strong – and possibly untestable – assumptions are needed to justify causal inference from first principles. Furthermore, causal inference can easily be led astray by factors such as unobserved confounders and certain additional factors specific to systems biology may exacerbate these concerns. How then can we tell whether network learning methods are really effective in a given setting? I will discuss our recent efforts to develop empirical approaches by which to assess causal network learning using experimental data. These approaches were used in the 2013 DREAM network inference challenge and I will use data and results from the challenge to illustrate the key ideas.
Title: Data-driven inference of causal molecular networks and systematic assessment of inference performance
Steven Hill, Sach Mukherjee, Nicole K. Nesser, Paul T. Spellman
MRC Biostatistics Unit, Oregon Health and Science University
Abstract: Causal interplay between molecular components is central to regulation of cellular behaviour, and it is increasingly clear that molecular networks may depend on biological context, such as cell type or disease state. Therefore, in conjunction with appropriate experimental designs, there is a need for robust and scalable statistical approaches for inference of context-specific, causal networks. In this work we focus on protein signalling networks in breast cancer and utilise directed graphical models known as dynamic Bayesian networks (DBNs) to infer networks from time course proteomics data with interventions on network nodes. Our approach models the intervention conditions in the data, allows for integration of existing biology within a Bayesian framework, and exploits a connection between variable selection and network inference to enable exact, yet efficient, calculation of posterior probabilities of interest. Due to the challenging nature of causal network inference, it is necessary to empirically assess the ability of methods to recover causal relationships. Methods are often assessed using simulated data, where a gold-standard causal network structure is available. However, for real-world systems such as that under study here, no gold standard is available. We propose an approach that leverages the interventional data to perform systematic assessment of inferred causal networks and use this approach to empirically test our analyses. Furthermore, we see evidence that signalling network structure does indeed depend on biological context.
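As a rough illustration of the connection between variable selection and network inference mentioned in the abstract, the sketch below treats each node of a DBN separately: its candidate parents are the nodes at the previous time point, and the posterior probability of an edge is obtained by summing posterior weights over all parent sets that contain it. This is not the authors' implementation. The function name `edge_posteriors`, the cap on the number of parents, the flat prior over parent sets, and the use of BIC in place of a closed-form marginal likelihood are all assumptions made for the example, and intervention conditions are not modelled.

```python
import itertools
import numpy as np

def edge_posteriors(X, target, max_parents=2):
    """Approximate posterior probability that each node at time t-1
    is a parent of `target` at time t.

    X is a (time points x nodes) array from a single time course.
    Assumptions: linear-Gaussian model, flat prior over parent sets,
    BIC used as a stand-in for the log marginal likelihood.
    """
    T, p = X.shape
    y = X[1:, target]      # response: target node at time t
    Z = X[:-1, :]          # predictors: all nodes at time t-1
    n = T - 1
    subsets, log_scores = [], []
    # Enumerate every parent set up to the in-degree cap.
    for k in range(max_parents + 1):
        for subset in itertools.combinations(range(p), k):
            design = np.column_stack([np.ones(n)] + [Z[:, i] for i in subset])
            coef, *_ = np.linalg.lstsq(design, y, rcond=None)
            rss = float(np.sum((y - design @ coef) ** 2))
            bic = n * np.log(rss / n) + design.shape[1] * np.log(n)
            subsets.append(subset)
            log_scores.append(-0.5 * bic)
    log_scores = np.array(log_scores)
    # Normalise scores into posterior weights over parent sets.
    weights = np.exp(log_scores - log_scores.max())
    weights /= weights.sum()
    # Edge posterior = total weight of parent sets containing that node.
    post = np.zeros(p)
    for subset, w in zip(subsets, weights):
        for i in subset:
            post[i] += w
    return post

# Simulated toy time course (hypothetical): node 0 drives node 1, node 2 is noise.
rng = np.random.default_rng(0)
T = 60
X = np.zeros((T, 3))
X[0] = rng.normal(size=3)
for t in range(1, T):
    X[t, 0] = 0.5 * X[t - 1, 0] + rng.normal()
    X[t, 1] = 0.9 * X[t - 1, 0] + 0.3 * rng.normal()
    X[t, 2] = rng.normal()

post = edge_posteriors(X, target=1)
print(post)  # one probability per candidate parent of node 1
```

Because the parent sets of each node are scored independently, the enumeration grows polynomially in the number of nodes for a fixed in-degree cap, which is what makes summing over all parent sets feasible in a sketch like this.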
About the DREAM Challenge
Working with a team of colleagues from the UK, Europe and the US, members of Sach Mukherjee’s research team helped to organise the 2013 HPN-DREAM network inference challenge, aimed at assessing computational methods for learning causal networks, focussing in particular on molecular networks called protein signalling networks (see Figure 1b).
“The DREAM project (Dialogue for Reverse Engineering and Assessment of Methods) organises competitive – yet collaborative – challenges that pose questions relevant to biomedicine and invite researchers from around the world to come up with computational methods to address the question. Such challenges are a great way to focus the research community’s attention on a particular problem and to assess diverse methods, contributed by many different teams, in a careful and consistent fashion. In the HPN-DREAM challenge, participants were asked to infer causal protein signalling networks using data obtained from cancer cells (HPN, the Heritage Provider Network, was a sponsor of the challenge).
For the challenge, we developed an approach to assess computational methods for learning networks, focusing on the accuracy of the methods to discern causal relationships. This is challenging for the simple reason that the true underlying network is typically unknown in disease biology. We therefore developed an approach that used additional so-called “test” data (from experiments that were not in the data originally provided to challenge participants) to empirically test causal predictions derived from the participants’ networks.” Steven Hill
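The idea of testing causal predictions against held-out interventional data can be sketched in a few lines. In this simplified example, an inferred network predicts that inhibiting a node should affect all of its descendants, and that prediction is compared with the set of nodes that actually changed in the held-out "test" experiment. The node names, the function names, and the use of precision and recall as the score are assumptions made for illustration; this is not the scoring procedure used in the challenge itself.

```python
from collections import deque

def descendants(edges, source):
    """Return all nodes reachable from `source` along directed edges."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)
    seen = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in seen and child != source:
                seen.add(child)
                queue.append(child)
    return seen

def score_prediction(edges, inhibited, changed_in_test):
    """Compare predicted descendants of the inhibited node with the
    nodes observed to change in the held-out test experiment."""
    predicted = descendants(edges, inhibited)
    true_pos = len(predicted & changed_in_test)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(changed_in_test) if changed_in_test else 0.0
    return precision, recall

# Hypothetical inferred network and hypothetical test-data result.
inferred_edges = [("A", "B"), ("B", "C"), ("D", "C")]
changed = {"B", "D"}  # nodes that changed when "A" was inhibited
precision, recall = score_prediction(inferred_edges, "A", changed)
print(precision, recall)
```

The comparison is made on descendants rather than direct edges because an inhibition experiment reveals which nodes lie downstream of the target, not which links are direct.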
For further details about the DREAM challenge please visit: