A team of experts from the MRC Biostatistics Unit at the University of Cambridge (Jennifer Asimit, Feng Zhou), University of Manchester (Andrew Morris), University of Exeter (Inês Barroso), MRC Uganda/LSHTM (Opeyemi Soremekun, Segun Fatumo), and Harvard University / University of the Witwatersrand (Tinashe Chikowore) have developed the first multi-trait multi-group fine-mapping method, published in Nature Communications.
Finding new therapeutic targets or revealing new biological insights for diseases is guided by the construction of shortlists of causal variants that likely underlie identified genetic associations with diseases and their related traits. Aided by statistical fine-mapping, shortlisting of potential causal variants is a crucial step in deciding which variants to follow up in downstream functional validation experiments.
Since biologically related traits often share some causal variants, borrowing information between traits through joint fine-mapping could further refine such shortlists, which reduces the number of genetic variants for experimental follow-up. These shortlists could also be refined by joint analyses of multiple population groups, by taking advantage of the differences in genetic structures between groups. Until now, no statistical methods for prioritising variants have been developed that leverage both multiple traits and multiple population groups.
This new statistical method, called MGflashfm (multi-group flexible and shared information fine-mapping), uses summary-level data to jointly fine-map signals from multiple traits and multiple population groups. It accommodates missing trait measurements and allows traits to have different causal variants. Crucially, it does not restrict analysis to genetic variants that are present in all population groups, as most multi-group methods do. This means that a genetic variant that is very rare in some groups but has an impact on a trait in a subset of groups will still be prioritised by MGflashfm.
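To make the difference concrete, here is a minimal sketch in Python of the two ways a multi-group analysis can choose which variants to consider. It is not MGflashfm code and does not use its R interface. The group names, variant IDs and helper functions are all hypothetical, chosen only to illustrate why restricting to variants seen in every group can discard a variant that matters in just one.

```python
# Hypothetical toy data: variant IDs with summary statistics available in each group.
variants_by_group = {
    "group_A": {"rs1", "rs2", "rs3"},
    "group_B": {"rs1", "rs2"},
    "group_C": {"rs1", "rs3", "rs4"},
}


def shared_variants(groups):
    """Variants present in every group (the restriction many multi-group methods impose)."""
    return set.intersection(*groups.values())


def all_variants(groups):
    """Variants present in at least one group."""
    return set.union(*groups.values())


def groups_carrying(groups, variant):
    """Names of the groups in which a given variant is present, in sorted order."""
    return sorted(name for name, ids in groups.items() if variant in ids)


intersection = shared_variants(variants_by_group)
union = all_variants(variants_by_group)

# Variants that an intersection-only analysis would never evaluate.
dropped = union - intersection

print("Kept under intersection:", sorted(intersection))
print("Kept under union:", sorted(union))
for variant in sorted(dropped):
    print(variant, "is present only in", groups_carrying(variants_by_group, variant))
```

In this toy example, `rs4` is present in only one group, so an intersection-only analysis would never consider it, whereas an analysis over the union keeps it as a candidate.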
The researchers used MGflashfm to prioritise causal variants for four lipid traits (e.g. LDL cholesterol) in five population groups (1.65 million people in total from the Global Lipids Genetics Consortium). Because the new method allows for multiple causal variants and for variants that are not present in all population groups, it prioritised causal variants that appear to be jointly causal and/or are absent from a subset of the population groups. Such variants are not detectable by current multi-group approaches.
Lead author and Senior Research Associate at the BSU, Jennifer Asimit, said:
“By removing common restrictions on the input data and taking advantage of the prioritisation boosts from sharing information between traits and from differences in genetic patterns between populations, we developed MGflashfm to better pinpoint variants shared between a subset of cohorts.”
MGflashfm uses only summary-level genome-wide association study (GWAS) data, making it a useful fine-mapping tool for consortium efforts where individual-level data cannot be shared. It is computationally efficient and freely available as an R package at https://jennasimit.github.io/MGflashfm.
This research is funded in whole or in part by the MRC, Wellcome Trust, and Research England.
Read the full paper published in Nature Communications: F Zhou, O Soremekun, T Chikowore, S Fatumo, I Barroso, AP Morris, JL Asimit. (2023). Leveraging information between multiple population groups and traits improves fine-mapping resolution. Nature Communications 14, 7279. https://doi.org/10.1038/s41467-023-43159-5