Many genetic variants have been identified as associated with disease-related traits, but the most significant genetic variant often does not itself influence these traits. Rather, it is linked with the underlying causal variant, which does play a role in the traits. Identifying the causal variants that underlie genetic associations is key to translating these findings into new therapeutic targets or revealing new biological insights into diseases. Selecting potential causal variants is a crucial step in deciding which variants to follow up in downstream functional validation experiments. Statistical fine-mapping is an approach to identifying potential causal variants; the goal is to refine the sets of potential causal variants, reducing the number of genetic variants for experimental follow-up. This refinement is often referred to as improving the resolution. As biologically related traits often have shared causal variants, joint fine-mapping that shares information between traits can be more accurate than fine-mapping each trait alone.
A new statistical method, called flexible and shared information fine-mapping (flashfm), published in Nature Communications, uses summary-level data to jointly fine-map signals from multiple traits, allowing for missing trait measurements and familial relatedness within the study. Flashfm is shown to improve fine-mapping resolution over current approaches, and it allows traits to have different causal variants: when there are no shared causal variants, it returns results similar to those of single-trait analyses.
Researchers from the MRC Biostatistics Unit at the University of Cambridge (Jennifer Asimit, Nicolas Hernandez, Paul Newcombe, Chris Wallace), University of Exeter (Inês Barroso, Jana Soenksen), and Imperial College London (Manj Sandhu) developed flashfm, motivated by the largest genome-wide association study (GWAS) of a single African population (Uganda). The Uganda GWAS includes measurements of 33 cardiometabolic traits (e.g. cholesterol levels), with trait measurements missing for some individuals and nearly half of the individuals being at least second-degree relatives. Given these challenges, flashfm was developed to be flexible, allowing missing trait values and familial relatedness, and to share information between traits. In the analysis of the Uganda data, flashfm gave a 20% reduction in the total number of potential causal variants compared with single-trait fine-mapping. This reduces the number of genetic variants that will be considered for experimental follow-up to better understand traits related to cardiometabolic disease susceptibility.
To illustrate the reduction in the number of potential causal variants, we give an example with simulated data for two traits, where we know the true causal variants that impact the traits. Here, traits 1 and 2 each have two causal variants, of which one is shared (blue) and the other is specific to trait 1 (dark green) or trait 2 (light green); see figure. Fine-mapping of each trait separately correctly finds that there are two sets of potential causal variants for each trait and that one set is common to both traits. The common set has 20 genetic variants, and the second set has 7 variants (trait 1) or 3 variants (trait 2). With flashfm applied to the two traits together, the common (blue) set is reduced from 20 variants to 7. For the variants that are not common to both traits, flashfm identifies the same sets of potential causal variants as independent fine-mapping, as there is no gain of information for variants that do not impact both traits. The sets constructed under both methods have been confirmed to contain the true causal variants. This shows that flashfm greatly reduces the number of potential causal variants for experimental follow-up, compared to independent analyses, while retaining the true causal variants.
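How a set of potential causal variants shrinks can be sketched with a generic construction that is common in statistical fine-mapping, often called a credible set. This is an illustrative assumption rather than flashfm's actual procedure or API: it supposes each variant has a posterior probability of being causal and takes the smallest group of top-ranked variants whose probabilities reach a chosen coverage. The function name, variant names, coverage level, and all numbers below are hypothetical and are not taken from the paper.

```python
def credible_set(posterior_probs, coverage=0.95):
    """Return the top-ranked variants whose normalised posterior
    probabilities together reach at least `coverage`.

    posterior_probs: dict mapping variant name -> posterior probability
    (hypothetical inputs, for illustration only).
    """
    total = sum(posterior_probs.values())
    # Rank variants from most to least probable.
    ranked = sorted(posterior_probs.items(), key=lambda kv: kv[1], reverse=True)
    chosen, cumulative = [], 0.0
    for name, prob in ranked:
        chosen.append(name)
        cumulative += prob / total
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical single-trait result: probability is spread over many variants.
single_trait = {"rsA": 0.30, "rsB": 0.25, "rsC": 0.20, "rsD": 0.15, "rsE": 0.10}

# Hypothetical joint result: sharing information between traits concentrates
# the probability on fewer variants, so the set becomes smaller.
joint = {"rsA": 0.80, "rsB": 0.17, "rsC": 0.02, "rsD": 0.005, "rsE": 0.005}

print(len(credible_set(single_trait)), "variants from single-trait analysis")
print(len(credible_set(joint)), "variants from joint analysis")
```

In this toy setting, the more concentrated the probabilities are, the fewer variants are needed to reach the coverage level, which is the sense in which resolution improves.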
Flashfm uses the same information as single-trait approaches (GWAS summary statistics, SNP correlation matrix), as well as the trait correlation matrix and results from single-trait fine-mapping. This new approach to multi-trait fine-mapping is computationally efficient and freely available as an R library at https://jennasimit.github.io/flashfm/.
This research is funded in whole or in part by the MRC, Wellcome Trust, NIHR Cambridge BRC, and Research England.
N Hernandez, J Soenksen, P Newcombe, M Sandhu, I Barroso, C Wallace, JL Asimit. (2021) The Flashfm Approach for Fine-mapping Multiple Quantitative Traits. Nature Communications. doi:10.1038/s41467-021-26364-y