
We are embarking on the systematic assessment of all exome-negative rare disease families who have consented to our large-scale genomic program, “Genomic Answers for Kids” (GA4K). GA4K aims to collect genomic data and health information from 30,000 children and their families. As has become clear in recent years, the true challenge lies in interpreting such data. With this in mind, GA4K employs an advanced set of genomic tools to uncover a greater number of candidate variants in these patients. These tools include novel sequencing technologies that test beyond the exome, as well as machine-learning analysis methods. We combined two publicly available tools to aid variant prioritization: Exomiser (Smedley et al. 2015; PMID: 26562621) and AMELIE (Birgmeier et al. 2020; PMID: 32434849). Both tools (hereafter E/A) rely on structured phenotyping with Human Phenotype Ontology (HPO) terms but apply algorithms that explore different features of the variants and genes. We therefore hypothesized that combining them would improve the speed and accuracy of our genomic data analysis.

We reviewed the combined top 50 ranked E/A candidate variants for each proband in the first 1090 cases. Bioinformatically selected variants were then manually prioritized based on multiple criteria (zygosity, segregation, population frequency, gene function, etc.). In ~40% of cases, the top variants selected from the combined E/A files were consistent with those identified by lengthy expert review. In addition, in ~6% of cases no strong E/A candidates were identified, but these cases were positive for a variant of a type that these tools would not have annotated (e.g. copy number variants, deep intronic variants, structural variants, or repeat expansions). Moreover, ~34% of cases were deemed negative by both expert analysis and combined E/A ranking, giving an overall consistency of ~80% (40% + 6% + 34%). Finally, in ~8% of cases, these tools pointed us towards new candidates that might not otherwise have been considered. Thus, combined analytical approaches using Exomiser and AMELIE are effective for variant and disease gene prioritization and for automated review of diagnostic interpretation. Furthermore, this ranking method helped prioritize which genes of unknown significance should be pursued in further investigations (e.g. through collaborations established on GeneMatcher, other testing technologies, or other modelling), and it provides a means to communicate larger sets of ranked candidate variants. Negative cases are further investigated using PacBio long-read sequencing, along with whole genome bisulphite sequencing and single-cell analysis (scATAC/scRNA sequencing).
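The abstract does not state how the two ranked lists were merged, so the following is only a minimal sketch of one plausible approach: take the union of the top N candidates from each tool, record each candidate's best rank, and list candidates flagged by both tools first. The function name `combine_top_ranked`, the plain-list input format, and the tie-breaking order are all assumptions made for illustration; they are not the GA4K pipeline and do not reflect the actual output formats of Exomiser or AMELIE.

```python
from typing import Dict, List, Tuple

# Hypothetical helper: not part of Exomiser, AMELIE, or the GA4K pipeline.
def combine_top_ranked(
    exomiser: List[str],
    amelie: List[str],
    top_n: int = 50,
) -> List[Tuple[str, int, List[str]]]:
    """Merge two ranked candidate lists (best candidate first).

    Returns (candidate, best_rank, tools) tuples for every candidate that
    appears in the top_n of at least one list.
    """
    best: Dict[str, Tuple[int, List[str]]] = {}
    for tool, ranking in (("Exomiser", exomiser), ("AMELIE", amelie)):
        for rank, candidate in enumerate(ranking[:top_n], start=1):
            if candidate not in best:
                best[candidate] = (rank, [tool])
                continue
            prev_rank, tools = best[candidate]
            if tool not in tools:
                tools = tools + [tool]
            best[candidate] = (min(prev_rank, rank), tools)

    # Assumed ordering: candidates supported by both tools first,
    # then by best rank, then alphabetically for a stable result.
    return sorted(
        ((c, r, t) for c, (r, t) in best.items()),
        key=lambda item: (-len(item[2]), item[1], item[0]),
    )


if __name__ == "__main__":
    # Placeholder identifiers, not real results.
    exomiser_ranked = ["GENE_A", "GENE_B", "GENE_C"]
    amelie_ranked = ["GENE_B", "GENE_D"]
    for candidate, rank, tools in combine_top_ranked(
        exomiser_ranked, amelie_ranked, top_n=2
    ):
        print(candidate, rank, "+".join(tools))
```

A merged list of this kind would still need the manual prioritization described above (zygosity, segregation, population frequency, gene function); the sketch only covers the step of pooling the two rankings.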

Experience Using A Combination Of Variant Prioritization Tools In A Large Rare Disease Cohort