Abstract
Purpose :
Population frequencies (e.g. gnomAD) and variant pathogenicity scoring tools (e.g. CADD) are widely used for variant filtration and prioritization in genetic diagnosis. However, their performance has never been tested in real patient cohorts. Using a cohort of over 3000 deeply sequenced patients with a range of genetic diseases, such as retinal dystrophy, epilepsy and bone marrow failure, and with varying ages of onset and inheritance modes, we aim to investigate whether different cutoffs should be considered under different circumstances.
Methods :
A novel algorithm, LI profiling, was developed to investigate genotype-phenotype relationships. Specifically, phenotypes are recorded using the Human Phenotype Ontology (HPO), and genotypes are defined by applying varying cutoffs to population (gnomAD) frequencies and a variant pathogenicity score (CADD phred). LI profiling is then used to seek patterns of phenotype enrichment for different types of genetic disease, such as early- versus late-onset and recessive versus dominant diseases.
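The genotype definition step described above can be illustrated with a minimal sketch. This is not the LI profiling algorithm itself, which is not specified in this abstract; it only shows how a grid of population frequency and CADD phred cutoffs might yield one variant set per cutoff pair. The variant records, field names (`gnomad_af`, `cadd_phred`) and cutoff values are all hypothetical.

```python
from itertools import product

# Hypothetical toy variants; identifiers, field names and values are illustrative only.
variants = [
    {"id": "v1", "gnomad_af": 0.0, "cadd_phred": 32.0},
    {"id": "v2", "gnomad_af": 0.0004, "cadd_phred": 18.5},
    {"id": "v3", "gnomad_af": 0.02, "cadd_phred": 25.1},
]


def select_variants(variants, max_af, min_cadd):
    """Return ids of variants with frequency <= max_af and CADD phred >= min_cadd."""
    return [
        v["id"]
        for v in variants
        if v["gnomad_af"] <= max_af and v["cadd_phred"] >= min_cadd
    ]


# Assumed cutoff grid; the cutoffs used in the study are not given in the abstract.
af_cutoffs = [0.0001, 0.001, 0.05]
cadd_cutoffs = [0, 15, 30]

# One genotype definition (set of qualifying variants) per cutoff pair.
genotype_grid = {
    (max_af, min_cadd): select_variants(variants, max_af, min_cadd)
    for max_af, min_cadd in product(af_cutoffs, cadd_cutoffs)
}
```

Each entry of `genotype_grid` could then be tested for enrichment of HPO terms among the patients carrying the selected variants.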
Results :
LI profiling revealed patterns with important clinical relevance. The CADD phred score, a common choice for predicting variant pathogenicity, is most effective for very early-onset diseases caused by de novo mutations (e.g. epilepsy caused by SCN1A). In contrast, stringent cutoffs on population frequency are effective for early-onset diseases regardless of inheritance mode (such as retinal dystrophy caused by CERKL or CRB1). Using this information, LI profiling is able to infer age of onset. LI profiling is also able to accurately infer inheritance mode. For example, it predicts that retinal dystrophies due to mutations in ABCA4 or USH2A tend to be inherited recessively, whilst GUCY2D and PROM1 are more likely to cause autosomal dominant retinal dystrophy.
Conclusions :
LI profiling is a useful tool for exploring large patient databases for novel genotype-phenotype relationships. It also provides practical guidance on cutoff choices based on the type of disease: e.g. for early-onset recessive diseases, a stringent population frequency cutoff should be used for variant filtration, whereas variant pathogenicity prediction tools should be used only as a reference for variant prioritization. For information not easily collected in clinics, such as inheritance mode, LI profiling can help produce reasonable estimates.
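The distinction between filtration and prioritization drawn above can be sketched as follows: the population frequency cutoff removes variants outright, while the CADD phred score only orders the variants that remain. This is a minimal illustration under assumed names and values, not the authors' pipeline; the candidate records, the field names and the `max_af` value are hypothetical.

```python
# Hypothetical candidate variants; identifiers, field names and values are illustrative only.
candidates = [
    {"id": "a", "gnomad_af": 0.00005, "cadd_phred": 12.0},
    {"id": "b", "gnomad_af": 0.0, "cadd_phred": 27.0},
    {"id": "c", "gnomad_af": 0.01, "cadd_phred": 35.0},
]


def filter_then_rank(variants, max_af):
    """Hard-filter on population frequency, then rank by CADD phred (highest first)."""
    kept = [v for v in variants if v["gnomad_af"] <= max_af]
    # The pathogenicity score only orders the survivors; it never excludes a variant.
    return sorted(kept, key=lambda v: v["cadd_phred"], reverse=True)


# Assumed stringent cutoff, chosen only for this example.
ranked = filter_then_rank(candidates, max_af=0.0001)
ranked_ids = [v["id"] for v in ranked]
```

In this toy example the common variant `c` is removed despite its high score, while the low-scoring rare variant `a` is retained at the bottom of the ranking rather than discarded.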
This is an abstract that was submitted for the 2018 ARVO Annual Meeting, held in Honolulu, Hawaii, April 29 - May 3, 2018.