Abstract
Purpose :
Better delineation of which gene-specific variants cause disease in inherited retinal diseases (IRDs) may help guide diagnosis and prognosis for patients with IRDs. To address this, we generated a database of variants and their associated deleteriousness. We then used it to examine long-read genome sequencing (LRS) data from IRD patients with missing heritability, to demonstrate that publicly available databases and scoring algorithms can assist in identifying previously unknown pathogenic variants.
Methods :
Variant details such as allele frequency, ClinVar classifications, and variant annotation were extracted from the Genome Aggregation Database (gnomAD) for a curated list of 373 genes implicated in IRDs. The variant files were scored using the Combined Annotation Dependent Depletion (CADD) algorithm to rank the deleteriousness of each variant. Targeted LRS data from IRD patients were then analyzed using the extracted variant features to provide a ranked list for disease-variant discovery.
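As a rough illustration of the ranking step described above, the following sketch filters a handful of variant records by allele frequency and sorts them by CADD PHRED score. The record fields (`cadd_phred`, `allele_freq`, `clinvar`), the 0.01 frequency cutoff, the example values, and the function name `rank_variants` are all assumptions made for illustration; none of them is taken from the study's pipeline.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Variant:
    # All fields are hypothetical placeholders, not the study's schema.
    gene: str
    variant_id: str
    cadd_phred: float        # CADD PHRED-scaled score
    allele_freq: float       # population allele frequency (e.g. from gnomAD)
    clinvar: Optional[str]   # ClinVar classification, if any


def rank_variants(variants: List[Variant], max_af: float = 0.01) -> List[Variant]:
    """Drop variants more common than max_af (an assumed cutoff), then sort
    by descending CADD score; ties are broken in favour of the rarer variant."""
    rare = [v for v in variants if v.allele_freq <= max_af]
    return sorted(rare, key=lambda v: (-v.cadd_phred, v.allele_freq))


# Made-up example records (not real variants).
example = [
    Variant("GENE_A", "var1", 24.3, 0.0001, None),
    Variant("GENE_A", "var2", 8.1, 0.2, "Benign"),
    Variant("GENE_B", "var3", 31.0, 0.00002, "Pathogenic"),
    Variant("GENE_B", "var4", 24.3, 0.00005, None),
]

ranked = rank_variants(example)
ranked_ids = [v.variant_id for v in ranked]
```

Here the common variant is removed by the frequency filter and the remaining records are ordered from highest to lowest CADD score, which is one plausible way to turn extracted features into a ranked list.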
Results :
A total of 7,855,922 single nucleotide and insertion/deletion variants across 373 genes were extracted from gnomAD. More than 97% of these variants (7,647,452) resided in non-coding regions, and only 208,470 were in coding regions. Only 5,331 of all the variants were implicated as potentially pathogenic by ClinVar. The CADD score ranking indicated that 98% of these ClinVar-implicated variants had scores of 15 or higher; of those at or above that threshold, only 16% were non-coding. Using the established analysis pipeline, 3 previously unsolved IRD cases were evaluated. In 2 cases, prior clinical exome-based panel testing had revealed monoallelic variants in autosomal recessive genes (PDE6B and ABCA4), and variants of uncertain significance could be re-evaluated to better establish pathogenicity. In the third case, an IRD patient without prior genetic testing, a pathogenic variant in CNGA1 was identified, and haplotagged ranking of additional variants narrowed the second variant down to a likely pathogenic non-coding variant.
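The haplotag-based narrowing in the third case can be pictured as follows. Assuming recessive inheritance, a second causal variant would be expected on the haplotype opposite the known pathogenic variant (in trans). The sketch below assumes each phased variant carries a haplotype label of 1 or 2 and a CADD score, and reuses the score threshold of 15 mentioned above. The dictionary keys, the example values, and the helper `candidates_in_trans` are hypothetical and do not reproduce the study's analysis.

```python
from typing import Dict, List


def candidates_in_trans(variants: List[Dict], known_id: str,
                        min_cadd: float = 15.0) -> List[Dict]:
    """Return variants phased to the haplotype opposite the known variant,
    with CADD >= min_cadd, sorted by descending CADD score."""
    known = next(v for v in variants if v["id"] == known_id)
    other_hap = 2 if known["haplotype"] == 1 else 1
    hits = [v for v in variants
            if v["haplotype"] == other_hap and v["cadd_phred"] >= min_cadd]
    return sorted(hits, key=lambda v: v["cadd_phred"], reverse=True)


# Made-up phased records (not real variants).
phased = [
    {"id": "known_path", "haplotype": 1, "cadd_phred": 35.0},
    {"id": "intronic_a", "haplotype": 2, "cadd_phred": 17.2},
    {"id": "intronic_b", "haplotype": 1, "cadd_phred": 22.0},
    {"id": "intronic_c", "haplotype": 2, "cadd_phred": 4.5},
    {"id": "utr_d", "haplotype": 2, "cadd_phred": 19.8},
]

trans_ids = [v["id"] for v in candidates_in_trans(phased, "known_path")]
```

Variants on the same haplotype as the known variant, and those scoring below the threshold, are excluded, leaving a short ranked list of candidates for the second allele.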
Conclusions :
Generating a database of ranked genomic variants in known disease-causing IRD genes allowed prioritization of whole-genome sequencing data, narrowing down potentially pathogenic variants from large-scale LRS data.
This abstract was presented at the 2024 ARVO Annual Meeting, held in Seattle, WA, May 5-9, 2024.