Abstract
Purpose:
Among all reported mutations underlying human inherited diseases, only a very small fraction lies in noncoding regions, largely because current detection methods are limited and the functional consequences of such mutations are difficult to predict and interpret. We set out to systematically evaluate the contribution of noncoding mutations in our RD cohort (LCA, RP, STGD, and Usher syndrome). A combination of statistical, bioinformatic, and experimental approaches was used to identify and validate mutations missed by exon capture sequencing (ExonCapSeq).
Methods:
Copy number variations (CNVs) were screened in 32 patients by custom-designed, high-density array comparative genomic hybridization (aCGH). Noncoding mutations were identified by custom-designed genomic capture sequencing. New bioinformatics tools were used to identify candidate noncoding mutations predicted to affect gene regulation, transcription, or translation. Experimental systems were established to assess the accuracy of these predictions.
Results:
Analysis of this large cohort revealed that the number of patients carrying single exonic mutations in known recessive RD genes is up to 5 times higher than in controls. In particular, 8 genes were enriched (p < 0.05) for single hits, and an additional 10 genes were suggestive of enrichment. No pathogenic CNVs were found. Analysis of the complete genomic sequences of these 18 genes in 129 patients identified potentially pathogenic noncoding mutations in 38 patients across 8 genes. Sixteen are splice mutations and 4 appear to create new miRNA binding sites. The remaining 18 noncoding mutations occurred multiple times in patients but were never observed in controls, allowing a statistical argument for their disease association. To validate the splice site mutations, RNA experiments were conducted, and 4 of the 5 mutations tested were confirmed to alter gene splicing. Functional validation of additional noncoding mutations is currently underway.
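As an illustration of the kind of statistical argument referred to above, a one-sided Fisher's exact test can compare carrier counts in patients and controls. This is a minimal sketch only: the abstract does not state which test was used, and the function name and all counts below are hypothetical, not study data.

```python
from math import comb


def fisher_one_sided(case_hits, case_total, ctrl_hits, ctrl_total):
    """One-sided Fisher's exact test (hypothetical helper, not the study's code).

    Returns the probability of observing at least `case_hits` carriers among
    patients, given the table margins, under the null hypothesis that
    carriers are equally frequent in patients and controls.
    """
    total_hits = case_hits + ctrl_hits
    n = case_total + ctrl_total
    # Number of ways to distribute all carriers among all individuals.
    denom = comb(n, total_hits)
    # Patients cannot carry more hits than there are patients or total hits.
    upper = min(case_total, total_hits)
    tail = sum(
        comb(case_total, k) * comb(ctrl_total, total_hits - k)
        for k in range(case_hits, upper + 1)
    )
    return tail / denom


# Hypothetical counts for one gene: 10 of 100 patients and 2 of 100 controls
# carry a single hit. These numbers are invented for illustration only.
p_value = fisher_one_sided(10, 100, 2, 100)
print(f"one-sided p = {p_value:.4f}")
```

The same function applies to a variant seen repeatedly in patients and never in controls by setting `ctrl_hits` to 0; with many genes or variants tested, a multiple-testing correction would also be needed.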
Conclusions:
Leveraging a large patient cohort, our study systematically evaluated the disease contribution of mutations that are undetectable by ExonCapSeq. We found that CNVs are likely a rare cause of RD. In contrast, mutations in noncoding regions can contribute to the inheritance of RD. Therefore, with the advent of whole-genome sequencing (WGS), it is increasingly important and feasible to annotate mutations in intronic regions.