Edwin M. Stone, Adam DeLuca, Todd E. Scheetz, Terry A. Braun, Louisa M. Affatigato, Heather T. Daggett, Rebecca M. Johnston, Megan R. Streb, Val C. Sheffield; Analysis of 200 Human Exomes for Improved Mutation Detection Specificity. Invest. Ophthalmol. Vis. Sci. 2011;52(14):3314.
© ARVO (1962-2015); The Authors (2016-present)
To use published exome sequence data from 200 individuals to better identify non-disease-causing polymorphisms in next-generation sequencing experiments. The reference human genome sequence is based upon the analysis of a small number of individuals, and thus many nucleotides in the reference will differ from the consensus sequence of a given population. Such variants appear as putative mutations when the exome of a research subject is compared against the reference. The lack of uniform allele frequency annotation in dbSNP, and the fact that a large number of true disease-causing variations are actually present in that database, limit the utility of dbSNP as the sole filter of non-disease-causing polymorphisms, especially for autosomal recessive disease.
The exomes of eight patients with autosomal recessive retinitis pigmentosa (RP) were sequenced using Agilent liquid-phase fragment capture and SOLiD sequencing. The resulting sequences were aligned and compared to the reference human sequence (hg19). The data from 200 exomes previously published by Li et al. (2010) were also analyzed using the same algorithms. Variants with frequencies greater than 2% in dbSNP, the 1000 Genomes Project or the 200 exome dataset were considered non-disease-causing.
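The frequency rule described above can be sketched in a few lines of Python. This is only an illustration of the stated 2% cutoff, not the authors' pipeline: the `Variant` class, the source keys (`dbsnp`, `thousand_genomes`, `exomes_200`) and the function names are hypothetical, and a missing frequency is assumed to mean "no evidence that the variant is common".

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# 2% allele-frequency cutoff, as stated in the abstract.
FREQ_THRESHOLD = 0.02

# Hypothetical keys for the three frequency sources named in the abstract.
SOURCES = ("dbsnp", "thousand_genomes", "exomes_200")


@dataclass
class Variant:
    """Hypothetical record for one variant observed in a patient exome."""
    gene: str
    change: str
    # Allele frequency per source; None or absent if the source has no annotation.
    freqs: Dict[str, Optional[float]]


def is_common_polymorphism(variant: Variant, threshold: float = FREQ_THRESHOLD) -> bool:
    """Return True if any source reports a frequency strictly above the threshold."""
    for source in SOURCES:
        freq = variant.freqs.get(source)
        if freq is not None and freq > threshold:
            return True
    return False


def candidate_variants(variants: List[Variant], threshold: float = FREQ_THRESHOLD) -> List[Variant]:
    """Keep only variants that no source flags as a common polymorphism."""
    return [v for v in variants if not is_common_polymorphism(v, threshold)]
```

Under these assumptions, a variant that is unannotated in dbSNP but seen at 5% in the 200 exome dataset would be filtered out, which is the situation the additional dataset is meant to catch.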
An average of 22.3 Mb of exome and flanking sequence was obtained at 50X or greater coverage from each of the eight RP patients. These individuals harbored an average of 19,758 sequence variations when compared to hg19. An average of 4,918 of these were predicted to alter the amino acid sequence of the encoded proteins or RNA splicing. Comparison of these variants with those in dbSNP, the 1000 Genomes Project and the 200 exome dataset allowed an average of 92.3% to be recognized as non-disease-causing polymorphisms. For recessive RP patients without affected relatives, the addition of the 200 exome filter reduced the number of putative disease genes requiring additional validation by an average of 44 genes (range 16 to 60).
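To give a sense of scale, the reported averages can be combined in a rough back-of-the-envelope calculation. The function name below is hypothetical, and the result is only an approximation: the abstract reports per-patient averages, so applying the average percentage to the average count will not exactly reproduce any individual patient's numbers.

```python
def estimate_remaining(total_variants: int, fraction_filtered: float) -> tuple:
    """Approximate how many variants are filtered out and how many remain."""
    filtered = round(total_variants * fraction_filtered)
    remaining = total_variants - filtered
    return filtered, remaining


# Average protein- or splice-altering variants per patient, and the average
# fraction recognized as non-disease-causing, both taken from the abstract.
filtered, remaining = estimate_remaining(4918, 0.923)
print(filtered, remaining)  # roughly 4539 filtered, 379 remaining per patient
```

On this estimate, a few hundred variants per patient would still need to be evaluated after filtering.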
The use of published exome data from 200 individuals as an additional filtering criterion results in a 15% improvement in the recognition of non-disease-causing polymorphisms.