June 2013
Volume 54, Issue 15
ARVO Annual Meeting Abstract  |   June 2013
Detection of sample contamination in clinical next-generation sequencing
Author Affiliations & Notes
  • Todd Scheetz
    Ophthalmology, University of Iowa, Iowa City, IA
    Biomedical Engineering, University of Iowa, Iowa City, IA
  • Adam DeLuca
    Biomedical Engineering, University of Iowa, Iowa City, IA
  • Edwin Stone
    Ophthalmology, University of Iowa, Iowa City, IA
  • Terry Braun
    Ophthalmology, University of Iowa, Iowa City, IA
    Biomedical Engineering, University of Iowa, Iowa City, IA
  • Footnotes
    Commercial Relationships: Todd Scheetz, None; Adam DeLuca, None; Edwin Stone, None; Terry Braun, Alcon Research, LTD (F)
    Support: None
Investigative Ophthalmology & Visual Science June 2013, Vol.54, 3378. doi:
Abstract
 
Purpose
 

To detect contamination of genomic DNA samples used in next-generation sequencing applications. To take advantage of the abundance of third-party sequencing services, it is important to ensure that any variations detected come from the correct patient sample. Known genotype fingerprints can help validate sample identity, but additional quantitative measures are required to ensure sample integrity.

 
Methods
 

The Genome Analysis Toolkit (GATK) from the Broad Institute was used to call variations. For each variation, the relative number of supporting reads (supporting reads / coverage) was calculated. The distribution of this relative number across all variants in a sample was compared to a distribution derived from a cohort of control samples. Contamination was detected as an increase in variations with a relative number of supporting reads below 35%.
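The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the abstract does not say how the sample distribution was compared with the control cohort, so the z-score test and its cutoff of 3.0 are assumptions, as are all function names. Inputs are (supporting reads, coverage) pairs per variant, which could be taken, for example, from the AD and DP fields of a GATK VCF.

```python
import statistics
from typing import Iterable, Sequence, Tuple

# Threshold taken from the Methods text: variants whose supporting
# fraction falls below 35% count toward the contamination signal.
LOW_FRACTION_CUTOFF = 0.35


def supporting_fraction(supporting: int, coverage: int) -> float:
    """Relative number of supporting reads for one variant."""
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return supporting / coverage


def low_fraction_share(
    variants: Iterable[Tuple[int, int]],
    cutoff: float = LOW_FRACTION_CUTOFF,
) -> float:
    """Share of a sample's variants whose supporting fraction is below cutoff."""
    fractions = [supporting_fraction(s, c) for s, c in variants]
    if not fractions:
        raise ValueError("no variants supplied")
    return sum(f < cutoff for f in fractions) / len(fractions)


def flag_contamination(
    sample_share: float,
    control_shares: Sequence[float],
    z_cutoff: float = 3.0,  # ASSUMPTION: not specified in the abstract
) -> bool:
    """Flag a sample whose low-fraction share is unusually high
    relative to the control cohort (hypothetical z-score rule)."""
    mean = statistics.mean(control_shares)
    sd = statistics.stdev(control_shares)  # needs at least two controls
    if sd == 0:
        return sample_share > mean
    return (sample_share - mean) / sd > z_cutoff
```

A sample would be summarized with `low_fraction_share(...)` and then passed to `flag_contamination(...)` together with the same statistic computed for each control exome. A rank-based or distribution-level comparison could replace the z-score without changing the rest of the sketch.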

 
Results
 

We have developed and implemented a systematic approach for identifying contamination in samples used in next-generation sequencing experiments. The distribution of relative supporting reads for a few dozen exomes is shown in Figure 1 below. Non-contaminated samples are shown with solid black lines. Contamination presents as a substantial and distinctive increase in the fraction of variations found below 50%. Two samples (large dashed lines) are clearly contaminated, and two other samples (small dashed lines) show signs of potential contamination. We are actively evaluating exomes from several large whole-exome sequencing projects. Together with our collaborators, we will validate samples that appear contaminated in order to evaluate our algorithm’s specificity and sensitivity.

 
Conclusions
 

We have developed a simple method for identifying contaminated samples in exome sequencing experiments. Further research in this area is needed to determine the power of this method in identifying and quantifying the extent of contamination, and the amount of contamination that can be tolerated without compromising accuracy.

 
 
The distribution of variations by the fraction of reads supporting the variation. Heterozygous variations are expected at 50%, and homozygous variations at 100%.
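To illustrate why contamination adds variants below the 50% mark, the expected supporting fraction can be written down under a simple mixing model. This model is an assumption made for illustration and is not taken from the abstract: a single diploid contaminant contributes a fraction `c` of the reads, reads are sampled in proportion to DNA content, and there is no reference bias. The function name and parameters are hypothetical.

```python
def expected_fraction(patient_alt: int, contaminant_alt: int, c: float) -> float:
    """Expected fraction of reads supporting a variant in a mixed sample.

    patient_alt / contaminant_alt: copies of the variant allele (0, 1 or 2)
    in each diploid genome. c: fraction of reads from the contaminant
    (ASSUMED simple mixing model, single contaminant).
    """
    if not 0.0 <= c <= 1.0:
        raise ValueError("c must be between 0 and 1")
    for copies in (patient_alt, contaminant_alt):
        if copies not in (0, 1, 2):
            raise ValueError("allele copies must be 0, 1 or 2")
    return (1.0 - c) * patient_alt / 2 + c * contaminant_alt / 2


# With no contamination, the caption's expectations are recovered.
print(expected_fraction(1, 0, 0.0))  # heterozygous in the patient
print(expected_fraction(2, 0, 0.0))  # homozygous in the patient

# With 20% contamination, variants unique to the contaminant appear at
# low fractions, and patient-only heterozygous variants drop below 50%.
print(expected_fraction(0, 1, 0.2))
print(expected_fraction(1, 0, 0.2))
```

Under these assumptions, every variant carried only by the contaminant lands well below 50%, which is consistent with the excess of low-fraction variants the abstract reports for contaminated samples.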
 
 
Keywords: 604 mutations • 467 clinical laboratory testing • 473 computational modeling  