Blood samples underwent whole-exome sequencing on the Illumina HiSeq 3000 (Illumina, San Diego, CA, USA) at the University of Miami John P. Hussman Institute for Human Genomics. Base calls were determined by the Illumina CASAVA pipeline. After filtering for base quality and adapter sequences, the sequencing reads were aligned to the human reference genome (hg19) using Bowtie 2 and formatted for input into the Genome Analysis Toolkit (GATK) from the Broad Institute (Cambridge, MA, USA; https://software.broadinstitute.org/gatk/). The pipeline is based on the Broad Institute's Best Practices guidelines, including local realignment, removal of PCR duplicates, and base quality recalibration. Single nucleotide variants and small insertion-deletion variants (indels) were called by GATK's HaplotypeCaller, and variants for each sample were consolidated with GenotypeGVCFs. The combined variant call format (VCF) file was then annotated with ANNOVAR (
http://annovar.openbioinformatics.org/en/latest/, available in the public domain) and filtered for quality with VCFtools (http://vcftools.sourceforge.net/, available in the public domain). Using ANNOVAR, variants were annotated with their frequency in the European population from the National Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project's Exome Variant Server (ESP), Exome Aggregation Consortium (ExAC), and 1000 Genomes databases. Variants were also annotated with ANNOVAR for region and exonic function by reference to RefSeq, and for predicted impact by reference to PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/, available in the public domain) (benign, possibly damaging, or probably damaging) and SIFT (http://sift.jcvi.org/, provided by the J. Craig Venter Institute, Rockville, MD, USA) (tolerated or damaging). Variants with genotype quality (GQ) <30, depth (DP) <8, or Phred-scaled likelihood of reference genotypes (PL) <99 were excluded. Variants in
IFN-γ, IL-1β, IL-2, IL-4, IL-6, IL-8, IL-10, IL-12(p70), IL-13, TNF-α, TLR-4, HLA-DQ, and related genes were prioritized on the basis of their relevance in previous studies. Additional hard filtering was applied as an alternative to variant quality score recalibration (VQSR), using values suggested by GATK best practices [38]. Filters included the following: quality by depth (QD) <2.0, Fisher strand (FS) >60.0, root mean square of the mapping quality (MQ) <40.0, mapping quality rank sum test (MQRankSum) <−12.5, read position rank sum test (ReadPosRankSum) <−8.0, and strand odds ratio (SOR) >3.0. Filters were applied manually in R version 3.4.2 [39].
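
For illustration only, the hard-filter thresholds listed above can be written as a short Python sketch. This is not the R code used in the study. The names `HARD_FILTERS`, `failed_filters`, and `passes_hard_filters`, as well as the example annotation values, are hypothetical. The sketch also assumes that a variant is flagged when any single threshold is crossed and that annotations absent from a record are skipped rather than counted as failures.

```python
import operator

# Thresholds as stated in the text. A variant is flagged when the
# comparison (annotation value vs. threshold) evaluates to True.
HARD_FILTERS = {
    "QD": (operator.lt, 2.0),               # quality by depth
    "FS": (operator.gt, 60.0),              # Fisher strand
    "MQ": (operator.lt, 40.0),              # RMS mapping quality
    "MQRankSum": (operator.lt, -12.5),      # mapping quality rank sum test
    "ReadPosRankSum": (operator.lt, -8.0),  # read position rank sum test
    "SOR": (operator.gt, 3.0),              # strand odds ratio
}


def failed_filters(info):
    """Return the names of the hard filters that a variant fails.

    `info` maps INFO annotation names to numeric values.
    Assumption: annotations that are missing or None are skipped.
    """
    failed = []
    for name, (compare, threshold) in HARD_FILTERS.items():
        value = info.get(name)
        if value is None:
            continue
        if compare(float(value), threshold):
            failed.append(name)
    return failed


def passes_hard_filters(info):
    """True when the variant fails none of the hard filters."""
    return not failed_filters(info)


# Hypothetical annotation values, not taken from the study data.
example_pass = {"QD": 15.2, "FS": 3.1, "MQ": 60.0,
                "MQRankSum": 0.2, "ReadPosRankSum": -0.5, "SOR": 0.9}
example_fail = {"QD": 1.4, "FS": 72.0, "MQ": 58.0, "SOR": 1.1}

print(passes_hard_filters(example_pass))
print(failed_filters(example_fail))
```

Keeping the thresholds in a single table makes each cutoff easy to check against the text, and returning the names of the failed filters records why a given variant was excluded.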