Abstract
Purpose:
Genome-wide association studies (GWAS) have concentrated primarily on common variants in the search for genes that affect risk for complex traits such as refractive error (RE). Rarer SNPs have typically not been included on genotyping arrays, and SNPs with low minor allele frequency (MAF) are usually filtered out at the quality-control stage because they can distort the most commonly used test statistics and inflate type I error. The effort to characterize rarer variation in the 1000 Genomes Project has identified a number of low-MAF SNPs, which have been incorporated into the Illumina genome-wide 2.5M genotyping chip. At the same time, statistical methods for analyzing rarer SNPs have been developed in response to technological advances in genome sequencing. Here we apply two region-based methods for analyzing SNPs with MAF below 5% to a GWAS of RE.
Methods:
We used the Illumina 2.5M SNP genotyping array to genotype 2000 selected controls from the AREDS cohort as part of a larger GWAS of RE. We compared two methods that both rely on collapsing rarer variants within defined regions: the Sequence Kernel Association Test (SKAT) and Tiled Regression (TRAP). Regions of 30 kb were used for SKAT, and regions bounded by recombination hotspots were used for TRAP.
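The region definition used for SKAT can be illustrated with a minimal sketch that keeps SNPs with MAF below 5% and assigns them to fixed 30 kb windows. This is not the pipeline used in the study: the input layout (tuples of SNP id, chromosome, position, and 0/1/2 genotype codes), the function names, and the toy SNPs are all hypothetical, and windows are assumed to start at position 0 of each chromosome.

```python
from collections import defaultdict

WINDOW_BP = 30_000   # 30 kb regions, as stated for SKAT
MAF_MAX = 0.05       # rarer SNPs: minor allele frequency below 5%


def minor_allele_frequency(genotypes):
    """MAF from genotypes coded as 0/1/2 copies of one allele (None = missing)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def group_rare_snps(snps):
    """Map (chrom, window_index) -> list of rare, polymorphic SNP ids.

    `snps` is an iterable of (snp_id, chrom, position_bp, genotypes) tuples;
    this layout is an assumption made only for the example.
    """
    regions = defaultdict(list)
    for snp_id, chrom, pos, genotypes in snps:
        maf = minor_allele_frequency(genotypes)
        if 0.0 < maf < MAF_MAX:           # drop monomorphic and common SNPs
            regions[(chrom, pos // WINDOW_BP)].append(snp_id)
    return dict(regions)


# Hypothetical toy data: 12 individuals, 4 SNPs.
toy_snps = [
    ("rsA", "chr6", 10_500, [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]),  # MAF 1/24
    ("rsB", "chr6", 29_999, [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),  # same window as rsA
    ("rsC", "chr6", 30_000, [1, 1, 2, 1, 0, 1, 2, 0, 1, 1, 0, 1]),  # common, excluded
    ("rsD", "chr6", 61_000, [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0]),  # window index 2
]
print(group_rare_snps(toy_snps))
```

Regions bounded by recombination hotspots, as used for TRAP, would need an external map of hotspot positions in place of the fixed window index.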
Results:
For SKAT, type I error rates were well controlled. No signals reached genome-wide significance, but some of the most strongly associated regions (e.g., chr6, p = 2.1 × 10⁻⁷; chr20, p = 4 × 10⁻⁶; chr18, p = 5.5 × 10⁻⁶) contained potentially interesting genes. TRAP analyses are ongoing and will be compared with the SKAT results when complete.
Conclusions:
As the paradigm for analyzing genetic data shifts from genotyping to sequencing, the standards for which analysis methods to use have yet to be settled. We analyzed rare variants for effects on RE using two methods that combine rare-variant collapsing with a region-based approach. This allows very rare alleles to be analyzed without increasing type I error and also takes the direction of effect into account. Several candidate regions were identified, and future analyses that include additional data should increase power. We think this will be a valuable approach to the analysis of many complex eye disorders, for which GWAS searches for common alleles of small effect have not fully explained the heritable component of the trait.
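One way to see why direction of effect matters in a region-based test is to compare a SKAT-style variance-component statistic with a simple burden score on a toy region. The sketch below is an illustration under stated assumptions, not the study's analysis: it uses a linear kernel, an intercept-only null model, flat weights, and invented data, and it omits the p-value calculation entirely.

```python
def score_per_variant(genotypes, residuals):
    """S_j = sum_i g_ij * r_i for each variant j (rows = variants)."""
    return [sum(g * r for g, r in zip(row, residuals)) for row in genotypes]


def skat_q(genotypes, residuals, weights):
    """Variance-component statistic: sum of squared weighted scores."""
    scores = score_per_variant(genotypes, residuals)
    return sum((w * s) ** 2 for w, s in zip(weights, scores))


def burden_stat(genotypes, residuals, weights):
    """Burden statistic: square of the summed weighted scores."""
    scores = score_per_variant(genotypes, residuals)
    return sum(w * s for w, s in zip(weights, scores)) ** 2


# Hypothetical region: two rare variants with opposite effects on the trait.
trait = [-2.0, 2.0, 0.0, 0.0, 0.0, 0.0]
mean = sum(trait) / len(trait)
residuals = [y - mean for y in trait]   # intercept-only null model (assumption)
genotypes = [
    [1, 0, 0, 0, 0, 0],   # carrier has a low trait value
    [0, 1, 0, 0, 0, 0],   # carrier has a high trait value
]
weights = [1.0, 1.0]      # flat weights keep the arithmetic easy to follow

print(skat_q(genotypes, residuals, weights))       # 8.0
print(burden_stat(genotypes, residuals, weights))  # 0.0
```

In this toy case the two per-variant scores are -2 and 2. Squaring before summing preserves both signals, whereas summing first cancels them, so a statistic of the first kind remains sensitive when rare alleles in one region push the trait in opposite directions.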
Keywords: clinical (human) or epidemiologic studies: biostatistics/epidemiology methodology • gene mapping • myopia