Genetic studies are conducted to identify the hereditary nature of diseases and traits, primarily relying on the comparison of genetic variation between individuals with differential expression for the trait of interest. A typical genome-wide association study (GWAS) surveys between 500,000 and 1,000,000 single-nucleotide polymorphisms (SNPs) across the entire human genome simultaneously, and such genome-wide designs have replaced candidate gene studies as the preferred strategy to study the genetic etiology of complex human traits,
28,29 including eye disorders.
30–39 The Cochran-Armitage trend test, the χ² test, and the logistic regression model are widely used in the case–control design to test for overrepresentation of the mutated allele in cases versus controls.
40 In family-based studies, we measure the excess transmission of an allele from heterozygous parents to affected offspring relative to the transmission expected under Mendel's law.
41 Furthermore, the incorporation of longitudinal information such as modeling time to event and repeated measurements will add merit to GWAS.
42
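As a minimal sketch of the case–control tests mentioned above, the Cochran-Armitage trend statistic for a 2 × 3 genotype table can be computed as follows. The function name, the additive scores (0, 1, 2), and the genotype counts are assumptions introduced for illustration; they are not taken from any study cited here.

```python
import math


def cochran_armitage_trend(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test for a 2 x k genotype table.

    cases, controls: counts per genotype class (e.g. 0, 1, 2 copies of the allele).
    weights: scores per genotype class; (0, 1, 2) assumes an additive model.
    Returns the chi-square statistic (1 degree of freedom) and its p value.
    Assumes at least two genotype classes are observed (nonzero denominator).
    """
    totals = [r + s for r, s in zip(cases, controls)]
    n = sum(totals)
    n_cases = sum(cases)
    n_controls = n - n_cases

    sum_wr = sum(w * r for w, r in zip(weights, cases))
    sum_wn = sum(w * t for w, t in zip(weights, totals))
    sum_w2n = sum(w * w * t for w, t in zip(weights, totals))

    numerator = n * (n * sum_wr - n_cases * sum_wn) ** 2
    denominator = n_cases * n_controls * (n * sum_w2n - sum_wn ** 2)
    chi2 = numerator / denominator

    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts, invented for illustration only.
chi2, p = cochran_armitage_trend(cases=[10, 30, 60], controls=[40, 40, 20])
print(f"trend chi-square = {chi2:.2f}, p = {p:.2e}")
```

In a GWAS this calculation would be repeated for every marker, which is why the multiple-testing burden becomes the dominant concern.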
Testing multiple hypotheses simultaneously while still drawing correct statistical inference is the most challenging aspect of GWAS. It is now common to assay a million variants in a GWAS, and this effectively constitutes a million hypothesis tests. A conventional significance threshold of 5% is thus expected to artifactually identify 50,000 markers that are “correlated” to the trait by chance alone. To address this problem with multiple testing, geneticists have adopted a stringent statistical significance level of 5 × 10⁻⁸, commonly defined as genome-wide significance, as the benchmark for evaluating the fidelity of the association signal at each marker.
40 Replication is considered the gold standard in GWAS publications.
43 The identification of candidate genetic loci for replication is mainly driven by the level of statistical evidence from single-marker association tests (either the P value or the Bayes factor).
40,44 More advanced approaches, for example, pathway-based analyses and epistasis tests, have also been proposed to prioritize genetic markers for further downstream functional evaluation. These analytic strategies have been covered comprehensively in previous reviews.
45,46
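The multiple-testing arithmetic in this section can be checked with a short sketch. If every marker is truly unassociated, the expected number of false positives is the number of tests multiplied by the per-test threshold. Dividing a 5% family-wise level by one million tests (a Bonferroni correction) gives a threshold of 5 × 10⁻⁸. Reading the genome-wide threshold this way is an assumption on our part, since the text does not state how the level was derived, and the function names are hypothetical.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of tests passing the threshold if every null is true."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, family_alpha=0.05):
    """Per-test threshold that holds the family-wise error rate at family_alpha."""
    return family_alpha / n_tests


n_markers = 1_000_000  # "a million variants", as in the text
print("expected false positives at 5%:", expected_false_positives(n_markers, 0.05))
print("Bonferroni per-marker threshold:", bonferroni_threshold(n_markers))
```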
In gene mapping, phenotypes are usually classified into two broad types: qualitative (or binary) and quantitative (or continuous) traits. Dichotomous traits have been featured in GWAS for age-related macular degeneration (AMD),
34,35 primary open-angle glaucoma (POAG),
31,39 cataract,
37 and high myopia.
36,38 The affected individuals are usually classified on the basis of diagnosis from the worse eye or both eyes, whereas controls exhibit no sign of disease in either eye. Although assessing the binary outcome is more directly relevant to clinical application, quantitative traits (endophenotypes or intermediate traits) underlying diseases are also valuable in the dissection of the genetic architecture, as they take the full spectrum of measurements into account. For instance, central corneal thickness (CCT) and cup-to-disc ratio (CDR) are presented as quantitative endophenotypes of POAG.
47 Mapping genes for CCT
48–50 and CDR
51,52 in GWAS would shed light on the joint genetic etiology of POAG.
Often, the primary interest in ophthalmic genetic studies of quantitative traits is to locate shared genetic loci that exert effects on both eyes,
53–55 as the physiological mechanism underlying intereye differences in phenotypic abnormalities remains elusive and inadequately understood. Therefore, for quantitative traits collected from both eyes, an immediate question is whether the analyses should be performed on data from one eye or both eyes. In seven GWAS papers on eye-related QTL that have been published to date (http://www.genome.gov/gwastudies), the analytic strategies varied from the use of the right eye,
49,50,52 to a randomly chosen eye,
51 to the averaged measurement from both eyes.
32,33,48 Conducting the analysis on one eye alone is a simple approach that avoids statistical model complexity. However, using partial data from only one eye may be statistically inefficient. Averaging ocular measurements between both eyes has been suggested to yield higher heritability estimates than using information from one eye only and therefore tends to have more power in genetic studies.
56 Using averaged ocular measurements has therefore been the convention in linkage studies of quantitative traits in the myopia genetics research community.
57–60 In a few scenarios in which the traits may be only moderately or weakly correlated between the two eyes, however,
1 neither the use of data from one eye nor an average from both is appropriate, because both ignore the phenotypic dissimilarity between the eyes.
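The trade-off described above can be made concrete with a small sketch. It assumes, for simplicity, that the right and left eyes have the same trait variance and a known intereye correlation; the function names and the example values are hypothetical. The average of the two eyes then has variance σ²(1 + ρ)/2, while the intereye difference, which averaging discards, has variance 2σ²(1 − ρ).

```python
def variance_of_eye_average(variance, correlation):
    """Variance of (right + left) / 2 when both eyes share the same variance."""
    return variance * (1.0 + correlation) / 2.0


def variance_of_eye_difference(variance, correlation):
    """Variance of (right - left), the part of the data that averaging ignores."""
    return 2.0 * variance * (1.0 - correlation)


# Hypothetical correlations, chosen only to contrast strong and weak agreement.
for rho in (0.9, 0.5, 0.2):
    avg = variance_of_eye_average(1.0, rho)
    diff = variance_of_eye_difference(1.0, rho)
    print(f"rho = {rho}: var(average) = {avg:.2f}, var(difference) = {diff:.2f}")
```

When the correlation is high, the difference carries little variance and the average summarizes the pair well. As the correlation weakens, a growing share of the variation sits in the intereye difference that neither a single eye nor the average captures.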
A wide array of statistical approaches has emerged recently for the detection of pleiotropic genetic factors contributing to multiple correlated traits, and these could also be applied to paired-eye data (Table 4). Simultaneous consideration of all correlated phenotypes has been shown to have more statistical power to detect pleiotropic genetic effects than univariate analysis.
61–64 The first approach is to combine dependent test statistics or estimators from the univariate analyses for a global assessment of association.
61,65–67 In brief, GWAS tests are conducted for the two eyes separately. The two test statistics, one from each eye (for example, z scores), are subsequently combined in a linear form weighted by the covariance matrix estimates.
61,67 Correcting for twice the number of markers is not relevant here, since only one global test is performed for each marker, using the combined statistics. This simple approach does not rely on complicated model assumptions. The second approach is to transform multiple traits into an optimal single phenotype with enhanced heritability, and one such example is principal component analysis.
62,68 This dimension-reduction technique involves intensive computation; thus, its application to paired-eye data may not be straightforward. The third approach is model-based joint analysis of bivariate traits, including GEE,
63,69–71 mixed-effects,
64,72 and tree-based regression,
73 et cetera. Of these, the GEE model is the most statistically efficient in performing bivariate association tests.
63,71 To date, few statistical software programs incorporating model-based joint analyses on bivariate traits are available
74; much more effort should be devoted to this area.
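The first approach described above, combining the per-eye test statistics into one global test per marker, can be sketched for two z scores as follows. This is a simplified illustration rather than the exact statistic of the cited methods. It assumes each z score has unit variance under the null hypothesis and that the correlation between them has already been estimated elsewhere. Under those assumptions, inverse-covariance weighting reduces to equal weights. The function name and input values are hypothetical.

```python
import math


def combine_eye_z_scores(z_right, z_left, correlation):
    """Combine two correlated z scores into one global test for a marker.

    Under the null, z_right + z_left has variance 2 * (1 + correlation),
    so dividing by its standard deviation gives a standard normal statistic.
    Returns the combined z score and its two-sided p value.
    """
    if not -1.0 < correlation <= 1.0:
        raise ValueError("correlation must lie in (-1, 1]")
    combined = (z_right + z_left) / math.sqrt(2.0 * (1.0 + correlation))
    p_value = math.erfc(abs(combined) / math.sqrt(2.0))
    return combined, p_value


# Hypothetical per-eye z scores and intereye correlation for one marker.
z, p = combine_eye_z_scores(z_right=3.1, z_left=2.8, correlation=0.7)
print(f"combined z = {z:.2f}, p = {p:.2e}")
```

Because only this single combined statistic is tested per marker, the number of tests stays equal to the number of markers.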
Accumulated evidence suggests that most GWASs are underpowered, especially for common variants with small effect sizes, and that the associated SNPs generally explain little of the genetic variation.
75 Meta-analysis provides a robust approach to enhance statistical power and effective sample size by pooling evidence from multiple independent association studies.
76,77 Application of meta-analysis in ophthalmology has become a standard practice to identify genetic polymorphisms that are associated with eye disorders.
32,33,49–52 If the individual GWASs are conducted with different genotyping platforms, the meta-analysis can use only a small subset of overlapping markers. One way to address this problem is imputation-based meta-analysis, which provides a powerful framework for assessing the complete array of genetic variants (most of which are untyped). Step-by-step guidelines and techniques for performing imputation-based genome-wide meta-analysis were reviewed by de Bakker et al.
77 In meta-analysis, using homogeneous populations with similar genetic background, phenotype definition, and sample ascertainment will increase the likelihood of identifying a genuine genetic association.
78 In the presence of heterogeneity across different studies, carefully examining the potential factors that cause heterogeneity is crucial to enhance the credibility of the combined evidence.
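To illustrate how evidence is pooled across studies and how heterogeneity can be quantified, the sketch below applies inverse-variance fixed-effect meta-analysis to one marker, together with Cochran's Q and the I² statistic. The text does not specify a meta-analysis model, so the choice of a fixed-effect model is an assumption. The function name and the effect estimates are hypothetical.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance fixed-effect meta-analysis for a single marker.

    betas: per-study effect estimates; standard_errors: their standard errors.
    Returns the pooled estimate, its standard error, z score, two-sided p value,
    Cochran's Q, and I^2 (share of variation attributed to heterogeneity).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))

    # Cochran's Q: weighted squared deviations of study effects from the pooled effect.
    q = sum(w * (b - pooled_beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0

    return {"beta": pooled_beta, "se": pooled_se, "z": z, "p": p_value,
            "q": q, "i_squared": i_squared}


# Hypothetical per-study estimates for one SNP, invented for illustration.
result = fixed_effect_meta(betas=[0.12, 0.08, 0.15], standard_errors=[0.05, 0.04, 0.06])
print(result)
```

A large Q relative to its degrees of freedom, or a high I², would signal the kind of between-study heterogeneity whose sources should be examined before the pooled estimate is trusted.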