In the set of patients with PCG included in this study, 8% had an affected sibling, whereas the remainder were sporadic. In all cases, parents were genotyped. An observed nucleotide variant in a patient was scored as a mutation only when (1) it segregated with the disease phenotype, (2) it was absent in all normal control subjects, and (3) there was a significant difference in the minor allele frequencies between the patients and control subjects. In addition, the wild-type residue was conserved across different species.
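The three scoring criteria can be read as a simple conjunction. The sketch below is only an illustration of that rule: the `VariantEvidence` record, its field names, and the 0.05 significance threshold are assumptions made for the example and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class VariantEvidence:
    """Hypothetical record of the evidence collected for one variant."""
    segregates_with_disease: bool  # criterion (1)
    control_carriers: int          # criterion (2): carriers among normal controls
    maf_p_value: float             # criterion (3): patient vs. control allele frequency


def is_scored_as_mutation(evidence: VariantEvidence, alpha: float = 0.05) -> bool:
    """Return True only when all three criteria hold.

    alpha is an assumed significance threshold; the text does not state one.
    Conservation of the wild-type residue is reported in the text as an
    additional observation, so it is not part of this rule.
    """
    return (
        evidence.segregates_with_disease
        and evidence.control_carriers == 0
        and evidence.maf_p_value < alpha
    )


# Hypothetical examples: one variant meets every criterion, one is seen in a control.
candidate = VariantEvidence(True, 0, 0.001)
seen_in_control = VariantEvidence(True, 1, 0.001)
print(is_scored_as_mutation(candidate), is_scored_as_mutation(seen_in_control))
```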
Mutations in CYP1B1 accounted for 44.93% (62/138) of all cases of PCG. The R368H mutation was the most common, accounting for 48.38% of mutations. Further details are provided in Table 1. Seventeen pathogenic mutations in CYP1B1 were observed, of which nine were novel [10, 15, 19, 20]. These novel mutations included three frameshift mutations, resulting from a 23-bp deletion (g.3905del23bp) and an insertion of adenine (A) at 30 bp (g.3835) that introduces a stop codon, both in exon II, and from a 2-bp deletion (g.7900-7901delCG) in exon III. Six novel missense mutations were also noted: five in exon II (A115P, M132R, Q144P, P193L, and S239R) and one in exon III (G466D), whose residue is part of the signature sequence (NH₂-FXXGXXXCXG-COOH) that is present in all heme-binding cytochromes. The spectrum of CYP1B1 mutations is larger in the Indian population [10] than among Saudi Arabians [3], Slovakian Roms [2], and Brazilians [11]. Marked phenotypic heterogeneity in clinical severity was observed for these mutations in our earlier studies [10, 15, 19].
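To illustrate why a substitution at a fixed glycine of the signature sequence matters, the motif FXXGXXXCXG can be written as a regular expression in which X matches any residue. This is a minimal sketch: the peptide strings are hypothetical toy sequences, not the CYP1B1 sequence, and the helper name `motif_to_regex` is invented for the example.

```python
import re

SIGNATURE = "FXXGXXXCXG"  # motif quoted in the text; X stands for any residue


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate a motif with X wildcards into a compiled regular expression."""
    return re.compile("".join("[A-Z]" if ch == "X" else ch for ch in motif))


pattern = motif_to_regex(SIGNATURE)

# Hypothetical toy peptides (NOT the real CYP1B1 sequence).
toy_wild_type = "MKLFSVGKRRCIGEEL"
# Same toy peptide with a Gly -> Asp change at a fixed glycine of the motif.
toy_mutant = "MKLFSVDKRRCIGEEL"

print(bool(pattern.search(toy_wild_type)))  # motif present
print(bool(pattern.search(toy_mutant)))     # motif lost after the substitution
```

In the toy example, changing a single fixed position is enough for the motif to stop matching, which is the sense in which a G-to-D change inside the signature sequence disrupts a conserved feature.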
Table 2 provides the estimated frequency distributions of haplotypes among patients with PCG, classified by the presence or absence of CYP1B1 mutations, and among unaffected control subjects. These frequency distributions are significantly different (χ² = 64.86, 8 df; P < 0.00001; infrequent haplotypes were pooled to avoid the vagaries of small sample sizes). The haplotype frequency distributions among patients with PCG with CYP1B1 mutations (CYP1B1(+) subgroup) and among normal subjects were unimodal, whereas the distribution among patients without CYP1B1 mutations (CYP1B1(−) subgroup) was bimodal. Further, the modal haplotypes were different for the CYP1B1(+) subgroup (C-G-G-T-A) and control subjects (G-T-C-C-A). However, the haplotype frequency distributions of the CYP1B1(−) subgroup and control subjects were strikingly similar (Table 2).
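The reported test statistic can be checked against the stated P value without any statistics library, because the chi-square upper-tail probability has a closed form when the degrees of freedom are even. The sketch below assumes the 8 df arise from comparing 3 groups across 5 haplotype categories after pooling, since (3 − 1)(5 − 1) = 8; that category count is an inference and is not stated in the text.

```python
from math import exp


def homogeneity_df(n_groups: int, n_categories: int) -> int:
    """Degrees of freedom for a chi-square test on a groups x categories table."""
    return (n_groups - 1) * (n_categories - 1)


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) of a chi-square variable with even df.

    For df = 2k this equals exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0   # (x/2)^i / i! for i = 0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return exp(-half) * total


# Assumed layout: 3 groups x 5 pooled haplotype categories (an inference).
df = homogeneity_df(3, 5)
p_value = chi2_sf_even_df(64.86, df)
print(df, p_value < 0.00001)
```

Under these assumptions the tail probability falls well below the 0.00001 bound quoted in the text.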
There were striking differences in the frequencies of some haplotypes among the three groups. The C-G-G-T-A haplotype occurred at a high frequency (61.6%) in the CYP1B1(+) patient subgroup, roughly 3.5 times its frequency in the control subjects (17.8%; P exact < 0.00001, where P exact denotes the probability from the Fisher exact test) and nearly three times its frequency in the CYP1B1(−) patient subgroup (21.8%; P exact < 0.00001). The difference in the frequency of this haplotype between the CYP1B1(−) subgroup and control subjects was not statistically significant (P exact = 0.058). The G-T-C-C-A haplotype, which was the modal haplotype (42.3%) among the normal control subjects, showed a decline in frequency (31.9%) in the CYP1B1(−) subgroup of patients and a further decline (14.9%) in the CYP1B1(+) subgroup. These decreases were statistically significant (both P exact < 0.001). The frequency of the C-G-C-C-A haplotype was virtually the same among normal subjects and the CYP1B1(−) patient subgroup, but was significantly lower (P exact < 0.00001) in the CYP1B1(+) subgroup.
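The pairwise comparisons above rely on the Fisher exact test for 2 × 2 tables (haplotype present or absent, by group). A minimal pure-Python sketch of the two-sided test follows. The chromosome counts in the example are hypothetical, chosen only to resemble the reported percentages, because the underlying counts are not given in this section; the function name is likewise invented for the illustration.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    With all margins fixed, sums the hypergeometric probabilities of every
    table that is no more probable than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def table_prob(x: int) -> float:
        # Probability of the table whose top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = table_prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    probs = (table_prob(x) for x in range(lowest, highest + 1))
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in probs if p <= p_observed * (1 + 1e-9))


# Hypothetical counts: 123 of 200 chromosomes carry the haplotype in one
# group (61.5%) versus 36 of 200 in the other (18.0%).
p_exact = fisher_exact_two_sided(123, 77, 36, 164)
print(p_exact < 0.00001)
```

With real data, the four cell counts would come from the haplotype tallies behind Table 2 rather than from the percentages alone.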
Table 3 gives the distribution of prevalent CYP1B1 mutations on the background of the four major haplotypes in patients with PCG worldwide. (Complete lists of prevalent and minor CYP1B1-associated PCG mutations on the background of these haplotypes are provided in Supplementary Table S2.) As is evident from the table, most of the common mutations are clustered on the background of the C-G-G-T-A haplotype. It is striking that specific mutations are generally present on specific haplotype backgrounds, irrespective of geographical location. For example, a major proportion of the R368H mutation that is predominant in India occurs on the C-G-G-T-A background, and this mutation is found on the same background in geographically diverse areas such as Saudi Arabia [3] and Brazil [11]. Similarly, the E387K mutation, which is the only mutation present among the Slovakian Roms [2], appears on the G-T-C-C-A background and is also found on this same background in the United States [13] and Brazil [11]. Similar features also apply to other mutations, such as 4340delG and R390C. However, it is interesting that although the CYP1B1-associated PCG mutation E387K occurs predominantly on the G-T-C-C-A haplotype in many regions of the world, this haplotype is more strongly associated with the CYP1B1(−) patient subgroup in India and also occurs at high frequency among normal individuals (Table 2). It may also be noted that, unlike in other populations, a relatively minor proportion of R368H, M132R, and E229K mutations occur on multiple haplotype backgrounds among Indian patients with PCG (Table 3). Because it is unlikely that a specific haplotype background favors the recurrence of a specific mutation, the association of specific mutations with specific haplotype backgrounds is most likely due to founder effects.