January 2002
Volume 43, Issue 1
Lens  |   January 2002
Unexpected Variation in Unique Features of the Lens-Specific Type I Cytokeratin CP49
Author Affiliations
  • Peter A. Binkley
    From the Department of Cell Biology and Human Anatomy, University of California School of Medicine, Davis, California.
  • John Hess
    From the Department of Cell Biology and Human Anatomy, University of California School of Medicine, Davis, California.
  • Jodi Casselman
    From the Department of Cell Biology and Human Anatomy, University of California School of Medicine, Davis, California.
  • Paul FitzGerald
    From the Department of Cell Biology and Human Anatomy, University of California School of Medicine, Davis, California.
Investigative Ophthalmology & Visual Science January 2002, Vol.43, 225-235. doi:
      Peter A. Binkley, John Hess, Jodi Casselman, Paul FitzGerald; Unexpected Variation in Unique Features of the Lens-Specific Type I Cytokeratin CP49. Invest. Ophthalmol. Vis. Sci. 2002;43(1):225-235.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

purpose. CP49 is a fiber cell–specific type I cytokeratin, but its function as part of the fiber cell–beaded filament remains unknown. To provide a rational basis for mutational studies that would contribute to an elucidation of function, the study was designed to define elements of CP49s that are highly conserved, discriminate conserved features from species-specific variations, and identify where CP49s have diverged from consensus type I features in their adaptation to selective pressures in the lens.

methods. The primary sequence and gene structure of CP49 from a third vertebrate order was determined from a combination of cDNA and genomic sequencing. Protein product was characterized by SDS-PAGE and Western blot analysis. Consensus features and phylogenetic relationships were identified by multiple alignment. Coiled-coil analysis was conducted to define central rod domains.

results. Trout CP49 is unique among CP49s in having a 39-amino-acid tail domain and shows both unique sequence and allelic variation at the LNDR motif. Comparison of consensus sequences identified unprecedented divergence between CP49s and other type I cytokeratins, including a shortened central rod domain that is conserved among CP49s, but distinct from type I cytokeratins.

conclusions. The considerable differences that have emerged between the consensus features of the type I cytokeratins and the CP49s suggest that the beaded filament serves a significantly different function from intermediate filaments in other epithelia and that type I cytokeratins may have limited utility as a model for studies on lens beaded filaments. These differences, in concert with consensus features identified among CP49s, suggest sites that are probably critical to CP49 function in the lens fiber cell.

Differentiation of lens epithelial cells into fiber cells includes the emergence of a structurally unique cytoskeletal element referred to as the beaded filament (BF). 1 Immunocytochemical studies have identified at least two proteins, CP49 and CP115 (filensin 2 ) as constituents of this filament. 3 4 5 6 7 Homologues of these proteins have been identified by immunochemical approaches in at least five vertebrate orders, but expression appears to be restricted to the differentiating lens fiber cells. 3 4 8 9 10 This suggests a function that must be important specifically in fiber cell biology, a suggestion confirmed by evidence that indicates that point mutations in human CP49 are cataractogenic. 11 12 However, BF function remains to be defined. 
Determination of primary sequence and gene structure has established that both BF proteins are part of the intermediate filament (IF) family of proteins. 2 13 14 15 16 17 18 19 20 Although the family of cytoplasmic IF proteins varies considerably in both size and primary sequence, its members have historically been unified into a gene family on the basis of several properties: conservation of gene structure; a common domain structure consisting of variable head and tail domains flanking a central rod domain that is conserved in size and in predicted subdomain structure; a low level of overall sequence identity, including strong conservation at two short motifs found at the beginning and end of the central rod domain; and the ability to assemble into 10-nm IFs. 
Although the BF proteins are clearly a part of the IF family, they have also been noteworthy for the degree to which they do not have the features that have otherwise been highly conserved among all cytoplasmic IF proteins—features that have been hypothesized or experimentally established to be critical for assembly. The BF proteins that have been sequenced thus far have been limited to mammalian and avian examples. To widen the scope of comparison among CP49s and to help distinguish features that are conserved from those that are species-specific variants, we sequenced a CP49 from a third vertebrate order, fish. The data reported herein establish that some of the features of avian-mammalian CP49s that are noteworthy for their variance from other type I cytokeratins are not necessarily conserved among all CP49s. Multiple alignment of the CP49s defines features and residues that have been strongly conserved and highlights the unusual degree to which the CP49s have diverged from the remainder of the type I cytokeratins. Such information identifies residues and properties that are conserved, presumably because of functional importance, and provides a basis for identifying features that adapt the type I cytokeratin CP49 to its function in the lens fiber cell. 
Materials and Methods
Trout
Protocols for the use of trout were approved by the University of California Davis Animal Use and Care Committee and are in accordance with the ARVO Statement for the Use of Animals in Ophthalmic and Vision Research. Young rainbow trout (2–6 in. in length) were collected from the university’s fisheries facility. For both protein and RNA isolation, lenses were decapsulated. RNA was isolated by extraction with guanidine isothiocyanate (GITC). 21 cDNA was created by reverse transcription of total RNA using commercially available reagents (Superscript II and oligo dT; GibcoBRL, Grand Island, NY). 
Determination of trout CP49 sequence was initiated by PCR, using degenerate primers derived by alignment of CP49s and identification of conserved regions: 5′-TAYGARAAYGARCARCCNTT-3′ and 5′-YTCNATNTCRTGCCARTG-3′. A 500-bp product resulted from PCR with these primers, which permitted amplification by 3′-rapid amplification of cDNA ends (RACE), using gene-specific and oligo dT primers, yielding an additional 460 bp. 5′-RACE was conducted also using gene-specific primers and oligo dC, after deoxyguanosine triphosphate (dGTP) tailing of reverse-transcribed trout lens RNA. 
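The degenerate primers above use IUPAC ambiguity codes (Y, R, N), so each one stands for a pool of distinct oligonucleotides. As a minimal sketch, the size of each pool can be counted as follows. The function name `degeneracy` and the variable names are illustrative and not part of the original methods.

```python
# IUPAC nucleotide codes mapped to the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


# Primer sequences as given in the text (5' to 3').
FORWARD = "TAYGARAAYGARCARCCNTT"
REVERSE = "YTCNATNTCRTGCCARTG"

for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    print(name, "length:", len(primer), "degeneracy:", degeneracy(primer))
```

Each primer contains a single fourfold-degenerate position class (N) alongside twofold positions (Y, R), which keeps the pool small enough for conventional PCR.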
Trout genomic DNA was isolated from trout liver with a kit (DNeasy; Qiagen, Chatsworth, CA). Ambiguity in the cDNA sequence for nucleotide 296 (Fig. 1) was resolved by PCR amplification of genomic DNA isolated from six different individuals. 
Intron Identification
Intron presence and approximate size were determined by PCR, using primers flanking intron sites that had been identified in the human CP49 gene. 14 Intron locations are denoted in Figure 1 by arrows and an identifying letter. Letters used to identify introns are in accordance with previously used designations. 14  
SDS-PAGE and Immunoblot Analysis
Decapsulated trout lenses were homogenized in 50 mM Tris, 5 mM EDTA, and a cocktail of protease inhibitors (Complete Mini; Roche Biochemicals, Indianapolis, IN) and fractionated into buffer-soluble and buffer-insoluble fractions by centrifugation at 50,000g for 30 minutes. The buffer-insoluble fraction was solubilized for SDS-PAGE, resolved on 12.5% SDS-polyacrylamide gels, and transferred to polyvinylidene fluoride (PVDF) membrane (Immobilon P; Millipore, Bedford, MA). Immunoblots were probed with rabbit antiserum raised against recombinant mouse CP49, at 1:2000 dilution. Primary antibody was visualized with alkaline phosphatase-conjugated goat anti-rabbit antibody and nitroblue tetrazolium chloride (NBT)/5-bromo-4-chloro-3-indolyl phosphate (BCIP) substrate-chromogen. 
Sequence Analysis
Multiple alignments were conducted at http://www.toulouse.inra.fr/multalin.html (provided in the public domain by Institut National de la Recherche Agronomique, Toulouse, France), using the method of Corpet et al. 22 Paircoil analysis was conducted at http://nightingale.lcs.mit.edu/cgi-bin/score (provided in the public domain by Massachusetts Institute of Technology, Cambridge, MA), using the method of Berger et al. 23 To permit direct alignment and comparison of the resultant Paircoil graphs, all Paircoil analyses were conducted on uniformly sized fragments of IF proteins. Fragments consisted of the central rod domain, plus 15 residues of the head domain (counting from the conserved L of the LNDR motif), and 4 residues of the tail domain (8 residues past the conserved Y of the TYRKLLEGE motif). These residues are identified in Figure 3 by the symbol ▴ below the residue. 
Results
Trout CP49 Nucleotide and Amino Acid Sequence
The complete nucleotide and amino acid sequence of trout CP49 is presented in Figure 1 . Trout CP49 is composed of 439 amino acids, has an isoelectric point (pI) of 5.15, and a predicted molecular weight of 48,883.4. Ambiguity in the sequencing signal at nucleotide 296 required resequencing of both cDNA and genomic DNA in this region. This led to the determination that the population of trout under study exhibited allelic heterogeneity at this site, resulting in triplet codons of AGC or AAC, encoding S or N, respectively. Some individuals were homozygous for one or the other allele, whereas other individuals were heterozygous, possessing both alleles. No opacification was noted in any of the lenses used in the study. 
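The effect of the polymorphism can be illustrated with a short sketch. It assumes only what the text states: the two alleles give the codons AGC and AAC, which differ at the middle base, and the encoded residue occupies the third position of the CP49 homologue of the LNDR motif (LNSC or LNNC in trout). The helper name `substitute` is hypothetical.

```python
# Only the two codons needed here, taken from the standard genetic code.
CODON_TO_AA = {"AGC": "S", "AAC": "N"}


def substitute(codon: str, position: int, base: str) -> str:
    """Return the codon with the base at 0-based `position` replaced."""
    return codon[:position] + base + codon[position + 1:]


g_allele = "AGC"
a_allele = substitute(g_allele, 1, "A")  # G -> A at the middle base

for codon in (g_allele, a_allele):
    residue = CODON_TO_AA[codon]
    motif = "LN" + residue + "C"  # trout homologue of the LNDR motif
    print(codon, "->", residue, "->", motif)
```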
The deduced trout CP49 amino acid sequence was used to probe the protein databases and showed the highest degree of similarity to the existing human, bovine, murine, and chicken CP49s, with sequence identities ranging from 49% to 52%. This level of identity is typical of that seen between IF homologues from orders this distant over the evolutionary spectrum. The best non-CP49 match was the type I cytokeratin 18 at 32% identity, followed by several dozen different type I cytokeratins at comparable levels of identity. These results are similar to those achieved when other CP49s were used to probe protein sequence databases. 14  
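Percent identity values such as those quoted above are computed from aligned sequence pairs. The following is a minimal sketch of one common convention, in which columns containing a gap in either sequence are excluded from the count. That convention is an assumption here, since the text does not state how identity was calculated, and the two strings are invented toy sequences, not CP49 data.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only.
toy_a = "LNSCLAE-YLQKV"
toy_b = "LNDRLASKYLDKV"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Other conventions (for example, dividing by the length of the shorter sequence) give somewhat different values, so identities from different sources are comparable only when the denominator is the same.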
Trout CP49 Gene Structure
Gene structure is generally well conserved among IF proteins, particularly within the segment of gene that encodes for the rod domain. 24 25 26 27 28 The number and location of introns is strongly conserved within an IF class, mirroring the subgrouping of IF proteins that is achieved through primary sequence analysis. Thus, IF type can be determined by gene structure as well as by primary sequence similarity. 29 Figure 2 shows the nucleotide sequence at the intron–exon boundaries for introns that we have identified in the trout CP49 gene, along with the approximate size of the introns, estimated by PCR-agarose gel electrophoresis. We show that introns C, E, F, and G, which are common to type I, II, and III IF proteins, are present in trout CP49. These introns are identical in location and in the phase of the triplet codon that they interrupt, with those shown for human CP49, and for type I cytokeratins in general. Intron H, which is generally present in type I, II, and III IF genes, and in the human CP49, is absent from trout. Notably, intron H is also absent from the chicken CP49 gene, 30 as well as the type I cytokeratin K19 gene. 31  
Confirmation of intron B in trout genomic DNA was problematic. In mouse and human CP49 this intron is in excess of 23 kb and is resistant to PCR amplification. Several PCR primer sets that flanked the predicted site of intron B worked well on trout lens cDNA but failed to produce a product when trout genomic DNA was used as a template. This failure is indirect evidence that is consistent with the presence of intron B in the trout CP49 gene. 
We identified an intron in the tail domain of the trout CP49 gene as well (located in Fig. 1 at nucleotide 1215). 
CP49 Consensus Features
To identify CP49 consensus features, we conducted a multiple alignment of all CP49s sequenced to date (Fig. 3) . The most striking difference between trout and other CP49s was the presence of a 39-amino-acid tail domain in trout CP49. The presence of this tail domain leads to an increase in molecular weight of approximately 4.2 kDa and a corresponding shift in mobility in SDS-PAGE/immunoblot analysis. This shift can be seen in Figure 4 , which presents an immunoblot of trout lens (lane A) and bovine lens (lane B) probed with rabbit antiserum raised against recombinant mouse CP49. This blot also confirms the strong immunologic relationship among the trout, bovine, and murine CP49s. 
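As a rough plausibility check on the size of the mobility shift, the mass of the tail can be estimated from the mean residue mass of the whole protein. This is only a back-of-the-envelope sketch: it uses the figures given in the text (439 residues, 48,883.4 Da, 39-residue tail) and assumes the tail has an average amino acid composition, which the actual sequence need not have.

```python
FULL_LENGTH_AA = 439      # residues in trout CP49 (from the text)
FULL_MASS_DA = 48_883.4   # predicted molecular weight (from the text)
TAIL_LENGTH_AA = 39       # residues in the tail domain (from the text)

# Assumption: the tail has the same mean residue mass as the whole protein.
mean_residue_mass = FULL_MASS_DA / FULL_LENGTH_AA
tail_mass_kda = mean_residue_mass * TAIL_LENGTH_AA / 1000.0

print(f"mean residue mass: {mean_residue_mass:.1f} Da")
print(f"estimated tail mass: {tail_mass_kda:.1f} kDa")
```

The estimate lands close to the approximately 4.2 kDa stated above; the small difference is expected, because the stated value reflects the actual composition of the tail.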
Neither the mammalian nor avian CP49s have tail domains, 2 14 15 17 a feature that has been one of the more striking differences between CP49s and the rest of the IF family. The trout CP49 tail domain is comparable in size to those commonly seen among non-CP49 type I cytokeratins. When the trout CP49 tail sequence was used to probe the protein databases, no similarity was seen to tail domains in any IF protein. Thus, the trout CP49 tail sequence appears to be unique among IF proteins. 
Two short motifs, located at the beginning and end of the rod domain, are among the best conserved regions of IF proteins. These motifs are unusually sensitive to mutations and are considered critical to IF assembly. 32 33 34 35 36 37 38 The consensus sequences for these two motifs in type I cytokeratins are LNDR and TYRRLLEGE. The former of these two is by far the best conserved. The CP49 homologues to these motifs are highlighted in Figure 3 (starting at residues 154 and 460, respectively). Alignment of the LNDR region from 10 human type I cytokeratins and a sampling of cytokeratins from several vertebrate orders establishes that the LNDR sequence is 100% conserved among the non-CP49 type I cytokeratins (though exceptions are likely to be found). In contrast, the CP49s show variation from the type I consensus at three of four residues in this motif. Moreover, this motif is not well conserved, even among the CP49s, with three permutations identified in the five species sequenced to date. Only the first (L) and fourth (C) residues are conserved among CP49s in these sequences. We noted allelic variations in this motif even within the population of trout we examined, with both LNSC and LNNC identified. 
Also indicated in Figure 3 are the sites where mutations in human CP49 have been implicated as cataractogenic (amino acids 282 and 343, each marked by a symbol in Fig. 3). In both cases the residue is 100% conserved among the CP49s. 
Domain Structure
Paircoil analysis is a means of predicting whether a given primary sequence is likely to engage in the formation of a coiled-coil dimer. The coiled-coil is a dimer formed by the pairing of two stretches of α helix. The dimer, in turn, exhibits a gentle supercoiling or coiling of the coils. The initial dimerization appears to require the presence of predominantly hydrophobic residues along one edge of each α helix (at the 1 and 4 positions of the heptads that comprise the α helix). The algorithm is based on empiric data derived from the protein crystal database, where such dimers have been demonstrated. Such analysis predicts the presence of a central rod domain in all cytoplasmic IF proteins. This domain is predicted to be rich in α helical regions (coils) characterized by the heptad repeat pattern that predicts coiled-coil interactions. The coils are separated by short regions (linkers) that are not predicted to be α helical. The overall size of the rod domain has been strongly conserved among IF proteins, a feature considered critical to the initial pairing of IF proteins as a coiled-coil dimer. 39 We therefore used Paircoil analysis to characterize the predicted rod domain in the CP49s and compared this with the prediction generated for several type I cytokeratins. 
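The heptad pattern described above can be illustrated with a deliberately simplified scan. This is not the Paircoil algorithm, which scores pairwise residue correlations derived from known coiled coils; it is a toy sketch that only asks what fraction of heptad positions 1 and 4 in a sliding window are hydrophobic. The hydrophobic alphabet, the 28-residue window, and the function names are all assumptions made for illustration.

```python
HYDROPHOBIC = set("AILMFVWY")  # assumption: a simple hydrophobic alphabet


def heptad_score(window: str, register: int) -> float:
    """Fraction of heptad positions 1 and 4 in `window` that are hydrophobic.

    `register` (0-6) is the 0-based heptad position assigned to window[0].
    """
    hits = 0
    total = 0
    for i, residue in enumerate(window):
        pos = (i + register) % 7
        if pos in (0, 3):  # heptad positions 1 and 4
            total += 1
            if residue in HYDROPHOBIC:
                hits += 1
    return hits / total if total else 0.0


def scan(sequence: str, window: int = 28) -> list[float]:
    """Best score over the seven possible registers for each full window."""
    scores = []
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        scores.append(max(heptad_score(segment, r) for r in range(7)))
    return scores


# Synthetic sequences, not real proteins: a perfect heptad repeat and a
# repeat with no hydrophobic residues at all.
ideal_coil = "LKELEDK" * 6
no_coil = "GSPGSPG" * 6
print(min(scan(ideal_coil)), max(scan(no_coil)))
```

Taking the best of the seven registers per window is what lets a scan of this kind register a stutter: where the heptad phase shifts, no single register fits the whole window and the score dips, which is the kind of gap described for CP49 rod 2.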
Figure 5 presents a graphic representation of eight human type I cytokeratins, and five CP49s. The probability of coiled-coil formation is scored on the y-axis, from 0 to 1, with 1 the maximal probability and 0.5 the cutoff. The amino acid residue is numbered on the x-axis. To permit visual comparison of overall rod domain size and the distribution of coil and noncoil subdomains, we analyzed the same sized fragment from each protein, a fragment that includes the rod domain, plus a small amount of flanking sequence (described in the Materials and Methods section). 
Figures 5a 5b 5c 5d 5e 5f 5g are human type I cytokeratins. Above Figure 5a is a schematic representation of the IF rod domain, with coil domains 1a, 1b, and 2 and the linkers that connect them shown. The conservation of domain features in type I cytokeratins is evident by comparison of the Paircoil plots. Coil 1a is conserved in size (number of amino acids on the x-axis) and shows a strength that falls between 0.5 and 1.0. Coil 1b is uniform in size and is generally maximal in strength. Coil 2 starts at maximal, but weakens toward the COOH terminus. 
Figures 5h 5i 5j 5k 5l are five CP49s. These, too, show features that emerged as generally common to the CP49s: The first region to exceed the 0.5 default cutoff in CP49s does not occur until a point that is equivalent to the second half of rod 1b in the type I cytokeratins. Thus, CP49s have no rod domain 1a (exhibiting a conserved lower probability). Rod domain 1b starts later and is overall shorter than its counterpart in type I cytokeratins. 
Rod domain 2 in the CP49s begins and ends at a site equivalent to that for rod 2 in the cytokeratins, but exhibits different features: The strength of the signal in CP49 rod 2 is generally lower, with human as an exception; and the CP49 rod 2 is subdivided by a gap in signal strength. This gap occurs where the heptad repeat pattern “stutters” (shifts phase). This stutter is demarcated by asterisks over residues 417-426 in Figure 3 . The presence of a stutter in the heptad repeat pattern is a well-conserved feature of rod domain 2 in IF proteins and aligns exactly with the interruption noted here. Thus, CP49s appear to have conserved this feature of IF proteins. 
Although variation exists in the Paircoil profiles of individual proteins, particularly in the more distant type I cytokeratins, the Paircoil analysis suggests conservation of patterns, regardless of where the cutoff value may be set. The most notable difference between CP49s and the other type I cytokeratins is that the CP49s all predict a shorter central rod domain. This observation is particularly interesting, because similar analysis of CP49’s assembly partner CP115 also showed a shorter rod domain. 2 15 This analysis, theoretical until structural data can be generated, suggests that the CP49s have diverged from the majority of type I cytokeratins in secondary structure as well. 
It has been postulated that the large number of IF proteins found in a given species have been derived from a single ancestral gene by divergent evolution, 24 an assumption based on the gene structure and sequence conservation seen among most IFs. Insight can be gained into the IF family tree by establishing percent accepted mutation (PAM) distances between IF proteins within a species as though they were homologues from different species. This analysis integrates the degree of sequence divergence between a given protein and all others in the comparison group and sums the process for each of the proteins within the group. Figure 6 shows such an analysis of 25 human IF proteins, including CP49 and CP115. The historic clustering of cytoplasmic IF proteins into types I through IV based on primary sequence of the central rod domain, tissue distribution, and gene structure is reiterated in such an analysis. The type I cytokeratins (K9-20), type II cytokeratins (K1-8), type III IF proteins (GFAP, DES, VIM, PER), and type IV neurofilament proteins (NFH, NFM, NFL) form distinct clusters. The two BF proteins, CP49 and CP115 (filensin), stand out as the most distant members of the human IF family. 
CP49s Versus Type I Cytokeratins
Figure 3 shows alignments of the five CP49s that have been sequenced to date. Below this, the individual residues that are 100% conserved among CP49s are depicted as CP49 conserved. Also shown, as type I conserved, are residues that were 100% conserved among 10 human type I cytokeratins, as well as examples of type I cytokeratins from mammalian, amphibian, fish, and avian species. Forty-three residues, all in the rod domain, are 100% conserved in this population of type I cytokeratins. However, the CP49s show identity at only 24 of these 43 residues. 
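Identifying residues that are 100% conserved amounts to finding alignment columns in which every sequence carries the same residue. A minimal sketch follows, using invented toy strings rather than real cytokeratin or CP49 sequences; the function name `conserved_columns` is hypothetical. The final lines simply restate the 24-of-43 figure from the text as a percentage.

```python
def conserved_columns(alignment: list[str], gap: str = "-") -> list[int]:
    """0-based indices of columns where all sequences share one residue."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have equal length")
    conserved = []
    for col in range(length):
        residues = {seq[col] for seq in alignment}
        if len(residues) == 1 and gap not in residues:
            conserved.append(col)
    return conserved


# Hypothetical aligned fragments, for illustration only.
toy_alignment = ["LNSCAE", "LNNCAE", "LHDCSE"]
print(conserved_columns(toy_alignment))

# Figures from the text: 43 residues conserved among type I cytokeratins,
# of which the CP49s match at 24.
TYPE_I_CONSERVED = 43
SHARED_WITH_CP49 = 24
print(f"{100.0 * SHARED_WITH_CP49 / TYPE_I_CONSERVED:.1f}% of type I conserved sites retained")
```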
Discussion
The determination that the BF proteins CP49 and CP115 are members of the IF family represented a transition point for IF biology. At the structural level, BF proteins were the first cytoplasmic IF proteins localized to a structure that was not a classic 8- to 11-nm IF. 3 4 5 6 More thorough examination of both BF proteins has revealed multiple departures from the features that were otherwise well-conserved among IF proteins. Thus, to accommodate the BF proteins, boundaries of the IF family had to be extended to proteins and structures that did not conform to many of the “rules” that had thus far bound IF proteins together as a family. 
In this report, we extend the analysis of the CP49s beyond the mammalian and avian sequences that have been reported thus far. The data presented herein established that some of the features hypothesized to be unique to the CP49s are conserved in a species from a third vertebrate order. This suggests that such features have resulted from strong selective pressure and are important to biological function in the lens fiber cells. Conversely, some of the features considered unique to the CP49s are less well conserved, being absent from the trout CP49. By expanding the database of CP49 sequence information, we contribute to identification of those residues, motifs, and properties that have been retained across a wider phylogenetic spectrum and help discriminate these from variations that may be species specific. 
One of the more interesting differences between the CP49s and other type I cytokeratins occurs at the “LNDR” motif found near the beginning of the central rod domain in all IF proteins. This motif is extremely well-conserved, not only among the several type I cytokeratins in humans, but also in type I cytokeratins from several vertebrate orders. The importance suggested by its strong conservation is confirmed by the sensitivity of this motif to disease-causing mutations. A large body of elegant work has established that a mutation in the LNDR motif that causes an R→C substitution at the fourth residue, for example, is the cause of one form of the human skin-blistering disorder epidermolysis bullosa simplex (EBS). 37 38 40 This substitution in some way compromises the capacity of epidermal IFs to provide the necessary resistance to mechanical trauma, resulting in a separation of epidermal layers and blistering. This has been confirmed by experimental introduction of these same mutations in mice. Yet in the CP49s, the presence of a C at that same fourth position is a conserved feature of all the CP49s thus far sequenced, including the trout. Further, this motif as a whole shows a relatively high degree of variability, even within the CP49s, with three permutations reported in the five species that have been sequenced. We note that allelic variation occurs even within the population of trout that were included in this study. The variability that is demonstrated at this motif suggests that it experiences a weaker selective pressure than its counterpart in the other type I cytokeratins, implying that it plays a less important role in the biology of this protein. 
It is interesting to speculate that the variability seen in the CP49 homologue of the LNDR motif is related to the changes seen in the predicted central rod domain of CP49. Paircoil analysis shown in Figure 5 suggests that the central rod domain of CP49 not only begins well after the LNDR motif but is also shorter in overall size than that seen in other IF proteins. If the LNDR motif represents a start point for anchoring the formation of a coiled-coil dimer, then the shifting of this start point farther downstream in CP49 may lower the selective pressure on the LNDR homologue in the CP49s, resulting in the emergence of sequence variability. This leads to the hypothesis that a CP49 should then show a high degree of sequence conservation at the beginning of its foreshortened rod domain. In fact, this is the case. One of the longest runs of absolutely conserved amino acids among the CP49s occurs at the very beginning of what is predicted by Paircoil analysis to be the rod domain (amino acids 239-250 in CP49 Conserved, Fig. 3 ). 
The difference between CP49s and the other type I cytokeratins at the LNDR motif is of particular interest because of the established importance of the motif in human disease. However, we noted similar variations between CP49s and the other residues that are highly conserved among type I cytokeratins. To identify residues that are likely to be critical to type I cytokeratins we conducted multiple alignment of all human type I cytokeratins, plus representatives of type I cytokeratins from several different vertebrate orders (Fig. 3) . We identified 43 residues that were 100% conserved in this population, implying functional importance. This suggests that any newly identified type I cytokeratin would have a very high probability of exhibiting the same residues at all or most of these sites. The CP49s are identical at only 24 of the 43 residues, further reinforcing the size of the gap that exists between the CP49s and the remainder of the type I cytokeratins. 
The mammalian and avian CP49s were noteworthy for the absence of the tail domain, a feature that looked to be unique among IF proteins and a conserved feature of the CP49s. 14 17 This would suggest that a tail domain is either not important in the function of the CP49 or perhaps even a detriment to it. However, the trout CP49 exhibits a 39-amino-acid tail, comparable in size, but not in sequence, to the tails typical of type I cytokeratins. 
In determining the location of introns for the trout CP49 gene we noted that the trout CP49 gene does not have the intron H (Fig. 2) . This intron is commonly found at the very end of the rod domain among type I, II, and III IF proteins and in human CP49 as well. The absence of intron H in the trout CP49 gene raises the possibility that the trout CP49 acquired a tail domain through a mutation that eliminated a splice site. In the absence of selective pressure against the tail domain, this feature may have persisted. It is worth noting that the chicken CP49 also has no intron H, 30 but also that the chicken CP49 has no tail domain. 
Alignment of the CP49s shows a strong run of absolutely conserved sequence in the head domain of CP49s (RRALGISSVFLQGLRS, starting at aa residue 102 in Fig. 3 ). This region stands out not only because this sequence is so well conserved in CP49s, but also because there is no comparably conserved region in the type I cytokeratins, suggesting that this region of the head domain is important to CP49. 
Among the members of an IF type, such as the type I cytokeratins, the level of sequence identity, the conservation of rod domain gene structure, and the similarity in properties is generally quite strong. The CP49s are clearly an exception to this tendency. CP49s do, in fact, have a rod domain gene structure that is similar to that of most type I cytokeratins 14 and that is distinct from all other IF types. They are thus clearly type I cytokeratins. However, beyond this, the relationship to the type I cytokeratins weakens considerably, suggesting that the CP49s have experienced a dramatically different set of selective pressures, resulting in considerable change in both primary sequence and secondary structure. CP49s are closer in primary sequence identity to the type I cytokeratins than to other IF types, but only marginally so. Similarly, the motifs and residues that are either absolutely or at least extremely well conserved among the type I cytokeratins have undergone extensive divergence in the CP49s. The same contrast emerges for features such as secondary structure. The ultimate question, of course, is how these differences adapt the CP49 function to the unique biology of the lens fiber cell. To answer this question will undoubtedly require loss-of-function–gain-of-function mutational studies that target areas of the CP49s identified as strongly conserved through analysis, such as that presented in this report. 
It may be predicted that the production of a BF instead of an intermediate filament would require substantial changes in primary and secondary structure. Such a hypothesis is appealing because it rationalizes the existence of two distinct categories of IF: the classic 8- to 11-nm IFs and the BFs. In such a case, the features and sequence motifs that are conserved among the IF proteins would be relevant to assembly into 10-nm IFs, whereas those divergences cataloged in the BF proteins would explain their alternative assembly outcome. However, Goulielmos et al. 41 and Carter et al. 42 have reported that the BF proteins assemble in vitro into classic 10-nm filaments. Thus, the rather dramatic changes seen in the BF proteins do not eliminate their capacity to assemble into 10-nm filaments. This is a striking observation given the extreme degree to which the CP49, as well as its assembly partner CP115, has varied from the IF consensus. 
That CP49 and CP115 form a heteropolymer only adds to the complexity of the story. In this vein, it is worth noting that one of CP115’s most unusual features has been a shortened central rod domain. 2 15 Multiple alignment showed that the CP115 has homologues of the LNDR and TYRKLLEGE motifs that demarcate the beginning and end of the rod domain, but that these motifs are 28 amino acids closer together in CP115 than in any other member of the IF family. 15 Paircoil analysis of the CP115 confirms that the predicted rod domain is shorter as well and is more in line with that predicted for the CP49. This leads to the hypothesis that CP49 and CP115 form heterodimers with central rod domains that are matched in size. Yeast two-hybrid data support the existence of a CP49-CP115 heterodimer. 15 Although such predictions must be confirmed by experimental data, they prompt consideration of the need for mutual evolution of these two assembly partners. 
Neither IF nor BF proteins have been crystallized; thus, their structure and the filaments they form are understood in only the sketchiest of terms. It will be most interesting to finally understand exactly what fiber cell–specific functions are permitted by the unusual divergence seen in the BF proteins. Of equal interest will be the determination of how point mutations in structural proteins such as these are causative in some forms of inherited human cataract. 11 12  
 
Figure 1.
 
Nucleotide and amino acid sequence of trout CP49. Polymorphism was noted at nucleotide 296, where both G and A were represented in the population of trout studied. This variability resulted in triplet codons encoding either amino acid S (AGC) or N (AAC). Locations of identified rod domain introns are denoted by ▾ and a letter (▾A) to identify the intron for reference’s sake. Amino acids used for development of degenerate primers for initial PCR amplification are underscored. The location and orientation of nucleotides used to develop primer sets for PCR identification of putative intron B are underscored and include a ▾ for orientation.
Figure 2.
 
Nucleotide sequences of intron–exon boundaries and intron sizes are shown for those introns identified in the region of the trout CP49 gene encoding the central rod domain. Introns are labeled with letters in accordance with Hess et al. 14 The approximate positions of the introns relative to a type I cytokeratin rod domain are shown.
Figure 3.
 
The sequences of CP49s from five species are aligned, and residues that are 100% conserved are identified as CP49 conserved. The 1 and 4 positions of the heptad repeats in the predicted coiled-coil domains are indicated above a given sequence as 1 and 4. Discontinuity in the heptad repeat (referred to as the stutter) is indicated by a string of asterisks. The two residues implicated in cataractogenesis in two human families with juvenile-onset autosomal dominant cataract are each marked with a symbol (symbol images not available). The CP49 homologues to the two well-conserved motifs found at the beginning and end of the rod domain in IF proteins are in bold. To identify type I cytokeratin residues that are strongly conserved, 10 human cytokeratins (cytokeratins 10 and 12-20), and examples of type I cytokeratins from cow, mouse, goldfish, chicken, sheep, and Xenopus were aligned, and residues that were 100% conserved were identified. The conserved residues are indicated as Type I Conserved. Tr, trout; Ch, chicken; Hu, human; Bo, bovine; Mo, mouse.
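The notion of a 100% conserved residue used in this figure can be sketched as follows. This is only an illustration that assumes the sequences have already been aligned to equal length with "-" as the gap character. The function name `conserved_columns` and the toy alignment are hypothetical and are not data from this study.

```python
def conserved_columns(aligned, gap="-"):
    """Return (column index, residue) for every alignment column in which
    all sequences carry the same residue and none carries a gap."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    conserved = []
    for i in range(length):
        column = {seq[i] for seq in aligned}
        # A single distinct character that is not a gap means 100% conservation.
        if len(column) == 1 and gap not in column:
            conserved.append((i, aligned[0][i]))
    return conserved


# Toy alignment (hypothetical): column 3 varies and column 4 contains a gap.
toy_alignment = ["LNDRLA", "LNDKLA", "LNDR-A"]
print(conserved_columns(toy_alignment))
```

Column indices here are zero-based positions in the alignment, not residue numbers in any one protein.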
Figure 4.
 
Immunoblot of trout (lane A) and bovine (lane B) buffer-insoluble lens fractions, probed with antiserum raised against recombinant mouse CP49. The retarded mobility of the trout CP49 relative to the bovine CP49 is evident.
Figure 5.
 
Results of Paircoil analysis are plotted for eight human type I cytokeratins. (a–g) Human (h) cytokeratins (K) 10, 19, 20, 12, 16, 15, 14; (h–l) are CP49s. The x-axis plots amino acid number. To permit alignment of the plots, the same sized fragment was used from each protein. The fragment includes the entire rod domain, plus small flanking sequences from each end. The y-axis plots coiled-coil probability on a 0 to 1 scale. The horizontal dotted line across each plot is the 0.5 value. Each central rod domain contains regions of α helical coiled-coil (gray boxes in schematics, labeled 1a, 1b, and 2) and nonhelical linker regions (black lines in schematics). Schematics that contrast the distribution of coiled-coil domains in the type I cytokeratins and the CP49s are between (g) and (h).
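One way to read such plots is to collect the stretches of residues whose coiled-coil probability lies at or above the 0.5 line. The sketch below does not run Paircoil; it assumes a list of per-residue probabilities is already available. The function name `coiled_coil_segments`, the use of 0.5 as an inclusive cutoff, and the toy values are assumptions made for illustration.

```python
def coiled_coil_segments(probs, threshold=0.5):
    """Return (first index, last index) pairs for each run of consecutive
    residues whose probability is at or above the threshold."""
    segments = []
    start = None
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i  # a new run begins
        elif p < threshold and start is not None:
            segments.append((start, i - 1))  # the run ended at the previous residue
            start = None
    if start is not None:
        # The last run extends to the end of the sequence.
        segments.append((start, len(probs) - 1))
    return segments


# Toy probabilities (hypothetical), not Paircoil output from this study.
toy_probs = [0.1, 0.9, 0.95, 0.2, 0.8, 0.7, 0.6]
print(coiled_coil_segments(toy_probs))
```

Gaps between the returned segments would correspond to candidate linker regions, subject to the caveat that a fixed cutoff is a convenience and not part of the published analysis.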
Figure 6.
 
Family tree plot of the human IF proteins. Twenty-seven human IF proteins were submitted. It is evident that the protein clustering reflects the type grouping that has been historically used, a grouping based on primary sequence, gene structure, and tissue distribution. The groups shown are type I cytokeratins (K9–10, 12–20), type II cytokeratins (K1–8), type III IF proteins, type IV neurofilament proteins (NFH, NFM, NFL), and the BF proteins (CP49 and CP115). Vim, vimentin; GFAP, glial fibrillary acidic protein; Des, desmin; Per, peripherin.
1. Maisel H, Perry MM. Electron microscope observations on some structural proteins of the chick lens. Exp Eye Res. 1972;14:7–12.
2. Gounari F, Merdes A, Quinlan R, Hess J, FitzGerald PG, Ouzounis CA, et al. Bovine filensin possesses primary and secondary structure similarity to intermediate filament proteins. J Cell Biol. 1993;121:847–853.
3. Ireland M, Maisel H. A cytoskeletal protein unique to lens fiber cell differentiation. Exp Eye Res. 1984;38:637–645.
4. Ireland M, Maisel H. A family of lens fiber cell specific proteins. Lens Eye Toxic Res. 1989;6:623–638.
5. FitzGerald PG, Gottlieb W. The Mr 115 kd fiber cell-specific protein is a component of the lens cytoskeleton. Curr Eye Res. 1989;8:801–811.
6. FitzGerald PG, Graham D. Ultrastructural localization of alpha A-crystallin to the bovine lens fiber cell cytoskeleton. Curr Eye Res. 1991;10:417–436.
7. FitzGerald PG. Methods for the circumvention of problems associated with the study of the ocular lens plasma membrane-cytoskeleton complex. Curr Eye Res. 1990;9:1083–1097.
8. FitzGerald PG, Casselman J. Immunologic conservation of the fiber cell beaded filament. Curr Eye Res. 1991;10:471–478.
9. Blankenship TN, Hess JF, FitzGerald PG. Development- and differentiation-dependent reorganization of intermediate filaments in fiber cells. Invest Ophthalmol Vis Sci. 2001;42:735–742.
10. Sandilands A, Prescott AR, Carter JM, et al. Vimentin and CP49/filensin form distinct networks in the lens which are independently modulated during lens fibre cell differentiation. J Cell Sci. 1995;108:1397–1406.
11. Conley YP, Erturk D, Keverline A, et al. A juvenile-onset, progressive cataract locus on chromosome 3q21–q22 is associated with a missense mutation in the beaded filament structural protein-2. Am J Hum Genet. 2000;66:1426–1431.
12. Jakobs PM, Hess JF, FitzGerald PG, Kramer P, Weleber RG, Litt M. Autosomal-dominant congenital cataract associated with a deletion mutation in the human beaded filament protein gene BFSP2. Am J Hum Genet. 2000;66:1432–1436.
13. Hess JF, Casselman JT, FitzGerald PG. cDNA analysis of the 49 kDa lens fiber cell cytoskeletal protein: a new, lens-specific member of the intermediate filament family? Curr Eye Res. 1993;12:77–88.
14. Hess JF, Casselman JT, FitzGerald PG. Gene structure and cDNA sequence identify the beaded filament protein CP49 as a highly divergent type I intermediate filament protein. J Biol Chem. 1996;271:6729–6735.
15. Hess JF, Casselman JT, Kong AP, FitzGerald PG. Primary sequence, secondary structure, gene structure, and assembly properties suggests that the lens-specific cytoskeletal protein filensin represents a novel class of intermediate filament protein. Exp Eye Res. 1998;66:625–644.
16. Remington SG. Chicken filensin: a lens fiber cell protein that exhibits sequence similarity to intermediate filament proteins. J Cell Sci. 1993;105:1057–1068.
17. Sawada K, Agata J, Eguchi G, Quinlan R, Maisel H. The predicted structure of chick lens CP49 and a variant thereof, CP49ins, the first vertebrate cytoplasmic intermediate filament protein with a lamin-like insertion in helix 1B. Curr Eye Res. 1995;14:545–553.
18. Orii H, Agata K, Sawada K, Eguchi G, Maisel H. Evidence that the chick lens cytoskeletal protein CP 49 belongs to the family of intermediate filament proteins. Curr Eye Res. 1993;12:583–588.
19. Masaki S, Quinlan RA. Gene structure and sequence comparisons of the eye lens specific protein, filensin, from rat and mouse: implications for protein classification and assembly. Gene. 1997;201:11–20.
20. Masaki S, Watanabe T. cDNA sequence analysis of CP94: rat lens fiber cell beaded-filament structural protein shows homology to cytokeratins. Biochem Biophys Res Commun. 1992;186:190–198.
21. Chomczynski P, Sacchi N. Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal Biochem. 1987;162:156–159.
22. Corpet F. Multiple sequence alignment with hierarchical clustering. Nucleic Acids Res. 1988;16:10881–10890.
23. Berger B, Wilson DB, Wolf E, Tonchev T, Milla M, Kim PS. Predicting coiled coils by use of pairwise residue correlations. Proc Natl Acad Sci USA. 1995;92:8259–8263.
24. Dodemont H, Riemer D, Weber K. Structure of an invertebrate gene encoding cytoplasmic intermediate filament (IF) proteins: implications for the origin and the diversification of IF proteins. EMBO J. 1990;9:4083–4094.
25. Zewe M, Hoger TH, Fink T, Lichter P, Krohne G, Franke WW. Gene structure and chromosomal localization of the murine lamin B2 gene. Eur J Cell Biol. 1991;56:342–350.
26. Herrmann HH, Robin J, eds. Intermediate Filaments. New York: Plenum Press; 1998.
27. Fuchs E, Weber K. Intermediate filaments: structure, dynamics, function, and disease. Annu Rev Biochem. 1994;63:345–382.
28. Albers K, Fuchs E. The molecular biology of intermediate filament proteins. Int Rev Cytol. 1992;134:243–279.
29. Lin F, Worman HJ. Structural organization of the human gene encoding nuclear lamin A and nuclear lamin C. J Biol Chem. 1993;268:16321–16326.
30. Wallace P, Signer E, Paton IR, Burt D, Quinlan R. The chicken CP49 gene contains an extra exon compared to the human CP49 gene which identifies an important step in the evolution of the eye lens intermediate filament proteins. Gene. 1998;211:19–27.
31. Lussier M, Filion M, Compton JG, Nadeau JH, Lapointe L, Royal A. The mouse keratin 19-encoding gene: sequence, structure and chromosomal assignment. Gene. 1990;95:203–213.
32. Chan YM, Yu QC, Fine JD, Fuchs E. The genetic basis of Weber-Cockayne epidermolysis bullosa simplex. Proc Natl Acad Sci USA. 1993;90:7414–7418.
33. Chan Y, Anton-Lamprecht I, Yu QC, et al. A human keratin 14 “knockout”: the absence of K14 leads to severe epidermolysis bullosa simplex and a function for an intermediate filament protein. Genes Dev. 1994;8:2574–2587.
34. Coulombe PA, Fuchs E. Elucidating the early stages of keratin filament assembly. J Cell Biol. 1990;111:153–169.
35. Coulombe PA, Fuchs E. Epidermolysis bullosa simplex. Semin Dermatol. 1993;12:173–190.
36. Fuchs E, Esteves RA, Coulombe PA. Transgenic mice expressing a mutant keratin 10 gene reveal the likely genetic basis for epidermolytic hyperkeratosis. Proc Natl Acad Sci USA. 1992;89:6906–6910.
37. Fuchs E, Coulombe PA. Of mice and men: genetic skin diseases of keratin. Cell. 1992;69:899–902.
38. Fuchs E, Coulombe P, Cheng J, et al. Genetic bases of epidermolysis bullosa simplex and epidermolytic hyperkeratosis. J Invest Dermatol. 1994;103(Suppl 5):25S–30S.
39. Steinert PM, Marekov LN, Parry DA. Diversity of intermediate filament structure. Evidence that the alignment of coiled-coil molecules in vimentin is different from that in keratin intermediate filaments. J Biol Chem. 1993;268:24916–24925.
40. Fuchs E. Intermediate filaments and disease: mutations that cripple cell strength. J Cell Biol. 1994;125:511–516.
41. Goulielmos G, Gounari F, Remington S, et al. Filensin and phakinin form a novel type of beaded intermediate filaments and coassemble de novo in cultured cells. J Cell Biol. 1996;132:643–655.
42. Carter JM, Hutcheson AM, Quinlan RA. In vitro studies on the assembly properties of the lens proteins CP49, CP115: coassembly with alpha-crystallin but not with vimentin. Exp Eye Res. 1995;60:181–192.