We sought to examine the transcriptomes of isolated lens epithelium and fibers generated in various studies at different stages, spanning embryonic and early postnatal development through adulthood and aging. We identified several publicly available transcriptome datasets, generated on microarrays or by RNA-seq, that represent stages E12.5, E14.5, E16.5, E18.5, P0a (Cvekl data), P0b (Robinson data), P13, three months, six months, and two years and that could be used in this meta-analysis (Table, Supplementary Table S1). Of these, two datasets were generated on microarray platforms (E12.5, P13), whereas eight were generated by high-throughput RNA-seq. Meta-analysis was performed on these eight RNA-seq lens expression datasets (E14.5, E16.5, E18.5, P0a, P0b, three months, six months, and two years).

We first focused on meta-analysis of the isolated lens epithelium and fiber cell RNA-seq data by applying the outlined workflow to each of these datasets (Fig. 1). Reanalyzing every dataset with the same workflow minimizes the variability that arises from the different analytical methods used in the original reports, and thereby allows comparable downstream studies across these datasets. This RNA-seq meta-analysis identified 15,411 genes expressed in the lens epithelium or fiber cells in at least one stage (log2 CPM > 0) (Fig. 2A). We next identified 2841 genes with robust expression (log2 CPM > 1) that were also significantly differentially expressed (FDR < 0.05) between lens epithelium and fiber cells across all developmental stages (E14.5, E16.5, E18.5, P0a, P0b, three months, six months, and two years) (Fig. 2A, Supplementary Table S2).

A heat-map representation of the differential expression of these 2841 candidates shows separation of epithelium-expressed and fiber-expressed genes (Fig. 2B). Furthermore, these differentially expressed genes segregate according to the age of the lens, regardless of the origin of the sample (Fig. 2B). For example, the three-month data (generated in the Duncan laboratory) and the six-month data (generated in the Fan laboratory) cluster more closely with each other than the three-month data do with the two-year data, even though the latter two were generated simultaneously in the same laboratory (Duncan laboratory). Likewise, the embryonic/early postnatal data (E14.5 through P0; generated by two different laboratories, namely, Cvekl and Robinson) cluster together and segregate from the adult/aged data (three months through two years).

We then classified the 2841 significantly differentially expressed genes across all stages using a fold-change threshold (|log2 FC| > 1), distinguishing candidates with distinct differential expression (consistently higher in epithelium than in fiber, or vice versa) from those with dynamic differential expression (higher in one cell type at certain stages but showing the opposite pattern at other stages). This identified 1145 genes as higher in epithelium and 1427 genes as higher in fiber (Fig. 2A). It also identified 172 genes that do not pass the stringent fold-change threshold (|log2 FC| > 1). Although these genes are differentially expressed to a lesser extent, their differential expression remains statistically significant at all stages. Finally, 97 genes exhibit dynamic differential expression between epithelium and fiber cells at different stages. A few representative genes that exhibit these trends are shown (Fig. 2C).
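The filtering and classification described above can be illustrated with a minimal Python sketch. It is not the workflow used in the study. The function names, the per-stage list inputs, and the decision rules are assumptions made for illustration: the text states the cutoffs (log2 CPM > 1, FDR < 0.05, |log2 FC| > 1) but not how they were combined across stages. Here a gene is assumed to need both filters at every stage, and a single stage beyond the fold-change cutoff is assumed to be enough to assign a direction.

```python
from typing import Sequence

LOG2_CPM_CUTOFF = 1.0  # robust-expression cutoff quoted in the text
FDR_CUTOFF = 0.05      # significance cutoff quoted in the text
LOG2_FC_CUTOFF = 1.0   # fold-change cutoff quoted in the text


def is_candidate(log2_cpm: Sequence[float], fdr: Sequence[float]) -> bool:
    """Return True if a gene passes both filters at every stage.

    Assumption: each list holds one value per stage, and log2_cpm holds the
    higher of the epithelium and fiber values for that stage.
    """
    robust = all(c > LOG2_CPM_CUTOFF for c in log2_cpm)
    significant = all(q < FDR_CUTOFF for q in fdr)
    return robust and significant


def classify(log2_fc: Sequence[float]) -> str:
    """Label a candidate from its per-stage log2(epithelium / fiber) values.

    Assumption: one stage beyond the cutoff is enough to assign a direction;
    opposite directions at different stages make the gene "dynamic".
    """
    epi = any(fc > LOG2_FC_CUTOFF for fc in log2_fc)
    fib = any(fc < -LOG2_FC_CUTOFF for fc in log2_fc)
    if epi and fib:
        return "dynamic"
    if epi:
        return "higher_in_epithelium"
    if fib:
        return "higher_in_fiber"
    return "below_threshold"


# Invented values for hypothetical genes (not data from the study).
print(classify([2.1, 1.8, 2.5]))   # higher_in_epithelium
print(classify([1.5, -1.7, 0.2]))  # dynamic
print(classify([0.4, -0.6, 0.9]))  # below_threshold

# The four reported categories account for every candidate gene.
assert 1145 + 1427 + 172 + 97 == 2841
```

The final assertion only checks that the four category counts reported in the text sum to the 2841 candidates. It does not reproduce the analysis itself.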