Abstract
Purpose:
The principal challenge of high-throughput expression profiling approaches is to effectively prioritize candidates that function in tissue morphogenesis or homeostasis. Recently, we demonstrated that for lens microarray datasets, an approach termed “in silico subtraction”, involving comparative analysis to a reference whole embryo body (WB) tissue dataset, allows the estimation of lens-enrichment scores for candidate genes. These scores are excellent predictors of significance to lens biology and are the basis of the web-tool iSyTE, which has identified several new cataract genes. Here, we test the hypothesis that WB in silico subtraction can be extended to process lens RNA-seq data and prioritize candidates important to lens biology and cataract.
Methods:
RNA-seq data on mouse E15.5 lens tissue generated using the Illumina HiSeq platform were obtained from NCBI-GEO. In addition, new RNA-seq datasets were generated for P0 and P4 lens and WB tissue. Data pre-processing and adapter trimming were performed for each RNA-seq dataset, and short reads were aligned against the Mus musculus reference genome using TopHat v2.0.9. The novel-junc function was used to estimate expression levels of gene isoforms, and Cufflinks v2.1.1 was used to calculate normalized fragment counts.
Results:
RNA-seq data from E15.5, P0 and P4 mouse lens were compared to the WB RNA-seq dataset to generate expression profiles of lens-enriched candidate genes. In silico subtracted datasets for all three lens stages effectively identified known genes linked to lens development and cataract. Significantly, this analysis could distinguish between individual lens-enriched gene isoforms. When tested for gene ontology (GO) clustering using DAVID analysis, in silico subtracted lens expression profiles, in sharp contrast to unsubtracted lens expression profiles, were highly enriched in GO categories for lens development, eye development, eye morphogenesis and sensory organ development, indicating the utility of this approach.
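The lens-versus-WB comparison described above can be illustrated with a minimal sketch. It assumes that enrichment is scored as a log2 ratio of normalized lens expression to normalized WB expression. The abstract does not state the exact scoring formula, so the ratio, the pseudocount, the cutoff, the function names and the gene names and values below are all hypothetical and serve only to show the idea.

```python
import math


def enrichment_scores(lens, wb, pseudocount=1.0):
    """Score each gene as log2((lens + p) / (wb + p)).

    ASSUMPTION: a log2 ratio with a pseudocount is used purely for
    illustration; it is not taken from the source text.
    Genes missing from the WB reference are skipped.
    """
    scores = {}
    for gene, lens_value in lens.items():
        if gene not in wb:
            continue
        ratio = (lens_value + pseudocount) / (wb[gene] + pseudocount)
        scores[gene] = math.log2(ratio)
    return scores


def rank_candidates(scores, min_score=1.0):
    """Keep genes at or above a hypothetical cutoff, highest score first."""
    kept = [(gene, s) for gene, s in scores.items() if s >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Made-up normalized expression values for three placeholder genes.
lens_expr = {"geneA": 255.0, "geneB": 15.0, "geneC": 3.0}
wb_expr = {"geneA": 3.0, "geneB": 15.0, "geneC": 31.0}

scores = enrichment_scores(lens_expr, wb_expr)
ranked = rank_candidates(scores)
print(ranked)
```

In this toy input, only the gene that is much more abundant in lens than in WB passes the cutoff; a gene expressed equally in both tissues scores zero and is filtered out, which is the intuition behind subtracting a whole-body reference.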
Conclusions:
In sum, these data demonstrate that in silico subtraction analysis can be successfully applied to lens RNA-seq data to prioritize new candidate genes important for lens biology and cataract. Significantly, these analyses will serve to identify candidate genes missed by lens microarray analysis as well as to distinguish between different isoforms of individual genes expressed in the lens.