One goal of this study was to generate a comprehensive picture of gene expression in the human macula that is accurate and readily accessible and that can be used as a resource to identify and quantitate cell-type–specific or –associated genes. To this end, we integrated large-scale expression data obtained from this tissue by different technologies (SAGE, longSAGE, and cDNA microarrays) into a database that we named
EyeSAGE. Starting with the short tag retina and RPE/choroid SAGE libraries summarized in
Table 2, 160,723 unique tags were used as the first building block (column) for the
EyeSAGE database. Each tag was analyzed and assigned a best gene match,
23 and a UniGene cluster assignment (based on NCBI Build 182), if available. Genomic map positions (as nucleotide numbers along the chromosome) were assigned as previously described.
24 Columns of tag counts, normalized to 200,000 tags per library, were added for each tag in each posterior eye library. The
4cRET longSAGE library was incorporated by matching the longSAGE tags, together with their reliable best gene matches (based on CGAP’s SAGE Genie assignments), to the corresponding short tags, each of which is the first 10 bases of the 17-bp tag. Next, the tag counts for each tag in 39 additional normal tissue SAGE libraries (available at SAGE Genie) were added to incorporate expression information from a variety of tissue and cell types. Incyte cDNA microarray expression data from peripheral and macular retina were imported and linked to the SAGE data after BLAT homology searches were used to assign a UniGene cluster number (Build 182) to each microarray probe. Following the convention at CGAP’s SAGE Genie (http://cgap.nci.nih.gov/SAGE/AnatomicViewer), the SAGE libraries were normalized to 200,000 tags for pairwise comparisons. The entire
EyeSAGE database in Access was sorted by tag number, and genes whose expression (tags) totaled five or more in the eight posterior-eye shortSAGE libraries combined (normalized to 200,000 tags, and therefore corresponding to approximately two or more raw tag counts per library) were exported into spreadsheet software (Excel; Microsoft; the entire Microsoft Access version of
EyeSAGE is available on request). This step removes unique tags that occur as singletons in only one retina or RPE/choroid library. This version of the
EyeSAGE database was used for subsequent data mining (available at NEIBank, http://neibank.nei.nih.gov/index.shtml). The
EyeSAGE database is an easily searchable, comprehensive expression dataset representing the posterior eye transcriptome. In its current form,
EyeSAGE can be used to analyze tissue and cell-type expression of single genes or classes of genes, or to display ocular expression over user-defined genomic regions. It can also be mined to generate large-scale views of cell-type expression. Examples of specific queries follow.
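The construction steps described above (normalizing each library to 200,000 tags, mapping 17-bp longSAGE tags to their 10-base short tags, and keeping only tags whose combined normalized count is five or more) can be illustrated with a minimal Python sketch. This is not the Access/Excel procedure used to build the database. The function names, the dictionary layout, and the choice to sum the counts of longSAGE tags that share the same first 10 bases are assumptions made for illustration only.

```python
from collections import defaultdict

NORMALIZED_TOTAL = 200_000  # library size used for normalization in the text
MIN_COMBINED_COUNT = 5      # export threshold described in the text


def normalize_library(raw_counts, total=NORMALIZED_TOTAL):
    """Scale raw tag counts so that the library sums to `total` tags."""
    library_size = sum(raw_counts.values())
    if library_size == 0:
        return {}
    return {tag: count * total / library_size
            for tag, count in raw_counts.items()}


def collapse_long_tags(long_counts, short_len=10):
    """Map each longSAGE tag to its short tag (its first `short_len` bases).

    Assumption: counts of long tags sharing the same short tag are summed.
    """
    short_counts = defaultdict(int)
    for long_tag, count in long_counts.items():
        short_counts[long_tag[:short_len]] += count
    return dict(short_counts)


def filter_tags(normalized_libraries, minimum=MIN_COMBINED_COUNT):
    """Keep tags whose normalized counts, summed over all libraries,
    reach `minimum`. Returns a dict of tag -> combined normalized count."""
    combined = defaultdict(float)
    for library in normalized_libraries.values():
        for tag, value in library.items():
            combined[tag] += value
    return {tag: total for tag, total in combined.items() if total >= minimum}
```

In this sketch each library is a plain dictionary of tag sequence to count, and `filter_tags` takes a dictionary of such libraries keyed by a hypothetical library name. A tag seen once in a single library of about 100,000 raw tags normalizes to roughly 2 and is dropped by the filter, which is consistent with the removal of singleton tags noted above.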