Molecular Biotechnology Services
PO Box 201
300 George Street
New Haven, CT 06511
Tel: 203.785.7869
Fax: 203.785.7919
microarrays@yale.edu
"In order to support the public use and dissemination of gene expression data, NCBI has launched the Gene Expression Omnibus, or GEO. GEO is a gene expression and hybridization array data repository, as well as an online resource for the retrieval of gene expression data from any organism or artificial source. Many types of gene expression data from platform types such as nucleotide, antibody and tissue arrays and serial analysis of gene expression (SAGE) data, will be accepted, accessioned, and archived as public data sets."
At the most basic level of organization of GEO, there are four entity types that may be supplied by users:
| Submitter | Contains contact and authentication information about the submitter. A submitter entity may have relationships to many platforms, many samples, and many series. |
| Platform | Describes the list of elements on the array (e.g., cDNAs, oligonucleotide probesets, ORFs, antibodies) or the list of elements in that experiment. Each platform record is assigned a unique and stable GEO accession number (GPLxxx). A platform may reference many samples that have been submitted by multiple submitters. |
| Sample | Describes the conditions under which an individual sample was handled, the manipulations it underwent, and the measurement of each element derived from it. Each sample record is assigned a unique and stable GEO accession number (GSMxxx). A sample entity must reference only one platform and may be included in multiple series. |
| Series | Defines a set of related samples considered to be part of a group, how the samples are related, and if and how they are ordered. A series provides a focal point and description of the experiment as a whole. Series records may also contain tables describing extracted data, summary conclusions, or analyses. Each series record is assigned a unique and stable GEO accession number (GSExxx). |
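The relationships in the table above can be sketched as a small data model. This is only an illustration of the rules stated there (a sample references exactly one platform and may appear in several series); the class names, the `accession_kind` helper, and the example accession numbers other than the GPL prefix convention are hypothetical and not part of GEO.

```python
import re
from dataclasses import dataclass, field

# Accession prefixes taken from the table: GPL (platform), GSM (sample), GSE (series).
ACCESSION_RE = re.compile(r"(GPL|GSM|GSE)\d+")
KINDS = {"GPL": "platform", "GSM": "sample", "GSE": "series"}


def accession_kind(accession):
    """Return 'platform', 'sample' or 'series' for a GEO-style accession, else None."""
    match = ACCESSION_RE.fullmatch(accession)
    return KINDS[match.group(1)] if match else None


@dataclass
class Sample:
    accession: str  # hypothetical, e.g. "GSM1"
    platform: str   # exactly one platform accession, e.g. "GPL977"

    def __post_init__(self):
        if accession_kind(self.accession) != "sample":
            raise ValueError(f"not a sample accession: {self.accession}")
        if accession_kind(self.platform) != "platform":
            raise ValueError(f"not a platform accession: {self.platform}")


@dataclass
class Series:
    accession: str                               # hypothetical, e.g. "GSE1"
    samples: list = field(default_factory=list)  # ordered list of Sample objects


# One sample may be included in more than one series.
shared = Sample("GSM1", "GPL977")
series_a = Series("GSE1", [shared])
series_b = Series("GSE2", [shared])
```

Keeping the platform as a single field on `Sample`, and the sample list on `Series`, mirrors the one-platform, many-series rule without needing any further bookkeeping.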
Keck GEO information:
The following Keck arrays have been assigned GEO platform IDs (GEO accession numbers of the form GPLxxx):
| GEO Platform ID | Keck Array Name | Description |
| GPL977 | NIA15K | Mouse 15K cDNA array - NIA Collection - 15,267 spots |
| GPL995 | AR9.2K | Arabidopsis 9.2K cDNA array - MSU/Ohio Stock - 9,216 spots - RETIRED |
| GPL992 | AR12K | Arabidopsis 12K cDNA array - MSU/Ohio Stock - 11,960 spots |
| GPL989 | OHU16K | Human 16K 70mer oligo array - Operon - 16,659 spots |
| GPL990 | OHU21K | Human 21K 70mer oligo array - Operon - 21,329 spots |
| GPL987 | OMM13K | Mouse 13K 70mer oligo array - Operon - 13,443 spots - RETIRED |
| GPL986 | OMM16K | Mouse 16K 70mer oligo array - Operon - 16,463 spots |
| GPL988 | OAR27K | Arabidopsis 27K 70mer oligo array - Operon - 26,090 spots |
| GPL991 | ORR4.8K | Rat 4.8K 65mer oligo array - Compugen - 4,854 spots |
Data may be submitted to GEO by either Web Deposit or Direct Deposit; see the GEO Deposit Guide page.
Data from diverse sources must be converted to SOFT format for upload to GEO. SOFT (Simple Omnibus Format in Text) is an ASCII text format designed as a machine-readable representation of data retrieved from, or submitted to, GEO. SOFT is also a line-based format, which makes it easy to parse with commonly available text processing and formatting languages. For a complete description of the SOFT format, see the SOFT guide.
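Because SOFT is line-based, a small parser is enough to read it. The sketch below assumes the line prefixes commonly described for SOFT (`^` starts an entity, `!` is an attribute, `#` describes a table column, and any other line is a tab-delimited table row); check these against the SOFT guide before relying on them. The function name `parse_soft` and the example record are made up for illustration.

```python
def parse_soft(text):
    """Split SOFT text into entities with attributes, column notes and table rows."""
    entities = []
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("^"):
            # Entity indicator line, e.g. "^SAMPLE = name"
            kind, _, name = line[1:].partition("=")
            current = {"type": kind.strip(), "name": name.strip(),
                       "attributes": {}, "columns": {}, "rows": []}
            entities.append(current)
        elif current is None:
            continue  # ignore anything before the first entity
        elif line.startswith("!"):
            key, sep, value = line[1:].partition("=")
            if sep:
                # An attribute label may repeat, so collect values in a list.
                current["attributes"].setdefault(key.strip(), []).append(value.strip())
            # Lines without "=" (table begin/end markers) carry no value.
        elif line.startswith("#"):
            key, _, value = line[1:].partition("=")
            current["columns"][key.strip()] = value.strip()
        else:
            current["rows"].append(line.split("\t"))
    return entities


# Made-up example record; the values are not real data.
EXAMPLE = "\n".join([
    "^SAMPLE = example_hyb_1",
    "!Sample_title = Example hybridization",
    "!Sample_platform_id = GPL977",
    "#ID_REF = Spot identifier",
    "#VALUE = Normalized log ratio",
    "!sample_table_begin",
    "ID_REF\tVALUE",
    "1\t0.25",
    "2\t-1.10",
    "!sample_table_end",
])

parsed = parse_soft(EXAMPLE)
```

In this sketch the first entry of `rows` is the column header row of the data table; a caller that wants numeric values would skip it and convert the remaining cells.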
If your data are already in a database, or if you have many samples to submit, it is likely that submission of data via Direct Deposit in SOFT format is the most convenient deposit route. This process was designed for rapid batch submission of data; SOFT files may be readily produced from common spreadsheet and database applications. A detailed SOFT guide is available for review.
YMD - Batch mode request for data in GEO upload SOFT format.
If your data are already in the Yale Microarray Database (YMD), or if you have many samples to submit, batch submission via Direct Deposit in SOFT format is likely the most convenient deposit route. Data not currently in YMD can be submitted to it specifically for SOFT format generation. YMD can generate the SOFT files required to upload data sets to GEO. The SOFT files are created semi-automatically from the GEO platform description for each array and from the information the user enters in the experiment and hybridization fields when data are uploaded and stored in the YMD electronic "lab book".
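To make the idea of semi-automatic SOFT generation concrete, here is a minimal sketch that turns one hybridization record into a SOFT sample block. It is not YMD's implementation: the function `sample_to_soft`, its parameters, and the column descriptions are hypothetical, and the attribute labels (`!Sample_title`, `!Sample_platform_id`, the table markers) are assumed from the SOFT conventions and should be checked against the SOFT guide, which requires more fields than are shown here. The platform lookup uses a subset of the Keck array table above.

```python
# Subset of the Keck array table: Keck array name -> GEO platform ID.
KECK_PLATFORMS = {
    "NIA15K": "GPL977",
    "OHU16K": "GPL989",
    "OHU21K": "GPL990",
    "OMM16K": "GPL986",
}


def sample_to_soft(name, title, array_name, values):
    """Build a SOFT sample block from (spot_id, value) pairs for one hybridization."""
    platform_id = KECK_PLATFORMS[array_name]  # KeyError if the array is not listed
    lines = [
        f"^SAMPLE = {name}",
        f"!Sample_title = {title}",
        f"!Sample_platform_id = {platform_id}",
        "#ID_REF = Spot identifier from the platform record",
        "#VALUE = Measured value for the spot",
        "!sample_table_begin",
        "ID_REF\tVALUE",
    ]
    lines += [f"{spot}\t{value}" for spot, value in values]
    lines.append("!sample_table_end")
    return "\n".join(lines) + "\n"


# Made-up hybridization; the values are placeholders, not real measurements.
soft_text = sample_to_soft("example_hyb_1", "Example hybridization",
                           "NIA15K", [("1", 0.25), ("2", -1.1)])
```

Looking up the platform ID from the array name means the user only records which Keck array was hybridized, and the GPL accession is filled in from the table.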