Yale School of Medicine

W.M. KECK

Microarray: KECK

Spotted Arrays

GEO

To support the public use and dissemination of gene expression data, NCBI has launched the Gene Expression Omnibus, or GEO. GEO is a gene expression and hybridization array data repository, as well as an online resource for the retrieval of gene expression data from any organism or artificial source. Many types of gene expression data, from platform types such as nucleotide, antibody, and tissue arrays as well as serial analysis of gene expression (SAGE), will be accepted, accessioned, and archived as public data sets.

 

At the most basic level of organization of GEO, there are four entity types that may be supplied by users:

Submitter

Contains contact and authentication information about the submitter. A submitter entity may have relationships to many platforms, many samples, and many series.

Platform

Describes the list of elements on the array (e.g., cDNAs, oligonucleotide probesets, ORFs, antibodies) or the list of elements in that experiment. Each platform record is assigned a unique and stable GEO accession number (GPLxxx). A platform may reference many samples that have been submitted by multiple submitters.

Sample

Describes the conditions under which an individual sample was handled, the manipulations it underwent, and the measurement of each element derived from it. Each sample record is assigned a unique and stable GEO accession number (GSMxxx). A sample entity must reference only one platform and may be included in multiple series.

Series

Defines a set of related samples considered to be part of a group, how the samples are related, and if and how they are ordered. A series provides a focal point and description of the experiment as a whole. Series records may also contain tables describing extracted data, summary conclusions, or analyses. Each series record is assigned a unique and stable GEO accession number (GSExxx).
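The relationships among these entities can be sketched as a small data model. This is a minimal illustration, not GEO's own schema: the class names, the `accession_kind` helper, and the accessions GSM1 and GSE1 are hypothetical, and the only rules encoded are the ones stated above (a sample references exactly one platform and may belong to several series).

```python
import re
from dataclasses import dataclass, field

# Accession prefixes as described above: GPL = platform, GSM = sample, GSE = series.
ACCESSION_RE = re.compile(r"^(GPL|GSM|GSE)(\d+)$")
KIND_BY_PREFIX = {"GPL": "platform", "GSM": "sample", "GSE": "series"}


def accession_kind(accession: str) -> str:
    """Return 'platform', 'sample', or 'series' for a GEO-style accession."""
    match = ACCESSION_RE.match(accession)
    if match is None:
        raise ValueError(f"not a GPL/GSM/GSE accession: {accession!r}")
    return KIND_BY_PREFIX[match.group(1)]


@dataclass
class Sample:
    accession: str  # GSMxxx
    platform: str   # exactly one GPLxxx accession

    def __post_init__(self):
        if accession_kind(self.accession) != "sample":
            raise ValueError("a sample needs a GSM accession")
        if accession_kind(self.platform) != "platform":
            raise ValueError("a sample must reference a GPL accession")


@dataclass
class Series:
    accession: str  # GSExxx
    samples: list = field(default_factory=list)  # ordered list of Sample objects

    def add(self, sample: Sample) -> None:
        # The same Sample object may be added to any number of Series.
        self.samples.append(sample)


# Hypothetical accessions GSM1 and GSE1, hybridized on the Keck NIA15K platform.
example_series = Series("GSE1")
example_series.add(Sample("GSM1", "GPL977"))
```

Keeping the platform as a single field on `Sample`, rather than a list, is what enforces the "only one platform" rule in this sketch.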

 

Keck GEO information:

The following Keck arrays have been assigned GEO platform accession numbers (GPLs):

GEO Platform ID  Keck Array Name  Description
GPL977           NIA15K           Mouse 15K cDNA array - NIA Collection - 15,267 spots
GPL995           AR9.2K           Arabidopsis 9.2K cDNA array - MSU/Ohio Stock - 9,216 spots - RETIRED
GPL992           AR12K            Arabidopsis 12K cDNA array - MSU/Ohio Stock - 11,960 spots
GPL989           OHU16K           Human 16K 70mer oligo array - Operon - 16,659 spots
GPL990           OHU21K           Human 21K 70mer oligo array - Operon - 21,329 spots
GPL987           OMM13K           Mouse 13K 70mer oligo array - Operon - 13,443 spots - RETIRED
GPL986           OMM16K           Mouse 16K 70mer oligo array - Operon - 16,463 spots
GPL988           OAR27K           Arabidopsis 27K 70mer oligo array - Operon - 26,090 spots
GPL991           ORR4.8K          Rat 4.8K 65mer oligo array - Compugen - 4,854 spots

Data may be submitted to GEO by either Web Deposit or Direct Deposit; see the GEO Deposit Guide page.

Data from diverse sources must be converted to SOFT format for upload to GEO. SOFT (Simple Omnibus Format in Text) is an ASCII text format designed to be a machine-readable representation of data retrieved from, or submitted to, GEO. SOFT is also line-based, which makes it easy to parse with commonly available text processing and formatting languages. For a complete description of the SOFT format, see the SOFT guide.
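Because SOFT is line-based, a reader can be written as a single pass over the lines. The sketch below assumes the usual SOFT line prefixes (`^` for an entity line, `!` for an attribute, `#` for a data-table column description, and tab-delimited table rows otherwise); check these against the SOFT guide before relying on them. The `parse_soft` function is hypothetical, and the example record uses placeholder values, not the real contents of GPL977.

```python
def parse_soft(text):
    """Parse SOFT text into a list of entity dictionaries."""
    entities = []
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("^"):
            # Entity line, e.g. "^PLATFORM = GPL977".
            kind, _, ident = line[1:].partition("=")
            current = {"type": kind.strip(), "id": ident.strip(),
                       "attributes": {}, "columns": {}, "rows": []}
            entities.append(current)
        elif current is None:
            continue  # ignore anything before the first entity line
        elif line.startswith("!"):
            key, sep, value = line[1:].partition("=")
            if sep:  # lines without "=" are treated as table begin/end markers
                current["attributes"].setdefault(key.strip(), []).append(value.strip())
        elif line.startswith("#"):
            # Column description for the data table.
            name, _, description = line[1:].partition("=")
            current["columns"][name.strip()] = description.strip()
        else:
            # Table row; the first row collected is the column header row.
            current["rows"].append(line.split("\t"))
    return entities


# Placeholder record for illustration only.
EXAMPLE = (
    "^PLATFORM = GPL977\n"
    "!Platform_title = illustrative title only\n"
    "#ID = spot identifier\n"
    "#GB_ACC = GenBank accession\n"
    "!platform_table_begin\n"
    "ID\tGB_ACC\n"
    "1\tplaceholder-1\n"
    "2\tplaceholder-2\n"
    "!platform_table_end\n"
)

parsed = parse_soft(EXAMPLE)
```

Attribute values are stored as lists because the same attribute label may appear on more than one line of a record.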

  • Batch Direct Deposit in SOFT format.

If your data are already in a database, or if you have many samples to submit, it is likely that submission of data via Direct Deposit in SOFT format is the most convenient deposit route. This process was designed for rapid batch submission of data; SOFT files may be readily produced from common spreadsheet and database applications. A detailed SOFT guide is available for review.
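As a rough sketch of how a spreadsheet export might be turned into a SOFT sample record, the function below converts CSV text into SOFT lines. The function name `sample_to_soft`, the sample name, and the data values are hypothetical; the attribute labels (`!Sample_title`, `!Sample_platform_id`) and the `ID_REF`/`VALUE` columns reflect one reading of the SOFT guide and should be checked against it, since a real submission needs more fields than are shown here.

```python
import csv
import io


def sample_to_soft(name, title, platform_id, column_descriptions, csv_text):
    """Build a minimal SOFT sample record from CSV text with a header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    lines = [
        f"^SAMPLE = {name}",
        f"!Sample_title = {title}",
        f"!Sample_platform_id = {platform_id}",
    ]
    # One "#" line per table column, describing what the column holds.
    for column in header:
        lines.append(f"#{column} = {column_descriptions.get(column, '')}")
    lines.append("!sample_table_begin")
    lines.append("\t".join(header))
    for row in reader:
        if row:  # skip blank lines in the export
            lines.append("\t".join(row))
    lines.append("!sample_table_end")
    return "\n".join(lines) + "\n"


# Hypothetical two-spot export, hybridized on the Keck NIA15K array (GPL977).
soft_text = sample_to_soft(
    "hyb 1",
    "example hybridization",
    "GPL977",
    {"ID_REF": "spot identifier", "VALUE": "normalized log ratio"},
    "ID_REF,VALUE\n1,0.5\n2,-1.2\n",
)
```

Running the same function over each sample's export and concatenating the results gives one file for batch deposit.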

  • YMD: batch-mode request for data in GEO upload SOFT format.

If your data are already in a YMD database, or if you have many samples to submit, batch submission via Direct Deposit in SOFT format is likely the most convenient deposit route. If your data are not currently in a YMD database, they can be submitted specifically for SOFT format generation. The Yale Microarray Database (YMD) can generate the SOFT format files required for uploading data sets to GEO. SOFT files are created semi-automatically from the information supplied in the GEO Platform Description for each array and the information the user enters in the experiment and hybridization fields when data are uploaded and stored in the YMD electronic "lab book".