W.M. Keck Facility
 Yale University
 300 George Street

Yale University School of Medicine



 GENE EXPRESSION OMNIBUS

From the GEO site:

"Microarrays are a significant advance in molecular biology both because they are able to assay a very large number of genes or sequences and because of their requirement for relatively small amount of starting RNA. Today, proficiency in generating data is fast overcoming the capacity for storing and analyzing it. Much of this information is scattered across the Internet or is not even available to the public. As more laboratories acquire this technology, the problem will only get worse. This avalanche of data requires standardization of storage, sharing, and publishing techniques.

In order to support the public use and dissemination of gene expression data, NCBI has launched the Gene Expression Omnibus, or GEO. GEO is a gene expression and hybridization array data repository, as well as an online resource for the retrieval of gene expression data from any organism or artificial source. Many types of gene expression data from platform types such as nucleotide, antibody and tissue arrays and serial analysis of gene expression (SAGE) data, will be accepted, accessioned, and archived as public data sets".

For answers to common questions, see the GEO General FAQ.

At the most basic level of organization of GEO, there are four entity types that may be supplied by users:

Submitter

Contains contact and authentication information about the submitter. A submitter entity may have relationships to many platforms, many samples, and many series.

Platform

Describes the list of elements on the array (e.g., cDNAs, oligonucleotide probesets, ORFs, antibodies) or the list of elements in that experiment. Each platform record is assigned a unique and stable GEO accession number (GPLxxx). A platform may reference many samples that have been submitted by multiple submitters.

Sample

Describes the conditions under which an individual sample was handled, the manipulations it underwent, and the measurement of each element derived from it. Each sample record is assigned a unique and stable GEO accession number (GSMxxx). A sample entity must reference only one platform and may be included in multiple series.

Series

Defines a set of related samples considered to be part of a group, how the samples are related, and if and how they are ordered. A series provides a focal point and description of the experiment as a whole. Series records may also contain tables describing extracted data, summary conclusions, or analyses. Each series record is assigned a unique and stable GEO accession number (GSExxx).
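
The relationships among these entity types can be sketched as a small data model. This is only an illustration of the rules stated above (a sample references exactly one platform, a series groups samples, and accession numbers carry a GPL, GSM, or GSE prefix); the class and field names are assumptions made for this example and are not GEO's own schema, and the sample and series accessions shown are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Accession prefixes named in the text: GPL (platform), GSM (sample), GSE (series).
ACCESSION_RE = re.compile(r"^(GPL|GSM|GSE)\d+$")
KIND_BY_PREFIX = {"GPL": "platform", "GSM": "sample", "GSE": "series"}


def accession_kind(accession: str) -> str:
    """Return 'platform', 'sample', or 'series' for an accession string."""
    match = ACCESSION_RE.match(accession)
    if match is None:
        raise ValueError(f"not a GPL/GSM/GSE accession: {accession!r}")
    return KIND_BY_PREFIX[match.group(1)]


@dataclass
class Platform:
    accession: str                      # e.g. "GPL993"
    submitter: str                      # hypothetical field: who deposited it
    elements: List[str] = field(default_factory=list)


@dataclass
class Sample:
    accession: str
    submitter: str
    platform: Platform                  # exactly one platform per sample


@dataclass
class Series:
    accession: str
    submitter: str
    samples: List[Sample] = field(default_factory=list)


# Hypothetical records; only GPL993 appears on this page.
platform = Platform("GPL993", "keck", ["spot_1", "spot_2"])
sample = Sample("GSM1", "some_lab", platform)
series = Series("GSE1", "some_lab", [sample])
```

Because a sample holds a single platform reference while a series holds a list of samples, the same sample object can appear in more than one series, which matches the description above.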

Keck GEO information:

The following Keck arrays have been assigned GEO Platform IDs (GPL accession numbers):

GEO Platform ID       Keck Array Name  Description
GPL993                HU4.6K           Human 4.6K cDNA array - Research Genetics - 4,608 spots
No Platform GPL       MEPC3.4K         Mouse Endocrine Pancreas Consortium 3.4K cDNA array - 3,400 spots - RETIRED
No Platform GPL       MM4.6K           Mouse 4.6K cDNA array - Research Genetics - 4,608 spots - RETIRED
GPL977                NIA15K           Mouse 15K cDNA array - NIA Collection - 15,267 spots
GPL995                AR9.2K           Arabidopsis 9.2K cDNA array - MSU/Ohio Stock - 9,216 spots - RETIRED
GPL992                AR12K            Arabidopsis 12K cDNA array - MSU/Ohio Stock - 11,960 spots
GPL989                OHU16K           Human 16K 70mer oligo array - Operon - 16,659 spots
GPL990                OHU21K           Human 21K 70mer oligo array - Operon - 21,329 spots
Platform GPL Pending  OHU28K           Human 28K 70mer oligo array - Operon
GPL987                OMM13K           Mouse 13K 70mer oligo array - Operon - 13,443 spots - RETIRED
GPL986                OMM16K           Mouse 16K 70mer oligo array - Operon - 16,463 spots
Platform GPL Pending  OMM25K           Mouse 25K 70mer oligo array - Operon
GPL988                OAR27K           Arabidopsis 27K 70mer oligo array - Operon - 26,090 spots
GPL991                ORR4.8K          Rat 4.8K 65mer oligo array - Compugen - 4,854 spots

Data may be submitted to GEO by either Web Deposit or Direct Deposit; see the GEO Deposit Guide page.

Data from diverse sources must be converted to SOFT format for upload to GEO. SOFT (Simple Omnibus Format in Text) is an ASCII text format designed to be a machine-readable representation of data retrieved from, or submitted to, GEO. SOFT is also a line-based format, which makes it easy to parse with commonly available text-processing and formatting languages. For a complete description of SOFT format, see the SOFT guide.
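
To illustrate why a line-based format is easy to parse, here is a minimal sketch of a SOFT reader. It assumes the line-prefix conventions described in the SOFT guide rather than on this page: lines starting with `^` open an entity, `!` lines carry attributes, `#` lines describe data table columns, and any other non-empty line is a tab-delimited table row. The function name `parse_soft` and the example record are hypothetical, and the sketch does not validate content.

```python
def parse_soft(text):
    """Split SOFT text into a list of entity dicts.

    Each dict has: type, name, attributes (label -> list of values),
    columns (column name -> description), and rows (lists of strings).
    """
    entities = []
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue                      # skip blank lines
        if line.startswith("^"):          # entity indicator line
            key, _, value = line[1:].partition("=")
            current = {"type": key.strip(), "name": value.strip(),
                       "attributes": {}, "columns": {}, "rows": []}
            entities.append(current)
        elif current is None:
            continue                      # ignore text before the first entity
        elif line.startswith("!"):        # attribute line (labels may repeat)
            key, _, value = line[1:].partition("=")
            current["attributes"].setdefault(key.strip(), []).append(value.strip())
        elif line.startswith("#"):        # data table column description
            key, _, value = line[1:].partition("=")
            current["columns"][key.strip()] = value.strip()
        else:                             # tab-delimited data table row
            current["rows"].append(line.split("\t"))
    return entities


# Hypothetical sample record, for illustration only.
EXAMPLE = "\n".join([
    "^SAMPLE = example_sample_1",
    "!Sample_title = hypothetical sample",
    "!Sample_platform_id = GPL993",
    "#ID_REF = spot identifier",
    "#VALUE = normalized signal",
    "!Sample_table_begin",
    "ID_REF\tVALUE",
    "1\t0.52",
    "2\t1.31",
    "!Sample_table_end",
])

entities = parse_soft(EXAMPLE)
```

In this sketch the table begin and end markers are stored as attributes with empty values, and the column header row is kept as the first entry in `rows`; a fuller reader would treat both specially.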

  • Batch Direct Deposit in SOFT format.

If your data are already in a database, or if you have many samples to submit, it is likely that submission of data via Direct Deposit in SOFT format is the most convenient deposit route. This process was designed for rapid batch submission of data; SOFT files may be readily produced from common spreadsheet and database applications. A detailed SOFT guide and examples of SOFT documents are available for review.

  • YMD - Batch mode request for data in GEO upload SOFT format.

If your data are already in a YMD database, or if you have many samples to submit, batch submission via Direct Deposit in SOFT format is likely the most convenient deposit route. Data not currently in a YMD database can be submitted specifically for SOFT format generation. YMD (Yale Microarray Database) can generate the SOFT format files required to upload data sets to GEO. SOFT files are created semi-automatically from the information supplied in the GEO Platform Description for each array, together with the information the user enters in the experiment and hybridization fields when data are uploaded and stored in the YMD electronic "lab book". Please contact Janet Hager for details.
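
As a rough illustration of how a SOFT sample record might be assembled from tabular data exported from a spreadsheet or database, consider the sketch below. It is not how YMD generates its files. The function `to_soft_sample`, its parameters, and the example values are hypothetical, and the `^SAMPLE`, `!Sample_...`, and `#column` line conventions are assumed from the SOFT guide; consult that guide for the attributes GEO actually requires.

```python
def to_soft_sample(name, attributes, column_descriptions, rows):
    """Build SOFT text for one sample.

    attributes: label suffix -> value, written as "!Sample_<suffix> = <value>"
    column_descriptions: column name -> description, in table column order
    rows: list of dicts keyed by column name
    """
    lines = [f"^SAMPLE = {name}"]
    for key, value in attributes.items():
        lines.append(f"!Sample_{key} = {value}")
    for column, description in column_descriptions.items():
        lines.append(f"#{column} = {description}")
    columns = list(column_descriptions)
    lines.append("!Sample_table_begin")
    lines.append("\t".join(columns))                     # header row
    for row in rows:
        lines.append("\t".join(str(row[c]) for c in columns))
    lines.append("!Sample_table_end")
    return "\n".join(lines) + "\n"


# Hypothetical values, for illustration only.
soft_text = to_soft_sample(
    "example_sample_1",
    {"title": "hypothetical sample", "platform_id": "GPL993"},
    {"ID_REF": "spot identifier", "VALUE": "normalized signal"},
    [{"ID_REF": 1, "VALUE": 0.52}, {"ID_REF": 2, "VALUE": 1.31}],
)
```

Keeping the column descriptions in a single ordered mapping means the `#` description lines and the table header row cannot drift out of sync.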

 


Copyright © 2002, Yale University, New Haven, Connecticut, USA. All rights reserved.

Last modified: 21-Feb-2007