GENE EXPRESSION OMNIBUS
From the GEO site:
"Microarrays are a significant advance in molecular biology both
because they are able to assay a very large number of genes or sequences and
because of their requirement for relatively small amount of starting RNA. Today,
proficiency in generating data is fast overcoming the capacity for storing and
analyzing it. Much of this information is scattered across the Internet or is
not even available to the public. As more laboratories acquire this technology,
the problem will only get worse. This avalanche of data requires standardization
of storage, sharing, and publishing techniques. In order to support the public use and
dissemination of gene expression data, NCBI has launched the Gene
Expression Omnibus, or GEO. GEO is a gene expression and
hybridization array data repository, as well as an online resource for the
retrieval of gene expression data from any organism or artificial source. Many
types of gene expression data from platform types such as nucleotide, antibody
and tissue arrays and serial analysis of gene expression (SAGE) data, will be
accepted, accessioned, and archived as public data sets".
Link to FAQs - GEO General FAQ
At the most basic level of organization of GEO, there are four entity types
that may be supplied by users:
| Submitter | Contains contact and authentication information about the submitter. A submitter entity may have relationships to many platforms, many samples, and many series. |
| Platform | Describes the list of elements on the array (e.g., cDNAs, oligonucleotide probesets, ORFs, antibodies) or the list of elements in that experiment. Each platform record is assigned a unique and stable GEO accession number (GPLxxx). A platform may reference many samples that have been submitted by multiple submitters. |
| Sample | Describes the conditions under which an individual sample was handled, the manipulations it underwent, and the measurement of each element derived from it. Each sample record is assigned a unique and stable GEO accession number (GSMxxx). A sample entity must reference only one platform and may be included in multiple series. |
| Series | Defines a set of related samples considered to be part of a group, how the samples are related, and if and how they are ordered. A series provides a focal point and description of the experiment as a whole. Series records may also contain tables describing extracted data, summary conclusions, or analyses. Each series record is assigned a unique and stable GEO accession number (GSExxx). |
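The relationships among the four entity types can be sketched as a small data model. This is only an illustration of the rules stated above (one platform per sample, a sample in many series, a submitter linked to many of each); the class and function names are assumptions made for this sketch and are not part of GEO.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Accession prefixes named in the text: GPL (platform), GSM (sample), GSE (series).
ACCESSION_RE = re.compile(r"^(GPL|GSM|GSE)(\d+)$")
KINDS = {"GPL": "platform", "GSM": "sample", "GSE": "series"}


def accession_kind(accession: str) -> str:
    """Return 'platform', 'sample', or 'series' for a GEO accession string."""
    match = ACCESSION_RE.match(accession)
    if match is None:
        raise ValueError(f"not a GPL/GSM/GSE accession: {accession!r}")
    return KINDS[match.group(1)]


@dataclass
class Platform:
    accession: str                                      # GPLxxx
    elements: List[str] = field(default_factory=list)   # e.g. cDNAs, ORFs


@dataclass
class Sample:
    accession: str       # GSMxxx
    platform: Platform   # a sample references exactly one platform


@dataclass
class Series:
    accession: str                                        # GSExxx
    samples: List[Sample] = field(default_factory=list)   # ordered group


@dataclass
class Submitter:
    name: str
    platforms: List[Platform] = field(default_factory=list)
    samples: List[Sample] = field(default_factory=list)
    series: List[Series] = field(default_factory=list)


# Usage: GSM1, GSE1 and GSE2 are made-up placeholder accessions.
platform = Platform("GPL993")
sample = Sample("GSM1", platform)
series_a = Series("GSE1", [sample])
series_b = Series("GSE2", [sample])   # the same sample may appear in several series
```

Holding the platform as a single field on `Sample`, rather than a list, is one simple way to make the "only one platform" rule hard to violate.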
Keck GEO information:
The following Keck arrays have been assigned GEO platform accession numbers
(GPLs):
| GEO Platform ID | Keck Array Name | Description |
| GPL993 | HU4.6K | Human 4.6K cDNA array - Research Genetics - 4,608 spots |
| No Platform GPL | MEPC3.4K | Mouse Endocrine Pancreas Consortium 3.4K cDNA array - 3,400 spots - RETIRED |
| No Platform GPL | MM4.6K | Mouse 4.6K cDNA array - Research Genetics - 4,608 spots - RETIRED |
| GPL977 | NIA15K | Mouse 15K cDNA array - NIA Collection - 15,267 spots |
| GPL995 | AR9.2K | Arabidopsis 9.2K cDNA array - MSU/Ohio Stock - 9,216 spots - RETIRED |
| GPL992 | AR12K | Arabidopsis 12K cDNA array - MSU/Ohio Stock - 11,960 spots |
| GPL989 | OHU16K | Human 16K 70mer oligo array - Operon - 16,659 spots |
| GPL990 | OHU21K | Human 21K 70mer oligo array - Operon - 21,329 spots |
| Platform GPL Pending | OHU28K | Human 28K 70mer oligo array - Operon |
| GPL987 | OMM13K | Mouse 13K 70mer oligo array - Operon - 13,443 spots - RETIRED |
| GPL986 | OMM16K | Mouse 16K 70mer oligo array - Operon - 16,463 spots |
| Platform GPL Pending | OMM25K | Mouse 25K 70mer oligo array - Operon |
| GPL988 | OAR27K | Arabidopsis 27K 70mer oligo array - Operon - 26,090 spots |
| GPL991 | ORR4.8K | Rat 4.8K 65mer oligo array - Compugen - 4,854 spots |
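When preparing a submission script, the mapping from Keck array name to GEO platform accession can be kept in a small lookup table. The sketch below copies the values from the table above; the names `KECK_PLATFORMS` and `platform_for` are hypothetical, and `None` is an assumed convention for arrays listed as "No Platform GPL" or "Platform GPL Pending".

```python
from typing import Optional

# Keck array name -> GEO platform accession, copied from the table above.
# None marks arrays with no GPL assigned or with a GPL still pending.
KECK_PLATFORMS = {
    "HU4.6K": "GPL993",
    "MEPC3.4K": None,   # No Platform GPL (retired)
    "MM4.6K": None,     # No Platform GPL (retired)
    "NIA15K": "GPL977",
    "AR9.2K": "GPL995",
    "AR12K": "GPL992",
    "OHU16K": "GPL989",
    "OHU21K": "GPL990",
    "OHU28K": None,     # Platform GPL pending
    "OMM13K": "GPL987",
    "OMM16K": "GPL986",
    "OMM25K": None,     # Platform GPL pending
    "OAR27K": "GPL988",
    "ORR4.8K": "GPL991",
}


def platform_for(array_name: str) -> Optional[str]:
    """Return the GPL accession for a Keck array, or None if none is assigned.

    Raises KeyError for names that are not in the table.
    """
    return KECK_PLATFORMS[array_name.strip().upper()]
```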
Data may be submitted to GEO by either Web or Direct Deposit; see the
GEO Deposit Guide page.
Data from diverse sources must be converted to SOFT format for upload to GEO.
SOFT (Simple Omnibus Format in Text) is an ASCII text format that was
designed to be a machine readable representation of data retrieved from, or
submitted to, GEO. SOFT is also a line-based format, making it easy to parse
using commonly available text processing and formatting languages. For a
complete description of SOFT format, see the SOFT guide.
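Because SOFT is line-based, a reader can dispatch on the first character of each line. The following is a minimal sketch rather than a complete parser: it assumes the common SOFT conventions (`^` starts an entity, `!` introduces an attribute, `#` describes a data-table column, and other lines are tab-delimited table rows), which should be checked against the SOFT guide. The function name `parse_soft` and the returned dictionary layout are assumptions made for this example.

```python
from typing import Any, Dict, List


def parse_soft(text: str) -> List[Dict[str, Any]]:
    """Group the lines of a SOFT document into one dictionary per entity."""
    entities: List[Dict[str, Any]] = []
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith("^"):
            # Entity indicator, e.g. "^PLATFORM = GPL993"
            kind, _, value = line[1:].partition("=")
            current = {
                "type": kind.strip().upper(),
                "id": value.strip(),
                "attributes": {},   # label -> list of values
                "columns": {},      # column name -> description
                "rows": [],         # tab-split table rows, header first
            }
            entities.append(current)
        elif current is None:
            raise ValueError(f"content before first entity line: {line!r}")
        elif line.startswith("!"):
            label, sep, value = line[1:].partition("=")
            if not sep:
                continue  # markers without a value, e.g. table begin/end
            current["attributes"].setdefault(label.strip(), []).append(value.strip())
        elif line.startswith("#"):
            label, _, value = line[1:].partition("=")
            current["columns"][label.strip()] = value.strip()
        else:
            current["rows"].append(line.split("\t"))
    return entities


# Usage with a tiny made-up document (title and table rows are illustrative only).
example = (
    "^PLATFORM = GPL993\n"
    "!Platform_title = Example array\n"
    "#ID = spot identifier\n"
    "!platform_table_begin\n"
    "ID\tNAME\n"
    "1\tfoo\n"
    "!platform_table_end\n"
)
parsed = parse_soft(example)
```

Attribute values are stored as lists because a label may be repeated within one entity.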
- Batch Direct Deposit in SOFT format.
If your data are already in a database, or if you have many samples to
submit, it is likely that submission of data via Direct Deposit in SOFT
format is the most convenient deposit route. This process was designed for
rapid batch submission of data; SOFT files may be readily produced from
common spreadsheet and database applications. A detailed SOFT guide and
examples of SOFT documents are available for review.
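As an illustration of producing SOFT from spreadsheet output, the sketch below turns CSV text (as exported from a spreadsheet) into a SOFT-style entity block. It is a minimal example under stated assumptions, not the YMD or GEO implementation: the function name `csv_to_soft` is hypothetical, the line prefixes and `_table_begin`/`_table_end` markers follow common SOFT usage, and the attributes GEO actually requires are listed in the SOFT guide and are not reproduced here.

```python
import csv
import io
from typing import Dict


def csv_to_soft(entity_type: str, entity_id: str,
                attributes: Dict[str, str],
                column_descriptions: Dict[str, str],
                csv_text: str) -> str:
    """Render one entity and its data table as SOFT-style text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    header = reader.fieldnames or []

    lines = [f"^{entity_type.upper()} = {entity_id}"]
    for label, value in attributes.items():
        lines.append(f"!{label} = {value}")
    for column in header:
        # One description line per table column; blank if none was supplied.
        lines.append(f"#{column} = {column_descriptions.get(column, '')}")
    lines.append(f"!{entity_type.lower()}_table_begin")
    lines.append("\t".join(header))
    for row in rows:
        lines.append("\t".join(row[column] for column in header))
    lines.append(f"!{entity_type.lower()}_table_end")
    return "\n".join(lines) + "\n"


# Usage: all values below are made up for illustration.
soft_text = csv_to_soft(
    "SAMPLE",
    "my sample",
    {"Sample_title": "demo hybridization"},
    {"ID_REF": "spot identifier", "VALUE": "normalized ratio"},
    "ID_REF,VALUE\n1,0.5\n2,1.25\n",
)
```

Building the whole block as a list of lines and joining once keeps the output strictly line-based, which matches how SOFT is meant to be read.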
If your data are already in the YMD database, or if you have many samples to
submit, batch submission of data via Direct Deposit in SOFT format is likely
the most convenient deposit route. If your data are not currently in the YMD
database, they can be submitted specifically for SOFT format generation.
YMD (Yale Microarray Database) can generate the SOFT format files that are
required for upload of data sets to GEO. SOFT files are created
semi-automatically, based on the information supplied in the GEO Platform
Description for each array and the information entered by the user in the
experiment and hybridization fields when data are uploaded and stored in the
YMD electronic "lab book".
Please contact Janet Hager for details.