MICROARRAY DATA ANALYSIS OVERVIEW
Microarray technology is a powerful tool for medical and biological research
that allows the expression levels of thousands of genes to be monitored
simultaneously. Performing microarray experiments and obtaining the results is
not the end but just the beginning. Microarray experiments generate an
overwhelmingly large amount of data. To make sense of these data, one needs
sophisticated statistical software and tools. Many software packages for
analyzing microarray data have been developed by various sources, and we have
also developed some advanced statistical analysis methods in house. Appropriate
analysis of the results can provide in-depth understanding of gene regulation.
Microarray data can be analyzed using several approaches, depending on the
research goals. The most basic approach is the identification of differentially
expressed genes. Cluster analysis is widely used to identify groups of genes
with correlated patterns of expression. Classification methods have proven very
useful for identifying patterns of gene expression that correlate with
diagnostic classes and for classifying genes according to their functional
role. Based on clustering results and other information, the data can be
interpreted with respect to biological pathways. Other analyses, such as
functional annotation, can also be performed depending on the experimental
design and the scientific questions you want to address.
Identification of Differentially Expressed Genes
A natural first step in extracting information from microarray data is to
examine the genes whose expression differs significantly between individual
samples or between conditions. This simple technique is very efficient, for
example, in screens for drug targets. Because a microarray chip contains
thousands of genes, it is neither possible nor necessary to follow up on all of
them. Analyses such as the t-test, ANOVA, and other tests can identify which
genes show good evidence of being differentially expressed. The genes can first
be ranked by the strength of evidence for differential expression, from
strongest to weakest. A cut-off value can then be chosen to select a subset of
genes based on a given criterion, either statistical or biological. With this
simple differential analysis, the number of genes to follow up is reduced from
several thousand to hundreds or dozens. These are the candidate genes for
confirmation and further study.
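The rank-then-cut-off procedure described above can be sketched in a few lines
of Python. This is only a minimal illustration, not the method used in house:
it assumes two groups of samples (control and treated), uses Welch's t
statistic as the evidence measure, and runs on made-up data. The names
welch_t, rank_genes, expr and the cut-off of 5.0 are all hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def rank_genes(expr, n_control):
    """Rank genes by |t|, strongest evidence first.

    expr maps a gene name to a list of log-expression values,
    with the control samples listed before the treated samples.
    """
    scores = {}
    for gene, values in expr.items():
        control, treated = values[:n_control], values[n_control:]
        scores[gene] = welch_t(treated, control)
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical toy data: 3 control samples followed by 3 treated samples.
expr = {
    "geneA": [5.0, 5.2, 4.9, 8.1, 7.9, 8.3],
    "geneB": [6.1, 5.8, 6.0, 6.0, 6.2, 5.9],
    "geneC": [7.5, 7.7, 7.4, 5.1, 5.3, 4.9],
}

ranked = rank_genes(expr, n_control=3)

# Illustrative statistical cut-off on |t|; a biological criterion
# (for example a minimum fold change) could be used instead.
cutoff = 5.0
candidates = [gene for gene, t in ranked if abs(t) >= cutoff]

for gene, t in ranked:
    print(f"{gene}: t = {t:.2f}")
print("candidates:", candidates)
```

In a real analysis the statistic would usually be converted to a p-value and
adjusted for the fact that thousands of genes are tested at once; the sketch
leaves that step out to keep the ranking and cut-off logic visible.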
Cluster Analysis
Beyond the identification of differentially expressed genes, clustering genes
from multiple experiments into groups with similar expression patterns is
needed for further functional annotation and diagnostic classification. Genes
clustered in the same group share similar expression profiles, which suggests
that genes of unknown function may share the functions or pathways of the
groups they cluster in. Hierarchical clustering, nonhierarchical algorithms
such as k-means and self-organizing maps, and related methods such as principal
component analysis have been used to group genes with similar expression
patterns.
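As a rough illustration of how a nonhierarchical method groups genes, the
following sketch implements a bare-bones k-means in plain Python. It makes
several simplifying assumptions: Euclidean distance between expression
profiles (correlation-based distances are also common for gene expression),
a naive initialization that takes the first k profiles as starting centroids,
and a fixed number of iterations. The function name kmeans and the toy
profiles are hypothetical.

```python
import math


def kmeans(profiles, k, n_iter=20):
    """Very small k-means: returns (labels, centroids).

    profiles is a list of equal-length expression profiles, one per gene.
    Initialization is naive (first k profiles), so results depend on order.
    """
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(n_iter):
        # Assignment step: each gene joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in profiles
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical profiles across four conditions: two genes rise, two fall.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [4.0, 3.0, 2.0, 1.0],
    [1.1, 2.1, 2.9, 4.2],
    [3.9, 3.1, 2.2, 0.8],
]

labels, centroids = kmeans(profiles, k=2)
print(labels)
```

Genes that receive the same label have similar profiles under the chosen
distance; an uncharacterized gene landing in a cluster of well-annotated genes
is the kind of clue described above.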
 
Classification
Traditional disease classification is based mainly on morphology, pathology and
biochemistry. Classification at the gene expression level, however, has the
potential to be more accurate and more useful for diagnosis and treatment.
DNA microarray experiments generate thousands of gene expression measurements
from tissue and cell samples. The resulting expression profiles can be used to
discriminate between different types of tissue. They can also be used to
identify new subclasses within an existing phenotype class. This means that
microarray techniques may lead to a more complete understanding of the
molecular variation among individuals at the gene level. The challenge of
disease treatment is to target specific therapies to genetically distinct
disease types in order to maximize efficacy and minimize toxicity. Improvements
in classification have thus been crucial to advances in disease treatment.
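To make the idea of discriminating tissue types from expression profiles more
concrete, here is a small sketch of a nearest-centroid classifier in plain
Python. It is one simple technique chosen for illustration, not a method named
in this overview: each class is summarized by the mean profile of its training
samples, and a new sample is assigned to the class whose mean profile is
closest in Euclidean distance. The function names, class labels and data are
hypothetical.

```python
import math


def fit_centroids(samples, labels):
    """Compute the mean expression profile of each class."""
    groups = {}
    for sample, label in zip(samples, labels):
        groups.setdefault(label, []).append(sample)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }


def predict(centroids, sample):
    """Assign a sample to the class with the nearest mean profile."""
    return min(centroids, key=lambda label: math.dist(sample, centroids[label]))


# Hypothetical training set: each sample is a profile over three genes.
train_samples = [
    [8.0, 2.0, 5.0],
    [7.5, 2.5, 5.1],
    [2.0, 8.0, 5.0],
    [2.4, 7.6, 4.9],
]
train_labels = ["typeA", "typeA", "typeB", "typeB"]

centroids = fit_centroids(train_samples, train_labels)

print(predict(centroids, [7.0, 3.0, 5.0]))
print(predict(centroids, [3.0, 7.0, 5.0]))
```

In practice the genes fed to such a classifier would usually be a selected
subset, for example the differentially expressed candidates, and performance
would be checked on samples that were not used for training.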