Genomics and Bioinformatics Group

Microarray Data Analysis


An (Opinionated) Guide to Microarray Data Analysis

Mark Reimers

National Cancer Institute, and Karolinska Institute, Dept. of Biosciences

The aim of these pages is to set out the author’s opinions about the best ways to deal with the most common issues in microarray data analysis. The opinions are based on experience with dozens of collaborating lab scientists and on discussions with microarray statisticians. Contributions, disputes, and opinions are welcome (and may be posted!). These pages are intended primarily for people working in a microarray facility, and the focus is on practical biological issues. There is no attempt to cover all the elaborate data mining algorithms that have been developed. However, I hope that statisticians and programmers will also find interesting material here.

This site is organized as follows:

Experimental design

This section discusses how many replicates are needed, whether to pool samples, and designs for two color arrays.

Pre-processing of Spotted Arrays

This section discusses steps along the way from quantification of the image, through background correction, quality control, and normalization, to obtain reliable estimates of relative gene abundance in the samples.

Pre-processing of Affymetrix Chips

The Affymetrix multiple-probe system poses a unique set of statistical challenges before one even gets to interpreting biological meaning. This section discusses image quantification and background, MAS5.0, normalization, multi-chip algorithms for estimation, and model-based quality control.

Exploratory Analysis

A preliminary examination of the data can confirm whether the groups are homogeneous. In addition, many studies aim to find previously unknown co-regulated genes. This section discusses how to achieve both goals using multi-dimensional scaling and clustering.

Statistical Tests for Identifying Differentially Expressed Genes 

This section discusses methods and problems in determining which genes are differentially expressed between groups of samples, covering t-tests, multiple-testing corrections, false discovery rate, empirical Bayes methods, and analysis of variance.

For descriptions of the technology, see the Nature Genetics Chipping Forecast supplements.

 

