Department of Biostatistics, Bioinformatics and Biomathematics; Center for Information Technology; Genomics and Bioinformatics Group

AffyProbeMiner

The problem:
Between the time the probes for a given short oligonucleotide chip were designed, and the time an analysis is made, the knowledge of expected transcripts for a given organism might have changed. Unless one includes the latest development in transcripts into an analysis, the analysis could suffer from what we like to call a Dorian Gray effect. The chip itself does not change, which means that the probes and their respective sequences remain the same, while the knowledge of the transcripts, and eventually their sequence, might evolve, and in time the immobility of the probe and probe sets give an uglier picture of the biological phenomena to study.
-- Laurent Gautier [in Alternative CDF environments]

The solution:
AffyProbeMiner consists of a set of Perl programs to

  • generate a collection of complete coding sequences (CCDSs) composed of
    • RefSeq records with accessions (e.g. NM_012345)
    • CCDSs in GenBank
  • regroup the probes on Affymetrix chips into probe sets, so that the probes in each probe set map to a consistent set of CCDSs

The practical consequence of the improved grouping and mapping is that the individual probes in a probe set are consistent: they hybridize to the same mRNA transcript (or set of mRNA transcripts) and therefore measure the same entity (or entities).
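The idea of a consistent probe set can be illustrated with a minimal Python sketch. This is not the AffyProbeMiner Perl code, and the rule it uses (probes share a probe set only if they match exactly the same set of transcripts) is an assumption made for illustration. The function name `regroup_probes`, the probe identifiers, and the accessions are all hypothetical placeholders.

```python
from collections import defaultdict


def regroup_probes(probe_to_transcripts):
    """Group probes whose sets of matching transcripts are identical.

    probe_to_transcripts: dict mapping a probe id to an iterable of
    transcript accessions that the probe matches.
    Returns a dict mapping each distinct transcript set (a frozenset)
    to the sorted list of probe ids that match exactly that set.
    Probes that match no transcript are left out.
    """
    groups = defaultdict(list)
    for probe_id, transcripts in probe_to_transcripts.items():
        key = frozenset(transcripts)  # order of accessions does not matter
        if not key:
            continue  # probe matched nothing, so it joins no probe set
        groups[key].append(probe_id)
    return {key: sorted(ids) for key, ids in groups.items()}


# Hypothetical probe-to-transcript matches (placeholder ids and accessions).
example_hits = {
    "probe_1": ["NM_000001"],
    "probe_2": ["NM_000001"],
    "probe_3": ["NM_000001", "NM_000002"],
    "probe_4": ["NM_000002", "NM_000001"],
    "probe_5": [],
}

probe_sets = regroup_probes(example_hits)
for transcripts, probes in sorted(probe_sets.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(transcripts), probes)
```

In this sketch, probes 1 and 2 form one probe set and probes 3 and 4 form another, because each pair matches the same set of transcripts; probe 5 matches nothing and is dropped.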

AffyProbeMiner Features
Relationship of original and remapped probe sets

Download Remapped CDF
 
CDFs generated for all Affymetrix chips are available in Affymetrix CDF format or in Bioconductor R package format for download.
 
 
 
Web Server
 
A file of probe sequences in FASTA format can be uploaded, and remapped CDFs will be generated from the CCDS database.
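Before uploading, it can help to check that a probe file parses as FASTA. The sketch below is a generic FASTA reader in Python, not part of AffyProbeMiner; the header layout the web server expects is not stated here, so the probe names and sequences in the example are hypothetical placeholders.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream.

    A record starts at a line beginning with '>' and its sequence is the
    concatenation of the following lines up to the next '>' line.
    Blank lines are skipped and sequences are upper-cased.
    """
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical two-probe file held in memory (placeholder names and sequences).
example = io.StringIO(
    ">probe_1\nACGTACGTACGTACGTACGTACGTA\n"
    ">probe_2\nttgcattgcattgcattgcattgca\n"
)
records = list(read_fasta(example))
for name, sequence in records:
    print(name, len(sequence), sequence)
```

For a file on disk, the same function can be given an open file object, for example `with open(path) as fh: records = list(read_fasta(fh))`, where `path` is whatever file you intend to upload.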
 
 
 
Download Remapping Software
 
All Perl programs are available for download.
 Download Excel spreadsheet to rotate the 3D histogram




AffyProbeMiner is a joint development of the Genomics and Bioinformatics Group, Laboratory of Molecular Pharmacology (LMP), Center for Cancer Research (CCR), National Cancer Institute (NCI), and the Department of Biostatistics, Bioinformatics, and Biomathematics, Georgetown University Medical Center. If you have any problems, questions, or feedback on the tool, please email us.
