| Application Build: 246 Database Build: 2008-04 |
Many builds of the GO MySQL database have a problem with some of the data from PDB. A discussion of this problem suggests that it started in October 2003. One manifestation is that the symbols field of the gene_product table contains more than 17,000 entries for ‘A’, whereas a given identifier typically has only 1 or 2 entries. The problem is being corrected for future builds of the GO database.
We first noticed the problem when GoMiner choked while trying to process 17,000 entries for the same gene name. It also turns out that A really is a valid gene name. The current release of GoMiner checks the target database for evidence of the parsing problem and, if it finds any, applies a filter. The filter ignores gene symbols in the GO database that are of length 1 and come from PDB. This preserves GoMiner's ability to analyze the handful of legitimate short gene symbols while avoiding the erroneous entries caused by the parsing problem.
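The filtering rule can be sketched roughly as follows. This is only an illustration in Python, not GoMiner's actual code: the `GeneProduct` record, its `source_db` field, the `SUSPICIOUS_COUNT` cutoff, and both function names are assumptions made for the example, and the way GoMiner really detects the problem may differ.

```python
from collections import Counter
from typing import List, NamedTuple


class GeneProduct(NamedTuple):
    """Hypothetical in-memory stand-in for one gene_product entry."""
    symbol: str
    source_db: str  # assumed field naming the contributing database, e.g. "PDB"


# Assumed cutoff: a healthy database has 1 or 2 entries per identifier,
# so a count far above that is treated as evidence of the parsing problem.
SUSPICIOUS_COUNT = 1000


def has_parsing_problem(rows: List[GeneProduct]) -> bool:
    """Return True if any single-character PDB symbol is heavily duplicated."""
    counts = Counter(
        r.symbol for r in rows if r.source_db == "PDB" and len(r.symbol) == 1
    )
    return any(n >= SUSPICIOUS_COUNT for n in counts.values())


def filter_symbols(rows: List[GeneProduct]) -> List[GeneProduct]:
    """Drop length-1 PDB symbols, but only when the database looks affected."""
    if not has_parsing_problem(rows):
        return rows
    return [r for r in rows if not (r.source_db == "PDB" and len(r.symbol) == 1)]


# Illustrative data: 17,000 erroneous PDB entries for "A", plus two entries
# from another (made-up here) source that should survive the filter.
rows = [GeneProduct("A", "PDB")] * 17000 + [
    GeneProduct("A", "UniProt"),
    GeneProduct("TP53", "UniProt"),
]
kept = filter_symbols(rows)
```

In this sketch the legitimate single-letter symbol from the non-PDB source is retained, and a database that shows no sign of the problem is passed through unchanged.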
We would like to hear from you. You can reach the team via email.
GoMiner was originally developed jointly by the Genomics and Bioinformatics Group (GBG) of LMP, NCI, NIH and the Medical Informatics and Bioimaging group of BME, Georgia Tech/Emory University. It is now maintained and under continuing development by GBG.