High-Throughput GoMiner Detailed Output Descriptions

A subdirectory is created within the data directory, and its name is based upon that of the total gene file. It contains two classes of objects:

  1. Several files that summarize the overall results from all of the changed gene files:
    • Simple summary of the results from each experiment (i.e., changed gene file) [TOTALGENESFILE.report]. Each line contains the name of the changed gene file and the number of categories whose FDR is less than or equal to a user-defined threshold. The lines are sorted in ascending order by this number of categories, so that the user can immediately ascertain which experiments are of interest for further analysis. Separate results are given for underexpressed, overexpressed, and changed genes for each changed gene file that was input in the two-column format.
    • Tab-delimited files containing a matrix whose rows are categories and whose columns are names of changed gene files [total.txt.change.series.CIM]. The rows are the union of categories whose FDR meets a user-defined threshold T for at least one of the changed gene files. Each entry is given as T - 0.9*FDR (or as 0 if T - FDR is negative). The matrix data are intended to be viewed in clustered image map (CIM) programs, or in Excel using the 3D columns visualization format. This transformation makes it easier to view the important categories (i.e., those with low FDR), since the eye is drawn to the higher 3D columns. For example, each experiment might represent a different time point in a time series. In this case, the Excel visualization would show the coming and going of categories that were important at different times during the experiment. The same tab-delimited files can be used as input to any one of several publicly available CIM programs in order to view the co-clustering of experiments and categories. The transformation of the FDR values again draws the eye to the important categories in the CIM representation. Categories are clustered together if their sets of changed and unchanged genes are similar.
    • Tab-delimited text files in gene category export (tvt) format that compare the total file against itself instead of against the changed file [TOTALGENESFILE.total.tvt]. Each row contains one pair of GO category and gene, for each gene in the total file that has a mapping.

  2. A subdirectory [CHANGEDGENESFILE.dir] corresponding to each of the changed gene files (for each file type mentioned, there is one for the underexpressed, overexpressed, and changed genes if the changed gene file was in the two-column format).
  3. Subdirectories for integration of all changed gene files [CHANGEALL.dir]. These allow the user to determine the specific genes that are in the categories of interest, integrated across all of the changed gene files, without needing to go back and search through a large number of individual gce files. They also permit analysis of this relationship by clustering.
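
The transformation used for the CIM matrix entries described above can be sketched in Python. This is only an illustration of the stated rule (T - 0.9*FDR, or 0 if T - FDR is negative), not the code that High-Throughput GoMiner itself runs. The function names, the in-memory dictionary layout, and the choice to treat a category that is absent from a changed gene file as having an FDR of 1.0 are all assumptions made for this example.

```python
def cim_entry(fdr, threshold):
    """Return the matrix entry for one category in one changed gene file."""
    # Categories that miss the threshold T are flattened to 0.
    if threshold - fdr < 0:
        return 0.0
    # Low FDR values map to tall columns, so important categories stand out.
    return threshold - 0.9 * fdr


def build_cim_matrix(fdr_by_file, threshold):
    """Build rows (categories) by columns (changed gene files).

    fdr_by_file is a hypothetical layout: {file name: {category: FDR}}.
    """
    files = sorted(fdr_by_file)
    # Rows are the union of categories that meet T in at least one file.
    categories = sorted({
        category
        for fdrs in fdr_by_file.values()
        for category, fdr in fdrs.items()
        if fdr <= threshold
    })
    matrix = {
        category: [
            # Assumption: a category missing from a file is treated as FDR 1.0.
            cim_entry(fdr_by_file[name].get(category, 1.0), threshold)
            for name in files
        ]
        for category in categories
    }
    return files, matrix


if __name__ == "__main__":
    # Made-up FDR values for two hypothetical time points.
    example = {
        "t1.txt": {"GO:A": 0.01, "GO:B": 0.50},
        "t2.txt": {"GO:A": 0.30, "GO:B": 0.60, "GO:C": 0.05},
    }
    columns, rows = build_cim_matrix(example, threshold=0.10)
    print("category", *columns, sep="\t")
    for category, values in rows.items():
        print(category, *(f"{v:.3f}" for v in values), sep="\t")
```

In this sketch a category that never meets T (GO:B in the made-up data) does not appear as a row at all, while a category that meets T in only one file appears with 0 in the other columns.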

Debug Information

If you used the debug option, then additional information about the execution of High-Throughput GoMiner is retained. The contents of the DEBUG file depend on the DEBUG parameter in the config file. The most reasonable setting is 2. A setting of 1 produces an excess of information that is unlikely to be useful. Only the most confident user should set 0, because error conditions might then go unnoticed.

The first thing to do after completion of a High-Throughput GoMiner run is to look at the bottom of the DEBUG file. If it says "NORMAL TERMINATION", you can be fairly sure that everything went well. You might still want to scroll through the whole file quickly and look for messages that went to STDERR but did not cause a problem that was picked up by the shell.

On the other hand, if there was a detectable problem, then the end of the DEBUG file is where that will be reported, since everything stops when a problem is detected. Usually, the error will be reported with an error code. You can work your way backwards (or upwards) to get some hints of where the problem specifically occurred, and either correct your dataset or config parameters, or send us a bug report.
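
The check of the bottom of the DEBUG file can be scripted. The following Python sketch is one possible way to do it, under several assumptions: the path to the DEBUG file is supplied by the caller (its location is not specified here), the "NORMAL TERMINATION" message appears within the last few non-empty lines, and the helper names are hypothetical. It does not interpret error codes.

```python
from pathlib import Path


def tail_lines(debug_path, count=5):
    """Return the last `count` non-empty lines of the DEBUG file."""
    text = Path(debug_path).read_text(errors="replace")
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return lines[-count:]


def terminated_normally(debug_path, count=5):
    """True if "NORMAL TERMINATION" appears near the bottom of the file."""
    return any("NORMAL TERMINATION" in line
               for line in tail_lines(debug_path, count))


def report(debug_path):
    """Print a short verdict, plus the tail of the file if the run failed."""
    if terminated_normally(debug_path):
        print("Run appears to have completed normally.")
    else:
        # The error is reported at the end of the file, so show the tail
        # as a starting point for working backwards.
        print("No NORMAL TERMINATION found; last lines of the DEBUG file:")
        for line in tail_lines(debug_path, count=20):
            print("  " + line)
```

A call such as report("path/to/DEBUG"), with the placeholder replaced by the real location of the file, gives only a quick first look. Scanning the whole file for STDERR output is still worthwhile.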

GoMiner™ is a development of the Genomics and Pharmacology Facility, Developmental Therapeutics Branch (DTB), Center for Cancer Research (CCR), National Cancer Institute (NCI).

We would like to hear from you. You can reach the team via email.
