GoMiner™ Command Line Tool

System Requirements

Command Line Instructions

Parameter Flag Expected Values Example (Windows Format)
Total Gene File -t Absolute or relative path to file containing the list of all of the genes, typically all of the genes on your microarray. If you have no total file, you can type GENERATE for the system to generate a total file (please be sure to specify an organism and preferably a data source) -t C:\MyFiles\AllMyGenes.txt or GENERATE
Changed-genes file -c (or) -h -c specifies a single changed-gene file. A changed-gene file should contain the list of all of the changed genes to be analyzed, typically from a microarray experiment.
-h specifies an input file containing a list of changed-gene files.
-c C:\MyFiles\ChangedGenesExp1.txt or -h C:\MyFiles\ChangedGenesFileList.txt

Note: "ChangedGenesFileList.txt" contains a list of changed-gene files, for example:

C:\MyFiles\ChangedGenesExp1.txt
C:\MyFiles\ChangedGenesExp2.txt
C:\MyFiles\ChangedGenesExp3.txt
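
A file list like the one above can be generated rather than typed by hand. The following is a minimal Python sketch, assuming the changed-gene files sit in one directory and share a ChangedGenesExp*.txt naming pattern. The function name write_file_list and the default pattern are illustrative only and are not part of GoMiner.

```python
from pathlib import Path


def write_file_list(gene_dir, list_path, pattern="ChangedGenesExp*.txt"):
    """Write one absolute changed-gene file path per line, for use with -h."""
    # Sort so the list is stable from run to run.
    files = sorted(Path(gene_dir).resolve().glob(pattern))
    Path(list_path).write_text("".join(f"{f}\n" for f in files))
    return files
```

Calling write_file_list with your own directory and output file name produces a list file that can then be passed to -h.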
Database -d JDBC URL for the database. This is the connection information for either the discover server or your own database, if you have configured a local one. -d jdbc:mysql://discover.nci.nih.gov:1521/GEEVS
JDBC Driver -j Java class name of the JDBC driver for the database you are accessing, whether that is the discover server or a local database you have configured. The two supported drivers are:
  • com.mysql.jdbc.Driver
  • org.apache.derby.jdbc.EmbeddedDriver
-j com.mysql.jdbc.Driver
Results Directory -r A relative or absolute path to a directory into which the results will be placed. If the directory does not already exist, one will be created. -r C:\MyFiles\MyResults
Data Source -s Which data source(s) should be used for making GO associations. The available options are either all or a semi-colon delimited list of specific data source(s).
NOTE: On some Unix shells, it is necessary to escape the semi-colon with a backslash, i.e. use \; instead of ; as the delimiter.
The available choices are:
  • CGEN (H. sapiens et al.)
  • SPTR (H. sapiens et al.)
  • UniProtKB (H. sapiens et al.)
  • TIGR_TGI (H. sapiens et al.)
  • TIGR_CMR (Microbes)
  • FB (D. melanogaster)
  • GeneDB (G. morsitans)
  • GeneDB_Tbrucei (T. brucei)
  • GR (GR)
  • MGI (M. musculus)
  • RGD (R. norvegicus)
  • SGD (S. cerevisiae)
  • TAIR (A. thaliana)
  • TIGR_Ath1 (A. thaliana)
  • TIGR_Tba1 (T. brucei)
  • WB (C. elegans)
  • ZFIN (D. rerio)
-s all or -s MGI or -s MGI;RGD
Organism -o Which organism(s) data should be used for making GO associations. The available options are either all or a semi-colon delimited list of NCBI taxonomy IDs.
NOTE: On some Unix shells, it is necessary to escape the semi-colon with a backslash, i.e. use \; instead of ; as the delimiter.
-o all or -o 602 or -o 602;10090
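
The semi-colon needs escaping on Unix because the shell treats it as a command separator. When GoMiner is launched from a script, one way to avoid the issue is to pass arguments as a list so that no shell parses them; when a shell command string is unavoidable, the value can be quoted instead. A small Python sketch of both approaches, using the standard shlex module (the variable names are illustrative):

```python
import shlex

data_sources = ["MGI", "RGD"]
value = ";".join(data_sources)  # MGI;RGD

# As a list element (for example, for subprocess.run without shell=True)
# the value needs no escaping, because no shell ever parses it.
args = ["-s", value]

# As part of a shell command line, quote it so that ';' is not treated
# as a command separator.
shell_fragment = "-s " + shlex.quote(value)
print(shell_fragment)  # -s 'MGI;RGD'
```

Single-quoting the value has the same effect in a POSIX shell as escaping the semi-colon with a backslash.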
Export Type -e Which export file format(s) should be generated. The user may specify one or more export format types in a semi-colon separated list. The available options are [se][gce][svge][vsvge][fdrse]. The se option creates a summary for each category. The gce option lists all of the genes in each category. The svge option creates a DAG (Directed Acyclic Graph). The vsvge option creates a VennMaster view of changed categories. The fdrse option creates a summary for each category including FDR values. -e se
Root Category -a Which category should be used as root for the export. The available options are either all or a GO ID. Popular root categories include:
  • biological_process: GO:0008150
  • cellular_component: GO:0005575
  • molecular_function: GO:0003674
-a all or -a GO:0008150 or -a GO:0005575 or -a GO:0003674
Minimum Category Size for Selecting Randomized Categories -m Limits the categories that will be included in the CIM and summary reports. Categories whose size is less than this threshold will be omitted from category statistic calculations, and randomized categories below this threshold will be omitted from FDR calculations. -m 5 or -m 1
Evidence Code -v Which sets of Evidence Codes to use as the criteria for accepting gene to GO category mappings. Based upon our own experience, we recommend level 3 for general use.

Level TAS IDA IMP IGI IPI ISS IEP NAS RCA IEA NR ND
all x x x x x x x x x x x x
1 x x x x x x x x x
2 x x x x x x x x
3 x x x x x x x
4 x x x x x
5 x x

-v all or -v LEVEL3 or -v TAS;IDA
Use Cross Reference Table -x Specifies how user-submitted gene identifiers should be looked up from the database.
The available choices are:
  • [true] : Includes the GO Consortium data from GO's cross reference table
  • [false] : Do not use the cross reference table
-x true
Use Synonym Table -y Specifies how user-submitted gene identifiers should be looked up from the database.
The available choices are:
  • [true] : Includes the GO Consortium data from GO's gene product synonym table
  • [false] : Do not use the gene product synonym table
-y true
Number of Randoms -f Number of randoms to be used in FDR calculation.
This option is required only if Export Type is fdrse.
-f 25
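
Because -f is only required together with the fdrse export type, a wrapper script may want to check the -e and -f combination before starting a long run. The following Python sketch shows one way to do that. The function check_export_options is hypothetical and is not part of GoMiner; the set of valid export types is taken from the Export Type row above.

```python
VALID_EXPORT_TYPES = {"se", "gce", "svge", "vsvge", "fdrse"}


def check_export_options(export_types, num_randoms=None):
    """Return a list of problems with the -e / -f combination (empty if none)."""
    problems = []
    requested = export_types.split(";")
    unknown = [t for t in requested if t not in VALID_EXPORT_TYPES]
    if unknown:
        problems.append("unknown export type(s): " + ", ".join(unknown))
    if "fdrse" in requested and num_randoms is None:
        problems.append("-f is required when the export type includes fdrse")
    return problems
```

For example, check_export_options("gce;svge") reports no problems, while check_export_options("fdrse") reports the missing -f value.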

    Complete Example for Windows:

    java -cp gominer.jar gov.nih.nci.lmp.gominer.GOCommand -t C:\gominer\total.gene.txt -c C:\gominer\changed1 -d jdbc:mysql://discover.nci.nih.gov:1521/GEEVS -j com.mysql.jdbc.Driver -r C:\dumptest -s UniProtKB;MGI -o all -e gce;svge -a all -m 5 -v all -x false -y false -f 25

    Complete Example for Unix:

    java -cp gominer.jar gov.nih.nci.lmp.gominer.GOCommand -t /usr/home/gominer/total.gene.txt -c /usr/home/gominer/changed1 -d jdbc:mysql://discover.nci.nih.gov:1521/GEEVS -j com.mysql.jdbc.Driver -r /usr/home/gominer/dumptest -s UniProtKB\;MGI -o all -e gce\;svge -a all -m 5 -v all -x false -y false -f 25
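
When the same command is issued repeatedly, it can help to assemble it programmatically. The Python sketch below rebuilds the Unix example above as an argument list; the defaults simply mirror that example. The helper names build_gominer_command and run_gominer are illustrative and not part of GoMiner. Because the arguments are passed as a list, the semi-colons need no escaping.

```python
import subprocess


def build_gominer_command(total, changed, results_dir, *,
                          jar="gominer.jar",
                          db_url="jdbc:mysql://discover.nci.nih.gov:1521/GEEVS",
                          driver="com.mysql.jdbc.Driver",
                          sources=("UniProtKB", "MGI"),
                          organisms="all", exports=("gce", "svge"),
                          root="all", min_size=5, evidence="all",
                          use_xref=False, use_synonyms=False, randoms=25):
    """Assemble the argument list for one GoMiner command-line run."""
    return [
        "java", "-cp", jar, "gov.nih.nci.lmp.gominer.GOCommand",
        "-t", str(total), "-c", str(changed),
        "-d", db_url, "-j", driver,
        "-r", str(results_dir),
        "-s", ";".join(sources),
        "-o", organisms,
        "-e", ";".join(exports),
        "-a", root, "-m", str(min_size), "-v", evidence,
        "-x", str(use_xref).lower(), "-y", str(use_synonyms).lower(),
        "-f", str(randoms),
    ]


def run_gominer(cmd):
    # check=True raises CalledProcessError if java exits with a non-zero status.
    return subprocess.run(cmd, check=True)
```

A call such as run_gominer(build_gominer_command(total_file, changed_file, results_dir)) would then launch one run, assuming java and gominer.jar are available in the working environment.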

    Notes

    Limitations

    Using the GoMiner Command Line Interface for High-Throughput Processing

    The command line interface allows GoMiner to be combined with other components for high-throughput processing. A typical processing scenario could include the following steps in an automated script:

    1. A statistical operation identifies a set of changed genes
    2. The set of genes is formatted into the GoMiner changed-gene file format
    3. GoMiner processes the files using the command line interface
    4. The results exported from GoMiner are analyzed
    5. The user is notified about which changed-gene files are of interest
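
The steps above can be sketched as a small Python driver. This is only an outline under stated assumptions: it assumes a changed-gene file holds one gene identifier per line, and it leaves the GoMiner invocation (run_gominer) and the analysis of exported results (is_interesting) to caller-supplied functions, since neither the result file layout nor the selection criteria are specified here. All function names are hypothetical.

```python
from pathlib import Path


def write_changed_gene_file(genes, path):
    """Step 2: write one gene identifier per line (assumed file format)."""
    unique = sorted(set(genes))
    Path(path).write_text("".join(g + "\n" for g in unique))
    return path


def process_experiments(experiments, work_dir, run_gominer, is_interesting):
    """Steps 2-5 for a mapping of experiment name -> changed gene identifiers.

    run_gominer(changed_file, results_dir) and is_interesting(results_dir)
    are supplied by the caller; both are placeholders in this sketch.
    """
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    flagged = []
    for name, genes in experiments.items():
        changed_file = write_changed_gene_file(genes, work_dir / f"{name}.txt")
        results_dir = work_dir / f"{name}_results"
        run_gominer(changed_file, results_dir)  # step 3
        if is_interesting(results_dir):         # step 4
            flagged.append(name)
    return flagged                              # step 5: report these to the user
```

The returned list names the experiments whose changed-gene files were flagged, which a script could then include in a notification.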

    After you have identified results of particular interest, you can use the GoMiner GUI tool to examine them.


    GoMiner™ is a development of the Genomics and Pharmacology Facility, Developmental Therapeutics Branch (DTB), Center for Cancer Research (CCR), National Cancer Institute (NCI).
