iPlant Ontologizer Tutorial

Step 1: Pre-Processing the Expression Data File for Ontologizer

1.     Download the “Generate Dataset for Ontologizer” Perl script from here. You can download the sample data from here.

2.     Run generate_dataset.pl

NOTE: Before running the script, make sure the directory contains no file with the same name as one of the program's output files (e.g. pop_study.txt, Q.PR2.PR1.txt, etc.).

NOTE: Make sure the required Perl module is installed: Getopt::Euclid.
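The first NOTE can be checked before each run with a small shell helper. This is only a sketch: the function name find_stale_outputs is hypothetical, and the Q.*.txt pattern is an assumption inferred from the example output names above, so adjust it if your output files are named differently.

```shell
# List files in a directory that could clash with generate_dataset.pl output.
# Assumption: output files are pop_study.txt plus files matching Q.*.txt.
find_stale_outputs() {
    dir="${1:-.}"
    for f in "$dir"/pop_study.txt "$dir"/Q.*.txt; do
        # An unmatched glob stays literal, so test for existence explicitly.
        if [ -e "$f" ]; then
            echo "$f"
        fi
    done
}
```

Running find_stale_outputs . in the working directory prints nothing when it is safe to start; any path it prints should be moved or renamed first.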

OPTIONS

-m : column number of the model names

-ex : expression file path

-s : column number of a study set (q-values); repeat the option for each study set

-cut : q-value cutoff

-e : e-value cutoff for model names

-ec : column number of e-value

NOTE: Column numbers are counted from 0, so the first column is column 0.
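Because off-by-one mistakes are easy with zero-based numbering, it can help to print the header of the expression file with each column's index. The sketch below assumes the file is tab-delimited and that its first line is a header row; neither is stated in this tutorial, and list_columns is a hypothetical helper name.

```shell
# Print each header field of a tab-delimited file with its zero-based index.
# Assumptions: tab-delimited input, header on the first line.
list_columns() {
    file="$1"
    head -n 1 "$file" | awk -F'\t' '{
        for (i = 1; i <= NF; i++) {
            # awk fields start at 1, so subtract 1 to match the script options.
            printf "%d\t%s\n", i - 1, $i
        }
    }'
}
```

For example, list_columns Full_Dataset.txt lists one column per line, and the number shown is the value to pass to -m, -s or -ec.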

For the sample data file, columns 23, 24, 25 and 26 hold the PR study sets. For the model names we use column 1 (Arabidopsis model names); the e-value of the corresponding Arabidopsis orthologue is in column 2, which is why the command passes -ec 2.

perl generate_dataset.pl -m 1 -ex Full_Dataset.txt -s 23 -s 24 -s 25 -s 26 -cut 0.1 -e 0.0001 -ec 2

For full dataset:

perl generate_dataset.pl -m 1 -ex Full_Dataset.txt -s 23 -s 24 -s 25 -s 26 -s 27 -s 28 -s 29 -s 30 -s 31 -s 32 -s 33 -s 34 -s 35 -s 36 -s 37 -s 38 -s 39 -s 40 -cut 0.1 -e 0.0001 -ec 2
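Typing eighteen -s options by hand invites typos, so the list can be generated instead. The following is a minimal sketch; build_study_flags is a hypothetical helper, not part of the script, and it assumes the study-set columns form one contiguous range.

```shell
# Build "-s FIRST -s FIRST+1 ... -s LAST" for a contiguous range of columns.
build_study_flags() {
    first="$1"
    last="$2"
    flags=""
    i="$first"
    while [ "$i" -le "$last" ]; do
        flags="$flags -s $i"
        i=$((i + 1))
    done
    # Strip the leading space before printing.
    echo "${flags# }"
}

# Usage (left unquoted on purpose so the shell splits the flags into words):
# perl generate_dataset.pl -m 1 -ex Full_Dataset.txt $(build_study_flags 23 40) -cut 0.1 -e 0.0001 -ec 2
```

With the range 23 to 40 this expands to the same eighteen -s options as the full-dataset command above.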

3.     The script creates the input files that Ontologizer needs.

Step 2: Running Ontologizer

1.     Run Ontologizer.jar (in the Application folder) with the following options.

OPTIONS

-m : MTC method (e.g. Westfall-Young-Single-Step)

-c : Calculation method (e.g. Parent-Child-Union)

-a : Association file from genes to GO terms

-g : gene ontology file

-o : Output directory

-p : population set produced by the generate_dataset.pl script

-r : number of steps used in resampling-based MTCs

-s : study sets produced by the generate_dataset.pl script

java -jar Ontologizer.jar -a gene_association.tair -g gene_ontology_ext.obo -s Input_v1 -p pop_study.txt -c Parent-Child-Union -m Westfall-Young-Single-Step -r 1000 -o Output_v1

Click here for more details

Step 3: Analyzing Ontologizer Output

1.     Download the “iPlant Heatmap” Perl script from here.

2.     Run iplant_heatmap.pl

NOTE: Before running the script, make sure the directory contains no file with the same name as the program's output file (e.g. Output.txt).

NOTE: Make sure the required Perl modules are installed: Getopt::Euclid, Class::Struct.

OPTIONS

-onto : an output file of Ontologizer; repeat the option for each file

-cut : the cutoff value for visualizing the output in the heatmap

perl iplant_heatmap.pl -onto table-Q.PR2.PR1-Parent-Child-Union-Westfall-Young-Single-Step.txt -onto table-Q.PR3.PR1-Parent-Child-Union-Westfall-Young-Single-Step.txt -onto table-Q.PR4.PR1-Parent-Child-Union-Westfall-Young-Single-Step.txt -onto table-Q.PR5.PR1-Parent-Child-Union-Westfall-Young-Single-Step.txt -cut 0.95
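When Ontologizer has produced many table files, the -onto options can be collected from the output directory rather than typed one by one. This is a sketch under two assumptions: the result files match table-*.txt, as in the example above, and their paths contain no whitespace. build_onto_flags is a hypothetical helper name.

```shell
# Emit one "-onto <file>" pair for every table-*.txt file in a directory.
# Assumptions: Ontologizer tables match table-*.txt; paths have no whitespace.
build_onto_flags() {
    dir="${1:-.}"
    flags=""
    for f in "$dir"/table-*.txt; do
        # Skip the literal pattern when nothing matches.
        if [ -e "$f" ]; then
            flags="$flags -onto $f"
        fi
    done
    echo "${flags# }"
}

# Usage (unquoted on purpose so the shell splits the flags into words):
# perl iplant_heatmap.pl $(build_onto_flags Output_v1) -cut 0.95
```

Output_v1 here is the output directory passed to Ontologizer with -o in Step 2; the files are picked up in the shell's glob order.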