PredPower
A tool for assessing the prediction power of networks
by PINAR PIR
Bogazici University (2005)
---------------------------------------------------------

PredPower was developed to evaluate the prediction power of one network on another network, provided that a set of relations between the two networks is known (Pir, P., 2005, "Integrated Analysis of Metabolome Profiles and Gene Expression in Respiratory Deficient Deletion Mutants of Saccharomyces cerevisiae", PhD Thesis, Bogazici University Istanbul).

PredPower consists of a group of Matlab programs run from a main program, "predpower". Several input files must be supplied before running PredPower. Upon completion, the program saves several output files. These files are listed below:

INPUT FILES
-----------
data1 (not needed if network1 is supplied)
network1 (not needed if data1 is supplied)
network2
nodes1
nodes2
relation
genenames.txt (needed only if the results are to be displayed in text form)
termnames.txt (needed only if the results are to be displayed in text form)
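
The on-screen notes in the sample session state that nodes are given as numbers, that network files have two columns with one edge per row, and that an undirected edge must be supplied as two directed edges. The following Matlab lines are a minimal sketch of how such an edge list might be prepared and checked before running PredPower. They are not part of PredPower; the variable names and the example numbers are hypothetical, and writing the result to disk is left out because the exact file format read by predpower is not stated here.

```matlab
% Hypothetical undirected edge list: each row is one edge between two node numbers.
undirected = [1 2;
              2 3;
              3 1];

% Add the reverse of every edge so that A <-> B becomes A -> B and B -> A.
directed = [undirected; undirected(:, [2 1])];

% Remove duplicate rows in case some edges were already listed in both directions.
directed = unique(directed, 'rows');

% Hypothetical node list (one node number per row), in the style of "nodes1".
nodes = [1; 2; 3];

% Every node used in an edge should also appear in the node list.
allListed = all(ismember(directed(:), nodes));
```

If allListed is 0, at least one edge refers to a node number that is missing from the node list, which is worth resolving before the run.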

CODE FILES
-----------
correlator.m
finalizer.m
neighborer.m
networker.m  
predpower.m
proleter.m
resulter.m
toplagel.m


SAVED FILES
-----------
annotnopro (number of genes annotated to each gene)
network3 (new annotations for known and unknown genes)
pp (p-values for new annotations)
realt (real terms annotated to genes - from SGD)
relation2 (relations of nodes1 to nodes2)
resultspp (statistics on success of the annotations)
resultsnr (novel annotations for unknowns)
resultsnr5.mat (novel annotations in text form)


RUNNING PREDPOWER
-----------------

PredPower was developed in the Matlab 6.5 environment. All code and input files should be saved in the working directory.
The command to run PredPower is:

>> predpower

Comments on the format of the input files appear on the screen as the program proceeds. 
The screen should appear as follows for the sample input files provided. 
Please contact Pinar Pir for comments/questions: pinarpir@boun.edu.tr


>> predpower

***********************************************************************************************************
 ******************************************** PREDPOWER ****************************************************
 ***********************************************************************************************************
 
 ***********************************************************************************************************
                    PredPower asseses the prediction power of Network1 on Network 2
 ***********************************************************************************************************
 
 ********************************************** NODES ******************************************************
   All nodes in networks should be represented by numbers, they should be provided as "lists of nodes"
   Using the SGD numbers of genes and GO terms may be a good idea,
   however, any arbitrary numbering is also accepted
 ***********************************************************************************************************
 
 ********************************************** EDGES ******************************************************
   All the edges should be directed, if they are not directed,
   every edge should be provided as two directed edges in opposite directions
   Example: A <-> B should be converted to A -> B and B -> A
 ***********************************************************************************************************
 
 ********************************************* NETWORK 1 ****************************************************
   Network 1 may be contructed among genes using correlation of gene expression data  
   Also a network may be provided by the user  
   Please enter 1 or 2:
        "I have the gene expression data and I want to make a network of correlated genes" (1)
        "I do not use expression data, I have a file with all connections in Network 1" (2)
1/2  ?2
   Please make sure that Network1 is saved with file name "network1"
   The file should be composed of two columns, every row representing an edge between two genes
 
 ********************************************* NETWORK 2 ****************************************************
   Two files are needed: "nodes2.m" and "network2.m"
   Please make sure that list of nodes is in file "nodes2.m" as a column.
   The Network2 file should be composed of two columns, every row representing an edge between two terms
 
 ********************************************* RELATION ****************************************************
   The relation among the nodes of Network1 and Network2 should be in file "relation.m"
   Please make sure that "relation.m" is composed of two columns.
   The nodes from Network1 should be in first column and
   related nodes from Network2 should be in second column.
 
 ********************************************* NEIGHBORS ****************************************************
   Please enter the level of neigbors to be included in the analysis
   0: No neigbors are included
   1,2, ... : Level of neighbors to be included
   0, 1, 2, ...   ?0
   ............. Please Wait ....... Searching the neighbors...................................
   ............. Please Wait ....... Making new annotations ...................................
   ............. Please Wait ....... Still making new annotations ................................
 ********************************************* RESULTS ****************************************************
The results regarded to prediction power of the networks are saved in file "resultspp" 
P: P-value treshold of the predicted terms" 
N1: Number of nodes with predicted relations (genes with predicted terms)" 
N2: Number of actual relations of nodes with predicted relations
(number of terms actually annotated to genes with predicted functions)
N3: Total number of predicted relations (total number of new predictions)
N4: Number of correctly predicted relations (number of correct new predictions)
N5: Number of nodes with at least one correctly predicted relation (number of genes with at least one correct prediction
P       N1            N2            N3           N4            N5
resultspp =
  1.0e+003 *
  Columns 1 through 3 
   0.00000000000010   0.06000000000000   0.07800000000000
   0.00000000000100   0.08100000000000   0.10500000000000
   0.00000000001000   0.09300000000000   0.11800000000000
   0.00000000010000   0.10600000000000   0.13300000000000
   0.00000000100000   0.13200000000000   0.16500000000000
   0.00000001000000   0.15000000000000   0.18800000000000
   0.00000010000000   0.17600000000000   0.21600000000000
   0.00000050000000   0.20400000000000   0.24500000000000
   0.00000100000000   0.23000000000000   0.27300000000000
   0.00000500000000   0.38200000000000   0.43700000000000
   0.00001000000000   0.41200000000000   0.46800000000000
   0.00005000000000   0.43400000000000   0.49200000000000
   0.00010000000000   0.43500000000000   0.49300000000000
   0.00020000000000   0.44200000000000   0.50000000000000
   0.00030000000000   0.44500000000000   0.50300000000000
   0.00040000000000   0.44500000000000   0.50300000000000
   0.00050000000000   0.44700000000000   0.50500000000000
   0.00060000000000   0.44700000000000   0.50500000000000
   0.00070000000000   0.44700000000000   0.50500000000000
   0.00080000000000   0.45200000000000   0.51000000000000
   0.00090000000000   0.45200000000000   0.51000000000000
   0.00095000000000   0.45200000000000   0.51000000000000
   0.00099000000000   0.45200000000000   0.51000000000000
  Columns 4 through 6 
   0.08800000000000   0.03700000000000   0.03200000000000
   0.12000000000000   0.04700000000000   0.04200000000000
   0.15400000000000   0.05500000000000   0.04800000000000
   0.20000000000000   0.06300000000000   0.05300000000000
   0.27500000000000   0.07400000000000   0.06400000000000
   0.31900000000000   0.08200000000000   0.07000000000000
   0.43700000000000   0.09300000000000   0.07900000000000
   0.57800000000000   0.10500000000000   0.09000000000000
   0.66600000000000   0.10600000000000   0.09100000000000
   1.27500000000000   0.13700000000000   0.11700000000000
   1.79100000000000   0.14500000000000   0.12300000000000
   3.48500000000000   0.16500000000000   0.14000000000000
   4.36000000000000   0.17300000000000   0.14700000000000
   5.22300000000000   0.18100000000000   0.15400000000000
   5.62100000000000   0.18800000000000   0.16100000000000
   5.82600000000000   0.19200000000000   0.16500000000000
   5.99900000000000   0.19200000000000   0.16500000000000
   6.12600000000000   0.19300000000000   0.16600000000000
   6.19400000000000   0.19400000000000   0.16700000000000
   6.28000000000000   0.19500000000000   0.16800000000000
   6.36300000000000   0.19500000000000   0.16800000000000
   6.43200000000000   0.19600000000000   0.16900000000000
   6.51700000000000   0.19600000000000   0.16900000000000
The results regarded to new relations made in this run are saved in file "resultsnr" 
Please select the p-value treshold to view the new relations made
1 x 10^-10, 1 x 10^-9, ..., 0.99  ?0.0000001
 The results are given in SGD numbers for genes and terms
 Optionally, two files may be provided to convert numbers to genes and terms
 Gene names in file genenames.m as a column, term names in file termnames.m as a column
 Every row of these files should correspond to numbers given in nodes1.m and nodes2.m
 Please note that the converted files are saved as cell array, they can be displayed by loading in Matlab only
Convert/Do not Convert : 1/2  ?1
   ............. Please Wait ....... Evaluating new annotations ...................................
resultsnr3 =
  1.0e+004 *
   0.00090000000000   0.43860000000000   0.00000000000012
   0.00170000000000   0.80940000000000   0.00000000000015
   0.00870000000000   0.55370000000000   0.00000000000019
   0.00880000000000   0.55370000000000   0.00000000000000
   0.01270000000000   0.48880000000000   0.00000000000108
   0.01620000000000   0.37040000000000   0.00000000000000
   0.01620000000000   0.37000000000000   0.00000000000025
   0.01820000000000   0.43860000000000   0.00000000000002
   0.01820000000000   0.36780000000000   0.00000000000003
   0.01820000000000   0.48880000000000   0.00000000000141
   0.01980000000000   0.53530000000000   0.00000000000004
   0.01980000000000   1.55780000000000   0.00000000000004
   0.01980000000000   0.53550000000000   0.00000000000008
   0.02180000000000   3.05080000000000   0.00000000000004
   0.02370000000000   1.65630000000000   0.00000000000000
   0.02370000000000   0.37040000000000   0.00000000000040
   0.02370000000000   0.36770000000000   0.00000000000059
   0.02500000000000   0.40220000000000   0.00000000000000
   0.02660000000000   0.48420000000000   0.00000000000213
   0.03540000000000   0.37040000000000   0.00000000000000
   0.03540000000000   1.65630000000000   0.00000000000001
   0.03540000000000   0.36770000000000   0.00000000000033
   0.03730000000000   5.10820000000000   0.00000000000002
   0.03900000000000   1.68870000000000   0.00000000000000
   0.03900000000000   0.41750000000000   0.00000000000302
   0.04450000000000   0.53550000000000   0.00000000000000
   0.04450000000000   0.53530000000000   0.00000000000000
   0.04450000000000   1.55780000000000   0.00000000000000
   0.04450000000000   0.53540000000000   0.00000000000000
   0.04740000000000   0.43860000000000   0.00000000000002
   0.04740000000000   0.36780000000000   0.00000000000003
    'SWC3'
  may be involved in:
    'helicase_activity'
    'FUN30'
  may be involved in:
    'DNA-dependent_ATPase_activity'
    'YAR061W'
  may be involved in:
    'mannose_binding'
    'YAR062W'
  may be involved in:
    'mannose_binding'
    'SHE1'
  may be involved in:
    'transmembrane_receptor_activity'
    'SEF1'
  may be involved in:
    'specific_RNA_polymerase_II_transcription_factor_activity'
    'SEF1'
  may be involved in:
    'transcription_factor_activity'
    'YBL086C'
  may be involved in:
    'helicase_activity'
    'YBL086C'
  may be involved in:
    'DNA_helicase_activity'
    'YBL086C'
  may be involved in:
    'transmembrane_receptor_activity'
    'SFT2'
  may be involved in:
    'fructose_transporter_activity'
    'SFT2'
  may be involved in:
    'mannose_transporter_activity'
    'SFT2'
  may be involved in:
    'glucose_transporter_activity'
    'YBR014C'
  may be involved in:
    'thiol-disulfide_exchange_intermediate_activity'
    'EDS1'
  may be involved in:
    'transcriptional_activator_activity'
    'EDS1'
  may be involved in:
    'specific_RNA_polymerase_II_transcription_factor_activity'
    'EDS1'
  may be involved in:
    'DNA_binding'
    'ZTA1'
  may be involved in:
    'alcohol_dehydrogenase_activity'
    'YBR062C'
  may be involved in:
    'ubiquitin-protein_ligase_activity'
    'TBS1'
  may be involved in:
    'specific_RNA_polymerase_II_transcription_factor_activity'
    'TBS1'
  may be involved in:
    'transcriptional_activator_activity'
    'TBS1'
  may be involved in:
    'DNA_binding'
    'SSE2'
  may be involved in:
    'unfolded_protein_binding'
    'PCH2'
  may be involved in:
    'ATPase_activity'
    'PCH2'
  may be involved in:
    'endopeptidase_activity'
    'YBR241C'
  may be involved in:
    'glucose_transporter_activity'
    'YBR241C'
  may be involved in:
    'fructose_transporter_activity'
    'YBR241C'
  may be involved in:
    'mannose_transporter_activity'
    'YBR241C'
  may be involved in:
    'galactose_transporter_activity'
    'BIT2'
  may be involved in:
    'helicase_activity'
    'BIT2'
  may be involved in:
    'DNA_helicase_activity'
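
Matlab prints the "resultspp" table in the sample session with a common scale factor (1.0e+003 *), so every displayed entry has to be multiplied by 1000 to recover P and the counts N1 to N5. As a small worked example, the lines below take the row for P = 1e-7 from the sample output and compute two summary ratios. These ratios are not reported by PredPower itself; they are only one possible way to read the table, and the variable names are hypothetical.

```matlab
% Row of the sample "resultspp" table for P = 1e-7, with the display
% scale factor already applied: [P N1 N2 N3 N4 N5].
row = [1e-7 106 133 200 63 53];

N1 = row(2);   % genes with predicted terms
N3 = row(4);   % total number of new predictions
N4 = row(5);   % number of correct new predictions
N5 = row(6);   % genes with at least one correct prediction

% Fraction of all new predictions that are correct (63 of 200).
fracCorrectPredictions = N4 / N3;

% Fraction of genes with predictions that have at least one correct one (53 of 106).
fracGenesWithHit = N5 / N1;
```

For this row, about 31.5% of the predictions are correct, and half of the genes that received predictions have at least one correct prediction.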


 