OpenTox Algorithm
=================

- An [OpenTox](http://www.opentox.org) REST Webservice
- Implements the OpenTox algorithm API for
    - lazar
    - subgraph descriptor calculation (fminer)
    - physico-chemical descriptor calculation (pc) for more than 300 descriptors
    - feature selection (fs) using recursive feature elimination (rfe)
- See [opentox-ruby on maunz.de](http://opentox-ruby.maunz.de) for high-level workflow documentation

REST operations
---------------

    DESCRIPTION                  TYPE  ADDRESS           ARGUMENTS                      RETURN TYPE               RETURN CODE
    Get a representation of the  GET   /lazar            -                              lazar representation      200,404
    lazar algorithm
    Get a list of all algorithms GET   /                 -                              URIs of algorithms        200
    Get a representation of the  GET   /fminer/          -                              fminer representation     200,404
    fminer algorithms
    Get a representation of the  GET   /fminer/bbrc      -                              bbrc representation       200,404
    bbrc algorithm
    Get a representation of the  GET   /fminer/last      -                              last representation       200,404
    last algorithm
    Get a representation of the  GET   /pc               -                              URIs of algorithms        200,404
    pc algorithms
    Get a representation of the  GET   /pc/<name>        -                              descriptor representation 200,404
    pc algorithm <name>
    Get a representation of the  GET   /fs               -                              URIs of algorithms        200,404
    fs algorithms
    Get a representation of the  GET   /fs/rfe           -                              rfe representation        200,404
    rfe algorithm
    Create lazar model           POST  /lazar            dataset_uri,                   URI for lazar model       200,400,404,500
                                                         [prediction_feature],
                                                         [feature_generation_uri],
                                                         [feature_dataset_uri],
                                                         [prediction_algorithm],
                                                         [pc_type=null],
                                                         [lib=null],
                                                         [nr_hits=false (cl+wmv),
                                                           true (else)],
                                                         [min_sim=0.3 (nominal), 0.4
                                                           (numeric features)],
                                                         [min_train_performance=0.1]
    Create bbrc features         POST  /fminer/bbrc      dataset_uri,                   URI for feature dataset   200,400,404,500
                                                         prediction_feature,
                                                         [min_frequency=5 per-mil],
                                                         [feature_type=trees],
                                                         [backbone=true],
                                                         [min_chisq_significance=0.95],
                                                         [nr_hits=false]
    Create last features         POST  /fminer/last      dataset_uri,                   URI for feature dataset   200,400,404,500
                                                         prediction_feature,
                                                         [min_frequency=8 %],
                                                         [feature_type=trees],
                                                         [nr_hits=false]
    Create features              POST /pc/AllDescriptors dataset_uri,                   URI for dataset           200,400,404,500
                                                         [pc_type=constitutional,
                                                         topological,geometrical,
                                                         electronic,cpsa,hybrid],
                                                         [lib=cdk,joelib,openbabel]
    Create feature               POST  /pc/<name>        dataset_uri                    URI for dataset           200,400,404,500
    Select features              POST  /fs/rfe           dataset_uri,                   URI for dataset           200,400,404,500
                                                         prediction_feature,
                                                         feature_dataset_uri,
                                                         [del_missing=false]

Synopsis
--------

- *del_missing*: one of
    - *true*
    - *false*

- *feature\_type*: Type of subgraphs when no feature dataset is supplied, one of
    - *trees*
    - *paths*

- *lib*: Mandatory for feature datasets that do not contain appropriate feature metadata, one of
    - *cdk*
    - *openbabel*
    - *joelib*

- *min_sim*: The minimum similarity threshold for neighbors. Numeric value in [0,1].

- *min_train_performance*: The minimum training performance for *local\_svm\_classification* (Accuracy) and *local\_svm\_regression* (R-squared). Numeric value in [0,1].

- *nr_hits*: Whether nominal features should be instantiated with their occurrence counts in the instances. One of
    - *true*
    - *false*

- *pc_type*: Mandatory for feature datasets that do not contain appropriate feature metadata, one of
    - *geometrical*
    - *topological*
    - *electronic*
    - *constitutional*
    - *hybrid*
    - *cpsa*

- *prediction\_algorithm*: One of
    - *weighted\_majority\_vote* (default for classification, n.a. for regression)
    - *local\_svm\_classification*
    - *local\_svm\_regression* (default for regression)

Supported MIME formats
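
The value lists above lend themselves to a quick client-side check before a request is sent. The following is a minimal sketch in Python, not part of this service: the function name `check_params` is hypothetical, the allowed values are copied from the Synopsis, and the service itself may validate differently.

```python
# Allowed values, copied from the Synopsis above.
CHOICES = {
    "del_missing": {"true", "false"},
    "feature_type": {"trees", "paths"},
    "lib": {"cdk", "openbabel", "joelib"},
    "nr_hits": {"true", "false"},
    "pc_type": {"geometrical", "topological", "electronic",
                "constitutional", "hybrid", "cpsa"},
    "prediction_algorithm": {"weighted_majority_vote",
                             "local_svm_classification",
                             "local_svm_regression"},
}
# Parameters documented as numeric values in [0,1].
UNIT_INTERVAL = {"min_sim", "min_train_performance"}


def check_params(params):
    """Return a list of problems found in params; an empty list means OK."""
    problems = []
    for name, value in params.items():
        if name in CHOICES:
            if str(value) not in CHOICES[name]:
                problems.append(f"{name}: {value!r} not in {sorted(CHOICES[name])}")
        elif name in UNIT_INTERVAL:
            try:
                ok = 0.0 <= float(value) <= 1.0
            except (TypeError, ValueError):
                ok = False
            if not ok:
                problems.append(f"{name}: {value!r} is not a number in [0,1]")
    return problems


print(check_params({"feature_type": "paths", "min_sim": 0.3}))
print(check_params({"lib": "rdkit", "min_sim": 1.5}))
```

Parameters that are not listed in the Synopsis (for example `dataset_uri`) are passed through unchecked.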
----------------------

- application/rdf+xml (default): read/write OWL-DL
- application/x-yaml: read/write YAML

Examples
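
This README does not say how a client selects one of these formats. Assuming the service uses standard HTTP content negotiation via the `Accept` header, a request for the YAML representation could be prepared as in the sketch below. The helper name `build_get` is hypothetical, and the request is only built, not sent.

```python
from urllib.request import Request

# MIME types from the list above.
SUPPORTED = ("application/rdf+xml", "application/x-yaml")


def build_get(url, mime="application/rdf+xml"):
    """Build (without sending) a GET request asking for the given MIME type."""
    if mime not in SUPPORTED:
        raise ValueError(f"unsupported MIME type: {mime}")
    # Assumption: the format is negotiated through the Accept header.
    return Request(url, headers={"Accept": mime}, method="GET")


req = build_get("http://webservices.in-silico.ch/algorithm/lazar",
                "application/x-yaml")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```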
--------

NOTE: http://webservices.in-silico.ch hosts the stable version, which may not yet have complete functionality. **Please try http://ot-test.in-silico.ch** for the latest version.

### Get the OWL-DL representation of lazar

    curl http://webservices.in-silico.ch/algorithm/lazar

### Get the OWL-DL representation of fminer

    curl http://webservices.in-silico.ch/algorithm/fminer

### Get the OWL-DL representation of pc

    curl http://webservices.in-silico.ch/algorithm/pc

### Get the OWL-DL representation of fs

    curl http://webservices.in-silico.ch/algorithm/fs

* * *

### Create lazar model

Creates a standard Lazar model with subgraph descriptors.

    curl -X POST -d dataset_uri={dataset_uri} -d prediction_feature={feature_uri} -d feature_generation_uri=http://webservices.in-silico.ch/algorithm/fminer/bbrc http://webservices.in-silico.ch/test/algorithm/lazar

Creates a Lazar model with physico-chemical descriptors.

    curl -X POST -d dataset_uri={dataset_uri} -d prediction_feature={feature_uri} -d feature_dataset_uri={feature_dataset_uri} http://webservices.in-silico.ch/test/algorithm/lazar

feature_uri specifies the dependent variable from the dataset.

* * *

Creates subgraph descriptors with backbone refinement class representatives or latent structure patterns, using supervised graph mining (see http://cs.maunz.de). These features can be used, e.g., as structural alerts, as descriptors (fingerprints) for prediction models, or for similarity calculations.

### Create the full set of frequent and significant subtrees

    curl -X POST -d dataset_uri={dataset_uri} -d prediction_feature={feature_uri} -d min_frequency={min_frequency} -d "backbone=false" http://webservices.in-silico.ch/algorithm/fminer/bbrc

feature_uri specifies the dependent variable from the dataset.
backbone=false reduces BBRC mining to frequent and correlated subtree mining (many more descriptors are produced).

### Create [BBRC](http://bbrc.maunz.de) features, recommended for large and very large datasets.

    curl -X POST -d dataset_uri={dataset_uri} -d prediction_feature={feature_uri} -d min_frequency={min_frequency} http://webservices.in-silico.ch/algorithm/fminer/bbrc

feature_uri specifies the dependent variable from the dataset.   
Adding -d nr_hits=true produces frequency counts per pattern and molecule.
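
For clients that do not shell out to curl, the two calls above can be mirrored with Python's standard library. This is a minimal sketch rather than an official client: the helper name `build_lazar_request` and the `example.org` URIs are hypothetical placeholders, the endpoint is the one used in the curl examples, and the request is only built, not sent.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

# Endpoint exactly as in the curl examples above.
LAZAR_URL = "http://webservices.in-silico.ch/test/algorithm/lazar"


def build_lazar_request(dataset_uri, prediction_feature, **optional):
    """Build (without sending) the form-encoded POST that creates a lazar model.

    Optional arguments left as None are dropped from the form body.
    """
    fields = {"dataset_uri": dataset_uri,
              "prediction_feature": prediction_feature}
    fields.update({k: v for k, v in optional.items() if v is not None})
    body = urlencode(fields).encode("ascii")
    return Request(LAZAR_URL, data=body, method="POST")


# Hypothetical URIs, for illustration only.
DATASET = "http://example.org/dataset/1"
FEATURE = "http://example.org/dataset/1/feature/activity"

# Variant 1: subgraph descriptors generated by fminer/bbrc.
subgraph = build_lazar_request(
    DATASET, FEATURE,
    feature_generation_uri="http://webservices.in-silico.ch/algorithm/fminer/bbrc",
)
# Variant 2: a precomputed feature dataset, e.g. physico-chemical descriptors.
pc = build_lazar_request(
    DATASET, FEATURE,
    feature_dataset_uri="http://example.org/dataset/2",
)

for req in (subgraph, pc):
    print(req.get_method(), req.full_url, parse_qs(req.data.decode("ascii")))
```

Passing either request to `urllib.request.urlopen` would send it. What the response body contains is up to the service; the table under "REST operations" lists a URI for the lazar model.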
See the [usage notes](http://bbrc.maunz.de#usage) for more guidance.

### Create [LAST-PM](http://last-pm.maunz.de) descriptors, recommended for small to medium-sized datasets.

    curl -X POST -d dataset_uri={dataset_uri} -d prediction_feature={feature_uri} -d min_frequency={min_frequency} http://webservices.in-silico.ch/algorithm/fminer/last

feature_uri specifies the dependent variable from the dataset.   
Adding -d nr_hits=true produces frequency counts per pattern and molecule.
See the [usage notes](http://last-pm.maunz.de#usage) for more guidance.

* * *

### Create a feature dataset of physico-chemical descriptors with CDK

    curl -X POST -d dataset_uri={dataset_uri} -d lib=cdk http://webservices.in-silico.ch/test/algorithm/pc/AllDescriptors

lib specifies the library to use.

* * *

### Select features from a feature dataset

    curl -X POST -d dataset_uri={dataset_uri} -d prediction_feature={feature_uri} -d feature_dataset_uri={feature_dataset_uri} http://webservices.in-silico.ch/test/algorithm/fs/rfe

feature_uri specifies the dependent variable from the dataset.

* * *

Copyright (c) 2009-2011 Christoph Helma, Martin Guetlein, Micha Rautenberg, Andreas Maunz, David Vorgrimmler, Denis Gebele. See LICENSE for details.
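
The default `min_frequency` values in the REST table are relative (5 per-mil for bbrc, 8 % for last). To get a feel for what they mean on a concrete dataset, the sketch below converts them to a compound count. It rests on two assumptions that this README does not state: that the threshold is relative to the number of compounds in the dataset, and that fractional counts are rounded up. The function name is hypothetical.

```python
import math


def absolute_min_frequency(n_compounds, value, unit):
    """Convert a relative support threshold into a number of compounds.

    Assumption: the threshold is relative to the dataset size and
    fractional results are rounded up (at least 1).
    """
    scale = {"per-mil": 1000, "%": 100}[unit]
    return max(1, math.ceil(n_compounds * value / scale))


# bbrc default (5 per-mil) on a hypothetical dataset of 2000 compounds
print(absolute_min_frequency(2000, 5, "per-mil"))
# last default (8 %) on a hypothetical dataset of 200 compounds
print(absolute_min_frequency(200, 8, "%"))
```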