OpenTox Algorithm
=================

- An [OpenTox](http://www.opentox.org) REST Webservice
- Implements the OpenTox algorithm API for
    - lazar
    - subgraph descriptor calculation (fminer)
    - physico-chemical descriptor calculation (pc) for more than 300 descriptors
    - feature selection (fs) using recursive feature elimination (rfe)

REST operations
---------------

DESCRIPTION                  REST  ADDRESS           ARGUMENTS                      RETURN                    CODES

Get a representation of the  GET   /lazar            -                              lazar representation      200,404
lazar algorithm

Get a list of all algorithms GET   /                 -                              URIs of algorithms        200
Get a representation of the  GET   /fminer/          -                              fminer representation     200,404
fminer algorithms
Get a representation of the  GET   /fminer/bbrc      -                              bbrc representation       200,404
bbrc algorithm
Get a representation of the  GET   /fminer/last      -                              last representation       200,404
last algorithm

Get a representation of the  GET   /pc               -                              URIs of algorithms        200,404
pc algorithms
Get a representation of the  GET   /pc/<name>        -                              descriptor representation 200,404
pc algorithm <name>

Get a representation of the  GET   /fs               -                              fs representation         200,404
fs algorithms
Get a representation of the  GET   /fs/rfe           -                              rfe representation        200,404
rfe algorithm

Create lazar model           POST  /lazar            dataset_uri,                   URI for lazar model       200,400,404,500
                                                     [prediction_feature],
                                                     [feature_generation_uri],
                                                     [feature_dataset_uri],
                                                     [prediction_algorithm],
                                                     [pc_type=null],
                                                     [lib=null],
                                                     [nr_hits=false (cl.wmv), 
                                                       true (else)],
                                                     [min_sim=0.3 (nominal), 0.4 
                                                       (numeric features)],
                                                     [min_train_performance=0.1]

Create bbrc features         POST  /fminer/bbrc      dataset_uri,                   URI for feature dataset   200,400,404,500
                                                     prediction_feature,
                                                     [min_frequency=5 per-mil],
                                                     [feature_type=trees],
                                                     [backbone=true],
                                                     [min_chisq_significance=0.95],
                                                     [nr_hits=false]
Create last features         POST  /fminer/last      dataset_uri,                   URI for feature dataset   200,400,404,500
                                                     prediction_feature,
                                                     [min_frequency=8 %],
                                                     [feature_type=trees],
                                                     [nr_hits=false]

Create features              POST /pc/AllDescriptors dataset_uri,                   URI for dataset           200,400,404,500
                                                     [pc_type=constitutional,
                                                     topological,geometrical,
                                                     electronic,cpsa,hybrid],
                                                     [lib=cdk,joelib,openbabel]

Create feature               POST /pc/<name>         dataset_uri                    URI for dataset           200,400,404,500

Select features              POST /fs/rfe            dataset_uri,                   URI for dataset           200,400,404,500
                                                     prediction_feature,
                                                     feature_dataset_uri,
                                                     [del_missing=false]

Synopsis
--------

- prediction\_algorithm: One of "weighted\_majority\_vote" (default for classification), "local\_svm\_classification", "local\_svm\_regression" (default for regression). "weighted\_majority\_vote" is not applicable to regression.
- pc_type: Mandatory for feature datasets that do not contain appropriate feature metadata. One of [geometrical, topological, electronic, constitutional, hybrid, cpsa].
- lib: Mandatory for feature datasets that do not contain appropriate feature metadata. One of [cdk, openbabel, joelib].
- nr_hits: Whether nominal features should be instantiated with their occurrence counts in the instances. One of "true", "false".
- min_sim: The minimum similarity threshold for neighbors. Numeric value in [0,1].
- min\_train\_performance: The minimum training performance for "local\_svm\_classification" (accuracy) and "local\_svm\_regression" (R-squared). Numeric value in [0,1].
- del_missing: One of "true", "false".

See http://www.maunz.de/wordpress/opentox/2011/lazar-models-and-how-to-trigger-them for a graphical overview.
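
The optional parameters above are passed as additional form fields in the POST request. The following is a minimal sketch that assembles such a request for the lazar endpoint and prints it instead of sending it, so it can be inspected first. The helper name `lazar_request`, the `SERVICE` variable and its default, and the chosen parameter values are assumptions for illustration; the placeholders in braces must be replaced with real URIs.

```shell
#!/bin/sh
# Base URI of the algorithm service (assumption; adjust to your installation).
SERVICE="${SERVICE:-http://webservices.in-silico.ch/algorithm}"

# Print the curl command that would create a lazar model.
# $1: dataset URI, $2: prediction feature URI,
# remaining arguments: optional key=value pairs from the Synopsis list.
lazar_request() {
    dataset_uri="$1"
    prediction_feature="$2"
    shift 2
    cmd="curl -X POST -d dataset_uri=${dataset_uri} -d prediction_feature=${prediction_feature}"
    for param in "$@"; do
        cmd="${cmd} -d ${param}"
    done
    printf '%s %s/lazar\n' "$cmd" "$SERVICE"
}

# Example: local SVM classification with a stricter similarity threshold.
lazar_request "{dataset_uri}" "{feature_uri}" \
    prediction_algorithm=local_svm_classification \
    min_sim=0.5 \
    min_train_performance=0.1
```

Printing the command keeps the sketch free of side effects; the printed line can be run as-is once the placeholders are filled in.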


Supported MIME formats
----------------------

- application/rdf+xml (default): read/write OWL-DL
- application/x-yaml: read/write YAML
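
A non-default format would typically be requested through the HTTP `Accept` header. The sketch below assumes the service honors standard content negotiation, which this README does not state explicitly. The helper name `get_representation` is hypothetical; it only prints the curl command and rejects MIME types other than the two listed above.

```shell
#!/bin/sh
# Print a curl command that requests a representation in the given format.
# $1: MIME type, $2: algorithm URI
get_representation() {
    case "$1" in
        application/rdf+xml|application/x-yaml) ;;
        *) echo "unsupported MIME type: $1" >&2; return 1 ;;
    esac
    printf 'curl -H "Accept: %s" %s\n' "$1" "$2"
}

# Example: ask for the YAML representation of lazar.
get_representation application/x-yaml http://webservices.in-silico.ch/algorithm/lazar
```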

Examples
--------

NOTE: http://webservices.in-silico.ch hosts the stable version, which might not have complete functionality yet. **Please try http://ot-test.in-silico.ch** for the latest version.

### Get the OWL-DL representation of fminer

    curl http://webservices.in-silico.ch/algorithm/fminer

### Get the OWL-DL representation of lazar

    curl http://webservices.in-silico.ch/algorithm/lazar

* * * 

The following examples create datasets with backbone refinement class representatives or latent structure patterns using supervised graph mining (see http://cs.maunz.de). These features can be used, for example, as structural alerts, as descriptors (fingerprints) for prediction models, or for similarity calculations.

### Create the full set of frequent and significant subtrees

    curl -X POST -d dataset_uri={dataset_uri} -d prediction_feature={feature_uri} -d min_frequency={min_frequency} -d "backbone=false" http://webservices.in-silico.ch/algorithm/fminer/bbrc

feature_uri specifies the dependent variable from the dataset.
backbone=false reduces BBRC mining to frequent and correlated subtree mining (many more descriptors are produced).

### Create [BBRC](http://bbrc.maunz.de) features, recommended for large and very large datasets.

    curl -X POST -d dataset_uri={dataset_uri} -d prediction_feature={feature_uri} -d min_frequency={min_frequency} http://webservices.in-silico.ch/algorithm/fminer/bbrc

feature_uri specifies the dependent variable from the dataset.   
Adding -d nr_hits=true produces frequency counts per pattern and molecule.
Please click [here](http://bbrc.maunz.de#usage) for more guidance on usage.

### Create [LAST-PM](http://last-pm.maunz.de) descriptors, recommended for small to medium-sized datasets.

    curl -X POST -d dataset_uri={dataset_uri} -d prediction_feature={feature_uri} -d min_frequency={min_frequency} http://webservices.in-silico.ch/algorithm/fminer/last

feature_uri specifies the dependent variable from the dataset.   
Adding -d nr_hits=true produces frequency counts per pattern and molecule.
Please click [here](http://last-pm.maunz.de#usage) for more guidance on usage.

* * * 

### Create lazar model

Creates a standard lazar model.

    curl -X POST -d dataset_uri={dataset_uri} -d prediction_feature={feature_uri} -d feature_generation_uri=http://webservices.in-silico.ch/algorithm/fminer/bbrc http://webservices.in-silico.ch/test/algorithm/lazar

[API documentation](http://rdoc.info/github/opentox/algorithm)
--------------------------------------------------------------

* * *

### Create a feature dataset of physico-chemical descriptors with CDK

    curl -X POST -d dataset_uri={dataset_uri} -d lib=cdk http://webservices.in-silico.ch/test/algorithm/pc/AllDescriptors

* * *

### Select features from a feature dataset

    curl -X POST -d dataset_uri={dataset_uri} -d prediction_feature_uri={prediction_feature_uri} -d feature_dataset_uri={feature_dataset_uri} http://webservices.in-silico.ch/test/algorithm/fs/rfe
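
The two preceding requests can be chained, since feature selection takes the descriptor dataset as input. The sketch below is one possible way to do this, assuming each POST writes the URI of the resulting dataset to standard output; a service that answers with an intermediate URI that has to be polled would need extra handling. The function name `pc_then_rfe` and the `CURL` and `SERVICE` variables are hypothetical. The parameter name `prediction_feature_uri` follows the example above, whereas the operations table lists it as `prediction_feature`.

```shell
#!/bin/sh
# Command used for HTTP requests; can be overridden for a dry run (assumption).
CURL="${CURL:-curl}"
# Base URI of the algorithm service (assumption; adjust to your installation).
SERVICE="${SERVICE:-http://webservices.in-silico.ch/test/algorithm}"

# Compute CDK descriptors for a dataset, then select features with RFE.
# $1: dataset URI, $2: prediction feature URI
# Prints whatever the feature selection request returns.
pc_then_rfe() {
    # Step 1: create the descriptor dataset and capture the returned URI.
    feature_dataset_uri=$($CURL -s -f -X POST \
        -d "dataset_uri=$1" -d lib=cdk \
        "$SERVICE/pc/AllDescriptors") || return 1

    # Step 2: run recursive feature elimination on that descriptor dataset.
    $CURL -s -f -X POST \
        -d "dataset_uri=$1" \
        -d "prediction_feature_uri=$2" \
        -d "feature_dataset_uri=$feature_dataset_uri" \
        "$SERVICE/fs/rfe"
}
```

After sourcing the snippet, it could be invoked as `pc_then_rfe {dataset_uri} {prediction_feature_uri}`. The `-f` flag makes curl exit with a non-zero status on HTTP errors, so a failed first step stops the chain.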


Copyright (c) 2009-2011 Christoph Helma, Martin Guetlein, Micha Rautenberg, Andreas Maunz, David Vorgrimmler, Denis Gebele. See LICENSE for details.