lazar summary updated

author: Christoph Helma <helma@in-silico.ch> 2020-10-10 17:51:55 +0200
committer: Christoph Helma <helma@in-silico.ch> 2020-10-10 17:51:55 +0200
commit: d020e33ed153015cdf7288fecf1e99945736f750 (patch)
tree: c7bdcdb8495b3675537b0ad641989b3c5cc28b74
parent: 639eb4601666ccb25c5d3ac347c659837e1dbe43 (diff)
4 files changed, 21 insertions, 14 deletions
diff --git a/mutagenicity.md b/mutagenicity.md
index 9012ce5..5dbe124 100644
--- a/mutagenicity.md
+++ b/mutagenicity.md
@@ -474,10 +474,10 @@ Results
 10-fold crossvalidations
 ------------------------
 
-Crossvalidation results are summarized in the following tables: @tbl:lazar shows `lazar` results with MolPrint2D and PaDEL descriptors, @tbl:R summarizes R results and @tbl:tensorflow Tensorflow results.
+Crossvalidation results are summarized in the following tables: @tbl:lazar shows `lazar` results with MolPrint2D and PaDEL descriptors, @tbl:R R results and @tbl:tensorflow Tensorflow results.
 
 
-```{#tbl:lazar .table file="tables/lazar-summary.csv" caption="Summary of lazar crossvalidation results"}
+```{#tbl:lazar .table file="tables/lazar-summary.csv" caption="Summary of lazar crossvalidation results (all predictions/high confidence predictions"}
 ```
 
 ```{#tbl:R .table file="tables/r-summary.csv" caption="Summary of R crossvalidation results"}
@@ -488,6 +488,8 @@ Crossvalidation results are summarized in the following tables: @tbl:lazar shows
 
 @fig:roc depicts the position of all crossvalidation results in receiver operating characteristic (ROC) space.
 
+![ROC plot of crossvalidation results. *R-RF*: R Random Forests, *R-SVM*: R Support Vector Machines, *R-DL*: R Deep Learning, *TF*: Tensorflow without feature selection, *TF-FS*: Tensorflow with feature selection, *L*: lazar, *L-HC*: lazar high confidence predictions, *L-P*: lazar with PaDEL descriptors, *L-P-HC*: lazar PaDEL high confidence predictions (overlaps with L-P)](figures/roc.png){#fig:roc}
+
 Confusion matrices for all models are available from the git repository http://git.in-silico.ch/mutagenicity-paper/10-fold-crossvalidations/confusion-matrices/, individual predictions can be found in 
 http://git.in-silico.ch/mutagenicity-paper/10-fold-crossvalidations/predictions/.
 
@@ -505,8 +507,6 @@ The most accurate crossvalidation predictions have been obtained with `lazar` mo
 
 : Summary of crossvalidation results. *R-RF*: R Random Forests, *R-SVM*: R Support Vector Machines, *R-DL*: R Deep Learning, *TF*: Tensorflow without feature selection, *TF-FS*: Tensorflow with feature selection, *L*: lazar, *L-HC*: lazar high confidence predictions, *L-P*: lazar with PaDEL descriptors, *L-P-HC*: lazar PaDEL high confidence predictions, *PPV*: Positive predictive value (Precision), *NPV*: Negative predictive value {#tbl:summary}
 
-![ROC plot of crossvalidation results. *R-RF*: R Random Forests, *R-SVM*: R Support Vector Machines, *R-DL*: R Deep Learning, *TF*: Tensorflow without feature selection, *TF-FS*: Tensorflow with feature selection, *L*: lazar, *L-HC*: lazar high confidence predictions, *L-P*: lazar with PaDEL descriptors, *L-P-HC*: lazar PaDEL high confidence predictions (overlaps with L-P)](figures/roc.png){#fig:roc}
-
 R Models
 --------
 
diff --git a/mutagenicity.pdf b/mutagenicity.pdf
index b4c9e57..d0acdc7 100644
--- a/mutagenicity.pdf
+++ b/mutagenicity.pdf
diff --git a/scripts/summaries2table.rb b/scripts/summaries2table.rb
index f98ec54..a3ce67e 100755
--- a/scripts/summaries2table.rb
+++ b/scripts/summaries2table.rb
@@ -12,12 +12,19 @@ when "tensorflow"
   header = ["RF","LR (SGD)","LR (SCIKIT)","NN"]
   keys = ["lr","lr2","nn"].collect{|n| "tensorflow-"+n+".v3"}
 when "lazar"
-  header = ["lazar-mp2d (all)","lazar-mp2d (high confidence)", "lazar-padel (all)","lazar-padel (high confidence)"]
-  keys = ["lazar-all","lazar-high-confidence", "lazar-padel-all","lazar-padel-high-confidence"]
+  header = ["MP2D", "PaDEL"]
+  mp2dkeys = ["lazar-all","lazar-high-confidence"]
+  padelkeys = ["lazar-padel-all","lazar-padel-high-confidence"]
+  puts ","+header.join(",")
+  rows.each do |short,long|
+    print long+","
+    print mp2dkeys.collect{|k| data[k][short]}.join("/")+","
+    puts padelkeys.collect{|k| data[k][short]}.join("/")
+  end
+  exit
 end
 puts ","+header.join(",")
 rows.each do |short,long|
   print long+","
   puts keys.collect{|k| data[k][short]}.join(",")
 end
-exit
diff --git a/tables/lazar-summary.csv b/tables/lazar-summary.csv
index a034ae0..3a0840e 100644
--- a/tables/lazar-summary.csv
+++ b/tables/lazar-summary.csv
@@ -1,7 +1,7 @@
-,lazar-mp2d (all),lazar-mp2d (high confidence),lazar-padel (all),lazar-padel (high confidence)
-Accuracy,0.82,0.84,0.58,0.58
-True positive rate/Sensitivity,0.85,0.89,0.32,0.32
-True negative rate/Specificity,0.78,0.79,0.79,0.79
-Positive predictive value/Precision,0.8,0.83,0.56,0.56
-Negative predictive value,0.84,0.85,0.59,0.59
-Nr. predictions,7781,5890,4089,4081
+,MP2D,PaDEL
+Accuracy,0.82/0.84,0.58/0.58
+True positive rate/Sensitivity,0.85/0.89,0.32/0.32
+True negative rate/Specificity,0.78/0.79,0.79/0.79
+Positive predictive value/Precision,0.8/0.83,0.56/0.56
+Negative predictive value,0.84/0.85,0.59/0.59
+Nr. predictions,7781/5890,4089/4081
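
Note on the new `lazar` branch in `scripts/summaries2table.rb`: each cell now holds the "all predictions" value and the "high confidence" value joined with `/`, one column per descriptor set. The sketch below reproduces that layout in isolation. The short metric keys (`"acc"`, `"n"`), the `ROWS` and `DATA` constants, and the `lazar_summary_lines` helper are hypothetical stand-ins, since the diff does not show how the script fills `rows` and `data`. The numbers are copied from `tables/lazar-summary.csv`.

```ruby
# Hypothetical stand-ins for the `rows` and `data` variables that the script
# builds before its `case` statement (their construction is not in the diff).
ROWS = {
  "acc" => "Accuracy",
  "n"   => "Nr. predictions"
}

DATA = {
  "lazar-all"                   => { "acc" => 0.82, "n" => 7781 },
  "lazar-high-confidence"       => { "acc" => 0.84, "n" => 5890 },
  "lazar-padel-all"             => { "acc" => 0.58, "n" => 4089 },
  "lazar-padel-high-confidence" => { "acc" => 0.58, "n" => 4081 }
}

# Build the CSV lines: one column per descriptor set, each cell formatted
# as "<all predictions>/<high confidence predictions>".
def lazar_summary_lines(rows, data)
  header = ["MP2D", "PaDEL"]
  groups = [
    ["lazar-all", "lazar-high-confidence"],            # MP2D keys
    ["lazar-padel-all", "lazar-padel-high-confidence"] # PaDEL keys
  ]
  lines = ["," + header.join(",")]
  rows.each do |short, long|
    cells = groups.map { |keys| keys.map { |k| data[k][short] }.join("/") }
    lines << ([long] + cells).join(",")
  end
  lines
end

puts lazar_summary_lines(ROWS, DATA)
```

Returning the lines instead of printing inside the branch is a choice made for this sketch only; the committed script prints directly and then calls `exit` so that the generic table code after the `case` statement is skipped.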