author Christoph Helma <helma@in-silico.ch> 2020-10-17 21:33:07 +0200
committer Christoph Helma <helma@in-silico.ch> 2020-10-17 21:33:07 +0200
commit 4b6c7b29b9e59ad9d80f0428b9859b91c6b1d9f1
tree 884b32deff2a0e7e74f31aff13a1c34ac7650fbc
parent f1253d41778393228c78fbd86ddc94074255f445
typos fixed
-rw-r--r--  mutagenicity.md      12
-rw-r--r--  mutagenicity.pdf     bin 1833942 -> 1833299 bytes
-rwxr-xr-x  scripts/pa-table.rb  2
-rw-r--r--  tables/pa-tab.tex    2
4 files changed, 8 insertions, 8 deletions
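Several hunks in the diff below correct `{{...}}` placeholder names in mutagenicity.md (for example `{{R-DL}}` to `{{R-DL.acc}}` and `{{lazar.-high-confidence.acc}}` to `{{lazar-high-confidence.acc}}`). Typos of this kind could be caught with a check along the following lines. This is only a minimal sketch: the diff does not show how the repository resolves these placeholders, so the helper `undefined_placeholders`, the `values` lookup table and its dummy numbers are hypothetical and not part of the repository.

```ruby
# Hypothetical check, not code from the repository.
# Assumption: placeholders have the form {{key}} (syntax as seen in the diff)
# and are resolved against some lookup table of known keys.
PLACEHOLDER = /\{\{([^{}]+)\}\}/

# Return every placeholder key in `text` that has no entry in `values`.
def undefined_placeholders(text, values)
  text.scan(PLACEHOLDER).flatten.uniq.reject { |key| values.key?(key) }
end

# Dummy table: the keys are taken from the diff, the numbers are
# placeholders for illustration only (not results from the paper).
values = {
  "lazar-high-confidence.acc" => 0.0,
  "R-DL.acc" => 0.0,
  "R-RF.acc" => 0.0,
}

# Shortened stand-ins for the old and new wording in the diff.
old_line = "({{lazar.-high-confidence.acc}}) ranging from {{R-DL}} to {{R-RF}}"
new_line = "({{lazar-high-confidence.acc}}) ranging from {{R-DL.acc}} to {{R-RF.acc}}"

p undefined_placeholders(old_line, values)
p undefined_placeholders(new_line, values)
```

With this dummy table, the old wording reports the three misspelled keys and the corrected wording reports none.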
diff --git a/mutagenicity.md b/mutagenicity.md
index a102afa..9efecfd 100644
--- a/mutagenicity.md
+++ b/mutagenicity.md
@@ -497,7 +497,7 @@ Crossvalidation results are summarized in the following tables: @tbl:lazar shows
Confusion matrices for all models are available from the git repository http://git.in-silico.ch/mutagenicity-paper/10-fold-crossvalidations/confusion-matrices/, individual predictions can be found in
http://git.in-silico.ch/mutagenicity-paper/10-fold-crossvalidations/predictions/.
-The most accurate crossvalidation predictions have been obtained with standard `lazar` models using MolPrint2D descriptors ({{lazar-high-confidence.acc}} for predictions with high confidence, {{lazar-all.acc}} for all predictions). Models utilizing PaDEL descriptors have generally lower accuracies ranging from {{R-DL}} (R deep learning) to {{R-RF}} (R/Tensorflow random forests). Sensitivity and specificity is generally well balanced with the exception of `lazar`-PaDEL (low sensitivity) and R deep learning (low specificity) models.
+The most accurate crossvalidation predictions have been obtained with standard `lazar` models using MolPrint2D descriptors ({{lazar-high-confidence.acc}} for predictions with high confidence, {{lazar-all.acc}} for all predictions). Models utilizing PaDEL descriptors have generally lower accuracies ranging from {{R-DL.acc}} (R deep learning) to {{R-RF.acc}} (R/Tensorflow random forests). Sensitivity and specificity is generally well balanced with the exception of `lazar`-PaDEL (low sensitivity) and R deep learning (low specificity) models.
Pyrrolizidine alkaloid mutagenicity predictions
-----------------------------------------------
@@ -536,7 +536,7 @@ downloaded from
Model performance
-----------------
-@tbl:summary and @fig:roc show that the standard `lazar` algorithm (with MP2D
+@tbl:lazar, @tbl:R, @tbl:tensorflow and @fig:roc show that the standard `lazar` algorithm (with MP2D
fingerprints) give the most accurate crossvalidation results. R Random Forests,
Support Vector Machines and Tensorflow models have similar accuracies with
balanced sensitivity (true position rate) and specificity (true negative rate).
@@ -553,8 +553,8 @@ analysis of `lazar` lowest observed effect level predictions, which are also
similar to the experimental variability (@Helma2018).
The lowest number of predictions ({{lazar-padel-high-confidence.n}}) has been
-obtained from `lazar`/PaDEL high confidence predictions, the largest number of
-predictions comes from Tensorflow models ({{tensorflow-all.n}}). Standard
+obtained from `lazar`-PaDEL high confidence predictions, the largest number of
+predictions comes from Tensorflow models ({{tensorflow-rf.n}}). Standard
`lazar` give a slightly lower number of predictions ({{lazar-all.n}}) than R
and Tensorflow models. This is not necessarily a disadvantage, because `lazar`
abstains from predictions, if the query compound is very dissimilar from the
@@ -712,7 +712,7 @@ corresponding tertiary PAs. However, in the groups of modification of
the necine base, dehydropyrrolizidine, the toxic principle of PAs,
should have had the highest genotoxic potential. Taken together, the
predictions of the modifications of the necine base from the LAZAR, RF
-and R-generated DL model cannot -- in contrast to the Tensorflow DL
+and R-generated DL model cannot - in contrast to the Tensorflow DL
model - be considered as reliable.
Overall, when comparing the prediction results of the PAs to current
@@ -753,7 +753,7 @@ A new public *Salmonella* mutagenicity training dataset with 8309 compounds was
created and used it to train `lazar`, R and Tensorflow models with MolPrint2D
and PaDEL descriptors. The best performance was obtained with `lazar` models
using MolPrint2D descriptors, with prediction accuracies
-({{lazar.-high-confidence.acc}}) comparable to the interlaboratory variability
+({{lazar-high-confidence.acc}}) comparable to the interlaboratory variability
of the Ames test (80-85%). Models based on PaDEL descriptors had lower
accuracies than MolPrint2D models, but only the `lazar` algorithm could use
MolPrint2D descriptors.
diff --git a/mutagenicity.pdf b/mutagenicity.pdf
index a26a28e..7c04ef7 100644
--- a/mutagenicity.pdf
+++ b/mutagenicity.pdf
Binary files differ
diff --git a/scripts/pa-table.rb b/scripts/pa-table.rb
index af3ab21..032aaa1 100755
--- a/scripts/pa-table.rb
+++ b/scripts/pa-table.rb
@@ -8,7 +8,7 @@ puts '
\definecolor{grey}{rgb}{0.5,0.5,0.5}
\tiny
\begin{longtable}{rcccccc}
-\caption{Summary of pyrrolizidine alkaloid predictions: red: mutagen, green: non-mutagen, grey: no prediction, dark red|green: low confidence} \\\\
+\caption{Summary of pyrrolizidine alkaloid predictions: red: mutagen, green: non-mutagen, grey: no prediction, dark red/green: low confidence} \\\\
\label{tab:pa}
& & \multicolumn{2}{c}{lazar} & \multicolumn{3}{c}{R} \\\\
'
diff --git a/tables/pa-tab.tex b/tables/pa-tab.tex
index 15e02a7..22b5bfe 100644
--- a/tables/pa-tab.tex
+++ b/tables/pa-tab.tex
@@ -6,7 +6,7 @@
\definecolor{grey}{rgb}{0.5,0.5,0.5}
\tiny
\begin{longtable}{rcccccc}
-\caption{Summary of pyrrolizidine alkaloid predictions: red: mutagen, green: non-mutagen, grey: no prediction, dark red|green: low confidence} \\
+\caption{Summary of pyrrolizidine alkaloid predictions: red: mutagen, green: non-mutagen, grey: no prediction, dark red/green: low confidence} \\
\label{tab:pa}
& & \multicolumn{2}{c}{lazar} & \multicolumn{3}{c}{R} \\
ID & Measured & MP2D & PaDEL & DL & RF & SVM \\