From 4b6c7b29b9e59ad9d80f0428b9859b91c6b1d9f1 Mon Sep 17 00:00:00 2001
From: Christoph Helma
Date: Sat, 17 Oct 2020 21:33:07 +0200
Subject: typos fixed

---
 mutagenicity.md     |  12 ++++++------
 mutagenicity.pdf    | Bin 1833942 -> 1833299 bytes
 scripts/pa-table.rb |   2 +-
 tables/pa-tab.tex   |   2 +-
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/mutagenicity.md b/mutagenicity.md
index a102afa..9efecfd 100644
--- a/mutagenicity.md
+++ b/mutagenicity.md
@@ -497,7 +497,7 @@ Crossvalidation results are summarized in the following tables: @tbl:lazar shows

 Confusion matrices for all models are available from the git repository http://git.in-silico.ch/mutagenicity-paper/10-fold-crossvalidations/confusion-matrices/, individual predictions can be found in http://git.in-silico.ch/mutagenicity-paper/10-fold-crossvalidations/predictions/.

-The most accurate crossvalidation predictions have been obtained with standard `lazar` models using MolPrint2D descriptors ({{lazar-high-confidence.acc}} for predictions with high confidence, {{lazar-all.acc}} for all predictions). Models utilizing PaDEL descriptors have generally lower accuracies ranging from {{R-DL}} (R deep learning) to {{R-RF}} (R/Tensorflow random forests). Sensitivity and specificity is generally well balanced with the exception of `lazar`-PaDEL (low sensitivity) and R deep learning (low specificity) models.
+The most accurate crossvalidation predictions have been obtained with standard `lazar` models using MolPrint2D descriptors ({{lazar-high-confidence.acc}} for predictions with high confidence, {{lazar-all.acc}} for all predictions). Models utilizing PaDEL descriptors have generally lower accuracies ranging from {{R-DL.acc}} (R deep learning) to {{R-RF.acc}} (R/Tensorflow random forests). Sensitivity and specificity is generally well balanced with the exception of `lazar`-PaDEL (low sensitivity) and R deep learning (low specificity) models.
 Pyrrolizidine alkaloid mutagenicity predictions
 -----------------------------------------------
@@ -536,7 +536,7 @@ downloaded from
 Model performance
 -----------------

-@tbl:summary and @fig:roc show that the standard `lazar` algorithm (with MP2D
+@tbl:lazar, @tbl:R, @tbl:tensorflow and @fig:roc show that the standard `lazar` algorithm (with MP2D
 fingerprints) give the most accurate crossvalidation results. R Random Forests,
 Support Vector Machines and Tensorflow models have similar accuracies with
 balanced sensitivity (true position rate) and specificity (true negative rate).
@@ -553,8 +553,8 @@ analysis of `lazar` lowest observed effect level predictions, which are also
 similar to the experimental variability (@Helma2018).

 The lowest number of predictions ({{lazar-padel-high-confidence.n}}) has been
-obtained from `lazar`/PaDEL high confidence predictions, the largest number of
-predictions comes from Tensorflow models ({{tensorflow-all.n}}). Standard
+obtained from `lazar`-PaDEL high confidence predictions, the largest number of
+predictions comes from Tensorflow models ({{tensorflow-rf.n}}). Standard
 `lazar` give a slightly lower number of predictions ({{lazar-all.n}}) than R
 and Tensorflow models. This is not necessarily a disadvantage, because `lazar`
 abstains from predictions, if the query compound is very dissimilar from the
@@ -712,7 +712,7 @@ corresponding tertiary PAs. However, in the groups of modification of the
 necine base, dehydropyrrolizidine, the toxic principle of PAs, should
 have had the highest genotoxic potential. Taken together, the
 predictions of the modifications of the necine base from the LAZAR, RF
-and R-generated DL model cannot -- in contrast to the Tensorflow DL
+and R-generated DL model cannot - in contrast to the Tensorflow DL
 model - be considered as reliable.
 Overall, when comparing the prediction results of the PAs to current
@@ -753,7 +753,7 @@ A new public *Salmonella* mutagenicity training dataset with 8309 compounds was
 created and used it to train `lazar`, R and Tensorflow models with MolPrint2D
 and PaDEL descriptors. The best performance was obtained with `lazar` models
 using MolPrint2D descriptors, with prediction accuracies
-({{lazar.-high-confidence.acc}}) comparable to the interlaboratory variability
+({{lazar-high-confidence.acc}}) comparable to the interlaboratory variability
 of the Ames test (80-85%). Models based on PaDEL descriptors had lower
 accuracies than MolPrint2D models, but only the `lazar` algorithm could use
 MolPrint2D descriptors.
diff --git a/mutagenicity.pdf b/mutagenicity.pdf
index a26a28e..7c04ef7 100644
Binary files a/mutagenicity.pdf and b/mutagenicity.pdf differ
diff --git a/scripts/pa-table.rb b/scripts/pa-table.rb
index af3ab21..032aaa1 100755
--- a/scripts/pa-table.rb
+++ b/scripts/pa-table.rb
@@ -8,7 +8,7 @@ puts '
 \definecolor{grey}{rgb}{0.5,0.5,0.5}
 \tiny
 \begin{longtable}{rcccccc}
-\caption{Summary of pyrrolizidine alkaloid predictions: red: mutagen, green: non-mutagen, grey: no prediction, dark red|green: low confidence} \\\\
+\caption{Summary of pyrrolizidine alkaloid predictions: red: mutagen, green: non-mutagen, grey: no prediction, dark red/green: low confidence} \\\\
 \label{tab:pa}
 & & \multicolumn{2}{c}{lazar} & \multicolumn{3}{c}{R} \\\\
 '
diff --git a/tables/pa-tab.tex b/tables/pa-tab.tex
index 15e02a7..22b5bfe 100644
--- a/tables/pa-tab.tex
+++ b/tables/pa-tab.tex
@@ -6,7 +6,7 @@
 \definecolor{grey}{rgb}{0.5,0.5,0.5}
 \tiny
 \begin{longtable}{rcccccc}
-\caption{Summary of pyrrolizidine alkaloid predictions: red: mutagen, green: non-mutagen, grey: no prediction, dark red|green: low confidence} \\
+\caption{Summary of pyrrolizidine alkaloid predictions: red: mutagen, green: non-mutagen, grey: no prediction, dark red/green: low confidence} \\
 \label{tab:pa}
 & &
\multicolumn{2}{c}{lazar} & \multicolumn{3}{c}{R} \\
 ID & Measured & MP2D & PaDEL & DL & RF & SVM \\
--
cgit v1.2.3