author | Christoph Helma <helma@in-silico.ch> | 2020-10-16 18:28:18 +0200 |
---|---|---|
committer | Christoph Helma <helma@in-silico.ch> | 2020-10-16 18:28:18 +0200 |
commit | e288019a0f3eb691723944d6e47838d52cfdc21a (patch) | |
tree | 15df8933515f8d414971c0c9454dc37b88c3b739 /mutagenicity.md | |
parent | e3a32112611f263104c767fae8c6e1f2b95d505f (diff) |
pa prediction table integrated
Diffstat (limited to 'mutagenicity.md')
-rw-r--r-- | mutagenicity.md | 125 |
1 files changed, 7 insertions, 118 deletions
diff --git a/mutagenicity.md b/mutagenicity.md
index b15ea54..4ac5a32 100644
--- a/mutagenicity.md
+++ b/mutagenicity.md
@@ -35,6 +35,7 @@ header-includes:
   - \usepackage{setspace}
   - \doublespacing
   - \usepackage{lineno}
+  - \usepackage{color, colortbl, longtable}
   - \linenumbers
 ...

@@ -481,7 +482,7 @@ Results

 Crossvalidation results are summarized in the following tables: @tbl:lazar shows `lazar` results with MolPrint2D and PaDEL descriptors, @tbl:R R results and @tbl:tensorflow Tensorflow results.

-```{#tbl:lazar .table file="tables/lazar-summary.csv" caption="Summary of lazar crossvalidation results (all predictions/high confidence predictions"}
+```{#tbl:lazar .table file="tables/lazar-summary.csv" caption="Summary of lazar crossvalidation results (all/high confidence predictions"}
 ```

 ```{#tbl:R .table file="tables/r-summary.csv" caption="Summary of R crossvalidation results"}
@@ -499,133 +500,21 @@ http://git.in-silico.ch/mutagenicity-paper/10-fold-crossvalidations/predictions/

 The most accurate crossvalidation predictions have been obtained with `lazar` models with MolPrint2D descriptors ({{lazar-high-confidence.acc}} for predictions with high confidence, {{lazar-all.acc}} for all predictions). Models utilizing PaDEL descriptors have generally lower accuracies ranging from TODO to TODO. Sensitivity and specificity is generally well balanced with the exception of `lazar`-PaDEL (low sensitivity) and R deep learning (low specificity) models.

-<!--
-| |R-RF | R-SVM | R-DL | TF | TF-FS | L | L-HC | L-P | L-P-HC|
-|-|-----|-------|------|----|-------|---|------|------|--------|
-|Accuracy|{{R-RF.acc}}|{{R-SVM.acc}}|{{R-DL.acc}}|{{tensorflow-all.acc}}|{{tensorflow-selected.acc}}|{{lazar-all.acc}}|{{lazar-high-confidence.acc}}|{{lazar-padel-all.acc}}|{{lazar-padel-high-confidence.acc}}|
-|Sensitivity|{{R-RF.tpr}}|{{R-SVM.tpr}}|{{R-DL.tpr}}|{{tensorflow-all.tpr}}|{{tensorflow-selected.tpr}}|{{lazar-all.tpr}}|{{lazar-high-confidence.tpr}}|{{lazar-padel-all.tpr}}|{{lazar-padel-high-confidence.tpr}}|
-|Specificity|{{R-RF.tnr}}|{{R-SVM.tnr}}|{{R-DL.tnr}}|{{tensorflow-all.tnr}}|{{tensorflow-selected.tnr}}|{{lazar-all.tnr}}|{{lazar-high-confidence.tnr}}|{{lazar-padel-all.tnr}}|{{lazar-padel-high-confidence.tnr}}|
-|PPV|{{R-RF.ppv}}|{{R-SVM.ppv}}|{{R-DL.ppv}}|{{tensorflow-all.ppv}}|{{tensorflow-selected.ppv}}|{{lazar-all.ppv}}|{{lazar-high-confidence.ppv}}|{{lazar-padel-all.ppv}}|{{lazar-padel-high-confidence.ppv}}|
-|NPV|{{R-RF.npv}}|{{R-SVM.npv}}|{{R-DL.npv}}|{{tensorflow-all.npv}}|{{tensorflow-selected.npv}}|{{lazar-all.npv}}|{{lazar-high-confidence.npv}}|{{lazar-padel-all.npv}}|{{lazar-padel-high-confidence.npv}}|
-|Nr. predictions|{{R-RF.n}}|{{R-SVM.n}}|{{R-DL.n}}|{{tensorflow-all.n}}|{{tensorflow-selected.n}}|{{lazar-all.n}}|{{lazar-high-confidence.n}}|{{lazar-padel-all.n}}|{{lazar-padel-high-confidence.n}}|
-
-: Summary of crossvalidation results. *R-RF*: R Random Forests, *R-SVM*: R Support Vector Machines, *R-DL*: R Deep Learning, *TF*: Tensorflow without feature selection, *TF-FS*: Tensorflow with feature selection, *L*: lazar, *L-HC*: lazar high confidence predictions, *L-P*: lazar with PaDEL descriptors, *L-P-HC*: lazar PaDEL high confidence predictions, *PPV*: Positive predictive value (Precision), *NPV*: Negative predictive value {#tbl:summary}
-
-R Models
---------
-
-### Random Forest
-
-10-fold crossvalidation of the R-RF model gave an accuracy of
-{{R-RF.acc_perc}}%, a sensitivity of {{R-RF.tpr_perc}}% and a specificity of
-{{R-RF.tnr_perc}}%. The confusion matrix for {{R-RF.n}}
-predictions is provided in @tbl:R-RF.
-
-```{#tbl:R-RF .table file="tables/R-RF.csv" caption="Confusion matrix for R Random Forest predictions"}
-```
-
-### Support Vector Machines
-
-10-fold crossvalidation of the R-SVM model gave an accuracy of
-{{R-SVM.acc_perc}}%, a sensitivity of {{R-SVM.tpr_perc}}% and a specificity of
-{{R-SVM.tnr_perc}}%. The confusion matrix for {{R-SVM.n}}
-predictions is provided in @tbl:R-SVM.
-
-```{#tbl:R-SVM .table file="tables/R-SVM.csv" caption="Confusion matrix for R Support Vector Machine predictions"}
-```
-
-### Deep Learning
-
-10-fold crossvalidation of the R-DL model gave an accuracy of
-{{R-DL.acc_perc}}%, a sensitivity of {{R-DL.tpr_perc}}% and a specificity of
-{{R-DL.tnr_perc}}%. The confusion matrix for {{R-DL.n}}
-predictions is provided in @tbl:R-DL.
-
-```{#tbl:R-DL .table file="tables/R-DL.csv" caption="Confusion matrix for R Deep Learning predictions"}
-```
-
-Tensorflow Models
------------------
-
-### Without feature selection
-
-10-fold crossvalidation of the Tensorflow DL model gave an accuracy of
-{{tensorflow-all.acc_perc}}%, a sensitivity of {{tensorflow-all.tpr_perc}}% and a specificity of
-{{tensorflow-all.tnr_perc}}%. The confusion matrix for {{tensorflow-all.n}}
-predictions is provided in @tbl:tensorflow-all.
-
-```{#tbl:tensorflow-all .table file="tables/tensorflow-all.csv" caption="Confusion matrix for Tensorflow predictions without feature selecetion"}
-```
-
-### With feature selection
-
-10-fold crossvalidation of the Tensorflow model with feature selection gave an accuracy of
-{{tensorflow-selected.acc_perc}}%, a sensitivity of {{tensorflow-selected.tpr_perc}}% and a specificity of
-{{tensorflow-selected.tnr_perc}}%. The confusion matrix for {{tensorflow-selected.n}}
-predictions is provided in @tbl:tensorflow-selected.
-
-```{#tbl:tensorflow-selected .table file="tables/tensorflow-selected.csv" caption="Confusion matrix for Tensorflow predictions with feature selecetion"}
-```
-
-`lazar` Models
---------------
-
-### MolPrint2D Descriptors
-
-10-fold crossvalidation of the lazar model with MolPrint2D descriptors gave an accuracy of
-{{lazar-all.acc_perc}}%, a sensitivity of {{lazar-all.tpr_perc}}% and a specificity of
-{{lazar-all.tnr_perc}}%.
-The confusion matrix for {{lazar-all.n}}
-predictions is provided in @tbl:lazar-all.
-
-```{#tbl:lazar-all .table file="tables/lazar-all.csv" caption="Confusion matrix for lazar predictions with MolPrint2D descriptors"}
-```
-
-Predictions with high confidence had an accuracy of
-{{lazar-high-confidence.acc_perc}}%, a sensitivity of {{lazar-high-confidence.tpr_perc}}% and a specificity of
-{{lazar-high-confidence.tnr_perc}}%.
-The confusion matrix for {{lazar-high-confidence.n}}
-predictions is provided in @tbl:lazar-high-confidence.
-
-
-```{#tbl:lazar-high-confidence .table file="tables/lazar-high-confidence.csv" caption="Confusion matrix for high confidence lazar predictions with MolPrint2D descriptors"}
-```
-
-### PaDEL Descriptors
-
-10-fold crossvalidation of the lazar model with PaDEL descriptors gave an accuracy of
-{{lazar-all.acc_perc}}%, a sensitivity of {{lazar-all.tpr_perc}}% and a specificity of
-{{lazar-all.tnr_perc}}%.
-The confusion matrix for {{lazar-all.n}}
-predictions is provided in @tbl:lazar-padel-all.
-
-```{#tbl:lazar-padel-all .table file="tables/lazar-padel-all.csv" caption="Confusion matrix for lazar predictions with PaDEL descriptors" }
-```
-
-Predictions with high confidence had an accuracy of
-{{lazar-high-confidence.acc_perc}}%, a sensitivity of {{lazar-high-confidence.tpr_perc}}% and a specificity of
-{{lazar-high-confidence.tnr_perc}}%.
-The confusion matrix for {{lazar-high-confidence.n}}
-predictions is provided in @tbl:lazar-padel-high-confidence.
-
-```{#tbl:lazar-padel-high-confidence .table file="tables/lazar-padel-high-confidence.csv" caption="Confusion matrix for high confidence lazar predictions with PaDEL descriptors"}
-```
--->

 Pyrrolizidine alkaloid mutagenicity predictions
 -----------------------------------------------

-Pyrrolizidine alkaloid mutagenicity predictions are summarized in Table @tab:pa.
+Pyrrolizidine alkaloid mutagenicity predictions are summarized in @tab:pa.

 @fig:tsne-mp2d shows the position of pyrrolizidine alkaloids (PA) in the mutagenicity training dataset in MP2D space

-![t-sne visualisation of mutagenicty training data and pyrrolizidine alkaloids (PA)](figures/tsne-mp2d.png){#fig:tsne-mp2d}
+\input{tables/pa-tab.tex}

-@fig:tsne-padel shows the position of pyrrolizidine alkaloids (PA) in the mutagenicity training dataset in PADEL space
+![t-sne visualisation of mutagenicity training data and pyrrolizidine alkaloids (PA)](figures/tsne-mp2d.png){#fig:tsne-mp2d}

-![t-sne visualisation of mutagenicty training data and pyrrolizidine alkaloids (PA)](figures/tsne-padel.png){#fig:tsne-padel}
+@fig:tsne-padel shows the position of pyrrolizidine alkaloids (PA) in the mutagenicity training dataset in PADEL space

-\input{pa-tab.tex}
+![t-sne visualisation of mutagenicity training data and pyrrolizidine alkaloids (PA)](figures/tsne-padel.png){#fig:tsne-padel}

 Discussion
 ==========
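
Reviewer note: the summary table removed in this commit reports accuracy, sensitivity (`tpr`), specificity (`tnr`), PPV, NPV and the number of predictions for each model. The sketch below shows how such values are conventionally derived from a binary confusion matrix. It is only an illustration: the `ConfusionMatrix` class, the `summary_metrics` function and the example counts are hypothetical and are not taken from the repository or from the paper's results. The dictionary keys simply mirror the placeholder suffixes (`acc`, `tpr`, `tnr`, `ppv`, `npv`, `n`) visible in the diff.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion matrix counts (hypothetical container, not from the repo)."""
    tp: int  # mutagens predicted as mutagenic
    fp: int  # non-mutagens predicted as mutagenic
    tn: int  # non-mutagens predicted as non-mutagenic
    fn: int  # mutagens predicted as non-mutagenic


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def summary_metrics(cm: ConfusionMatrix) -> Dict[str, float]:
    """Compute the metrics named in the removed summary table."""
    n = cm.tp + cm.fp + cm.tn + cm.fn
    return {
        "acc": _ratio(cm.tp + cm.tn, n),      # accuracy
        "tpr": _ratio(cm.tp, cm.tp + cm.fn),  # sensitivity
        "tnr": _ratio(cm.tn, cm.tn + cm.fp),  # specificity
        "ppv": _ratio(cm.tp, cm.tp + cm.fp),  # positive predictive value (precision)
        "npv": _ratio(cm.tn, cm.tn + cm.fn),  # negative predictive value
        "n": n,                               # number of predictions
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    example = ConfusionMatrix(tp=40, fp=10, tn=30, fn=20)
    for name, value in summary_metrics(example).items():
        print(f"{name}: {value}")
```

Returning NaN for an empty denominator is a design choice of this sketch, so that a model with no positive predictions does not raise an exception; the paper's own tooling may handle that case differently.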