author: Christoph Helma <helma@in-silico.ch> 2018-01-10 12:02:30 +0100
committer: Christoph Helma <helma@in-silico.ch> 2018-01-10 12:02:30 +0100
commit: 3800005bafa367ba3f09365459f7ec426483becf
tree: a9610a4bacbce27ab113ebd958a1f05b56f7b92c /loael.Rmd
parent: 160d9d4489a4d9ebed72db46c5bbe94f1d8131bc
remaining manuscript fixes
Diffstat (limited to 'loael.Rmd')
-rw-r--r-- loael.Rmd 18
1 files changed, 9 insertions, 9 deletions
diff --git a/loael.Rmd b/loael.Rmd
index 47ff2c6..3886c50 100644
--- a/loael.Rmd
+++ b/loael.Rmd
@@ -106,13 +106,13 @@ models in the context of experimental variability.
An important limitation often raised for computational toxicology is the lack
of transparency on published models and consequently on the difficulty for the
scientific community to reproduce and apply them. To overcome these issues,
-source code for all programs and libraries and the databases that have been used to generate this
-manuscript are made available under GPL3 licenses. Databases and compiled
+source code for all programs and libraries and the data that have been used to generate this
+manuscript are made available under GPL3 licenses. Data and compiled
programs with all dependencies for the reproduction of results in this manuscript are available as
a self-contained docker image. All data, tables and figures in this manuscript
was generated directly from experimental results using the `R` package `knitR`.
-A single command repeats all experiments (possibly with different settings) and
-updates the manuscript with the new results.
+<!-- A single command repeats all experiments (possibly with different settings) and
+updates the manuscript with the new results. -->
<!--
overcome these issues, all databases and programs that have been used to
@@ -462,7 +462,7 @@ c.mg$sd <- ave(c.mg$LOAEL,c.mg$SMILES,FUN=sd)
```
Both databases contain substances with multiple measurements, which allow the determination of experimental variabilities.
-For this purpose we have calculated the mean standard deviation of compounds with multiple measurements, which is roughly a factor of 2 for both databases.
+For this purpose we have calculated the mean standard deviation of compounds with multiple measurements. Mean standard deviations and thus experimental variabilities are similar for both databases.
The Nestlé database has `r length(m$SMILES)` LOAEL values for
`r length(levels(m$SMILES))` unique structures, `r m.dupnr` compounds have
@@ -493,10 +493,6 @@ The combined test set has a mean standard deviation (-log10 transformed values)
In order to compare the correlation of LOAEL values in both databases and to establish a reference for predicted values, we have investigated compounds, that occur in both databases.
-[@fig:comp] shows the experimental LOAEL variability of compounds occurring in
-both datasets (i.e. the *test* dataset) colored in blue (experimental). This is
-the baseline reference for the comparison with predicted values.
-
```{r echo=F}
data <- read.csv("data/median-correlation.csv",header=T)
cor <- cor.test(data$mazzatorta,data$swiss)
@@ -513,6 +509,10 @@ experimental variability. Correlation analysis shows a significant (p-value < 2
correlation between the experimental data in both databases with r\^2:
`r round(median.r.square,2)`, RMSE: `r round(median.rmse,2)`
+[@fig:comp] shows the experimental LOAEL variability of compounds occurring in
+both datasets (i.e. the *test* dataset) colored in blue (experimental). This is
+the baseline reference for the comparison with predicted values.
+
![Correlation of median LOAEL values from Nestlé and FSVO databases. Data with
identical values in both databases was removed from
analysis.](figures/median-correlation.pdf){#fig:datacorr}