path: root/mutagenicity.md
Diffstat (limited to 'mutagenicity.md')
-rw-r--r-- mutagenicity.md 88
1 files changed, 61 insertions, 27 deletions
diff --git a/mutagenicity.md b/mutagenicity.md
index 9c5f427..c80bdf1 100644
--- a/mutagenicity.md
+++ b/mutagenicity.md
@@ -10,6 +10,8 @@ author:
institute: insel
- Jürgen Drewe:
institute: zeller, unibas
+ email: juergendrewe@zellerag.ch
+ correspondence: "yes"
- Philipp Boss:
institute: sysbio
@@ -33,11 +35,12 @@ institute:
bibliography: bibliography.bib
keywords: mutagenicity, QSAR, lazar, random forest, support vector machine, linear regression, neural nets, deep learning, pyrrolizidine alkaloids, OpenBabel, CDK
+#documentclass: frontiersHLTH
documentclass: scrartcl
tblPrefix: Table
figPrefix: Figure
header-includes:
- - \usepackage{lineno, setspace, color, colortbl, longtable}
+ - \usepackage{lineno, color, setspace}
- \doublespacing
- \linenumbers
...
@@ -69,13 +72,14 @@ Computer based (*in silico*) mutagenicity predictions can be used in the early
screening of novel compounds (e.g. drug candidates), but they are also gaining
regulatory acceptance e.g. for the registration of industrial chemicals within
REACH (@ECHA2017) or the assessment of impurities in pharmaceuticals (ICH M7
-guideline, @ICH2017).
+guideline, Harmonisation of Technical Requirements for Pharmaceuticals for
+Human Use @ICH2017).
-*Salmonella* mutagenicity is at the moment the toxicological endpoint with the
+Currently, *Salmonella* mutagenicity is the toxicological endpoint with the
largest amount of public data for almost 10000 structures, whereas datasets for
other endpoints contain typically only a few hundred compounds. The Ames test
itself is relatively reproducible with an interlaboratory variability of 80-85%
-(@Benigni1988).
+(@Piegorsch1991).
This makes the development of mutagenicity models also interesting from a
computational chemistry and machine learning point of view. The relatively
@@ -148,8 +152,8 @@ under a GPL3 License. The new combined dataset can be found at
The pyrrolizidine alkaloid dataset was created from five independent, necine
base substructure searches in PubChem (https://pubchem.ncbi.nlm.nih.gov/) and
compared to the PAs listed in the EFSA publication @EFSA2011 and the book by
-Mattocks @Mattocks1986, to ensure, that all major PAs were included. PAs
-mentioned in these publications which were not found in the downloaded
+@Mattocks1986 to ensure that all major PAs were included. PAs
+mentioned in these publications that were not found in the downloaded
substances were searched individually in PubChem and, if available, downloaded
separately. Non-PA substances, duplicates, and isomers were removed from the
files, but artificial PAs, even if unlikely to occur in nature, were kept. The
@@ -193,7 +197,7 @@ In contrast to predefined lists of fragments (e.g. FP3, FP4 or MACCs
fingerprints) or descriptors (e.g. CDK) they are generated dynamically from
chemical structures. This has the advantage that they can capture unknown
substructures of toxicological relevance that are not included in other
-descriptors. In addition they allow the efficient calculation of chemical
+descriptors. In addition, they allow the efficient calculation of chemical
similarities (e.g. Tanimoto indices) with simple set operations.
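The set-based similarity calculation mentioned above can be sketched as follows. This is a minimal illustration that assumes each compound is represented as a set of fingerprint feature strings; the function name `tanimoto` and the feature strings are hypothetical placeholders, not the `lazar` implementation.

```python
def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) index of two feature sets: |A & B| / |A | B|."""
    a, b = set(features_a), set(features_b)
    union = a | b
    if not union:
        # Two empty fingerprints share no information; return 0.0 by convention.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical feature identifiers standing in for real fingerprint features.
query = {"f1", "f2", "f3"}
neighbour = {"f2", "f3", "f4"}
similarity = tanimoto(query, neighbour)
```

Because only set intersection and union are needed, the cost of one comparison is linear in the number of features per compound.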
MolPrint2D fingerprints were calculated with the OpenBabel cheminformatics
@@ -297,7 +301,7 @@ absence of closely related neighbours, we follow a tiered approach:
flagged with a warning that it might be out of the applicability domain of
the training data (*low confidence*).
-- These Similarity thresholds are the default values chosen
+- These similarity thresholds are the default values chosen
by software developers and remained unchanged during the
course of these experiments.
@@ -368,13 +372,13 @@ to a uniform distribution. MP2D features were not preprocessed.
#### Random forests (*RF*)
For the random forest classifier we used the parameters
-n_estimators=1000and max_leaf_nodes=200. For the other parameters we
+n_estimators=1000 and max_leaf_nodes=200. For the other parameters we
used the scikit-learn default values.
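A minimal sketch of a classifier with the two stated parameters, assuming the scikit-learn `RandomForestClassifier` class; the helper name `build_random_forest` and the `random_state` argument are illustrative additions and are not taken from the paper.

```python
from sklearn.ensemble import RandomForestClassifier


def build_random_forest(random_state=None):
    """Random forest with the two non-default parameters named in the text.

    random_state is an assumption added here for reproducibility only.
    """
    return RandomForestClassifier(
        n_estimators=1000,    # number of trees
        max_leaf_nodes=200,   # upper bound on leaves per tree
        random_state=random_state,
    )


clf = build_random_forest(random_state=0)
```

The returned estimator would then be trained with `clf.fit(X, y)` on a descriptor matrix `X` and binary mutagenicity labels `y`.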
#### Logistic regression (SGD) (*LR-sgd*)
For the logistic regression we used an ensemble of five trained models.
-For each model we used a batch size of 64 and trained for 50 epoch. As
+For each model we used a batch size of 64 and trained for 50 epochs. As
an optimizer ADAM was chosen. For the other parameters we used the
tensorflow default values.
@@ -386,7 +390,7 @@ default values.
#### Neural Nets (*NN*)
For the neural network we used an ensemble of five trained models. For
-each model we used a batch size of 64 and trained for 50 epoch. As an
+each model we used a batch size of 64 and trained for 50 epochs. As an
optimizer ADAM was chosen. The neural network had 4 hidden layers with
64 nodes each and a ReLU activation function. For the other parameters
we used the tensorflow default values.
@@ -467,7 +471,7 @@ https://git.in-silico.ch/mutagenicity-paper/tree/crossvalidations/predictions/.
All investigated algorithm/descriptor combinations
give accuracies between 80 and 85%, which is equivalent to the experimental
variability of the *Salmonella typhimurium* mutagenicity bioassay (80-85%,
-@Benigni1988). Sensitivities and specificities are balanced in all of
+@Piegorsch1991). Sensitivities and specificities are balanced in all of
these models.
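The accuracy, sensitivity and specificity figures discussed here can be derived from a confusion matrix, as sketched below. The sketch assumes binary labels with 1 for mutagenic and 0 for non-mutagenic; the function name `classification_summary` and the example labels are hypothetical and are not results from this study.

```python
def classification_summary(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = mutagenic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Made-up labels for illustration only.
measured = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
summary = classification_summary(measured, predicted)
```

A model is balanced in the sense used here when sensitivity and specificity are close to each other.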
Pyrrolizidine alkaloid mutagenicity predictions
@@ -638,16 +642,16 @@ frequently *local models*, because models are generated specifically for each
query compound. The investigated tensorflow models are in contrast *global
models*, i.e. a single model is used to make predictions for all compounds. It
has been postulated in the past that local models are more accurate, because
-they can account better for mechanisms, that affect only a subset of the
+they can account better for mechanisms that affect only a subset of the
training data.
@tbl:cv-mp2d, @tbl:cv-cdk and @fig:roc show that the crossvalidation accuracies
of all models are comparable to the experimental variability of the *Salmonella
-typhimurium* mutagenicity bioassay (80-85% according to @Benigni1988). All of
-these models have balanced sensitivity (true position rate) and specificity
+typhimurium* mutagenicity bioassay (80-85% according to @Piegorsch1991). All of
+these models have balanced sensitivity (true positive rate) and specificity
(true negative rate) and provide highly significant concordance with
experimental data (as determined by McNemar's Test). This is a clear indication
-that *in-silico* predictions can be as reliable as the bioassays. Given that
+that *in silico* predictions can be as reliable as the bioassays. Given that
the variability of experimental data is similar to model variability it is
impossible to decide which model gives the most accurate predictions, as models
with higher accuracies might just approximate experimental errors better than
@@ -663,11 +667,16 @@ depend more on practical considerations than on intrinsic properties. Nearest
neighbor algorithms like `lazar` have the practical advantage that the
rationales for individual predictions can be presented in a straightforward
manner that is understandable without a background in statistics or machine
-learning (@fig:lazar). This allows a critical examination of individual
-predictions and prevents blind trust in models that are intransparent to users
-with a toxicological background.
+learning (a screenshot of the mutagenicity prediction for
+12,21-Dihydroxy-4-methyl-4,8-secosenecinonan-8,11,16-trione can be found at
+https://git.in-silico.ch/mutagenicity-paper/tree/figures/lazar-screenshot.png).
+This allows a critical examination of individual predictions and prevents blind
+trust in models that are opaque to users with a toxicological
+background.
-![Lazar screenshot of 12,21-Dihydroxy-4-methyl-4,8-secosenecinonan-8,11,16-trione mutagenicity prediction](figures/lazar-screenshot.png){#fig:lazar}
+<!--
+![`lazar` screenshot of 12,21-Dihydroxy-4-methyl-4,8-secosenecinonan-8,11,16-trione mutagenicity prediction](figures/lazar-screenshot.png){#fig:lazar}
+-->
Descriptors
-----------
@@ -776,27 +785,30 @@ retronecine-type (@Li2013). 
### Modifications of necine base
The group-specific results reflect the expected relationship between the
-groups: the low mutagenic potential of N-oxides and the high potential of
-Dehydropyrrolizidines (DHP) (@Chen2010). 
+groups: the low mutagenic potential of *N*-oxides and the high potential of
+dehydropyrrolizidines (DHP) (@Chen2010). 
+However, *N*-oxides may be converted back *in vivo* to their toxic/tumorigenic parent PA (@Yan2008); on the other hand, they are highly water soluble and generally considered detoxification products that are quickly eliminated renally *in vivo* (@Chen2010).
-Dehydropyrrolizidines are regarded as the toxic principle in the metabolism of
-PAs, and known to produce protein- and DNA-adducts (@Chen2010). None of the
-models did not meet this expectation and predicted the majority of DHP as
+DHP are regarded as the toxic principle in the metabolism of
+PAs, and are known to produce protein- and DNA-adducts (@Chen2010). None of our investigated
+models met this expectation, and all of them predicted the majority of DHP as
non-mutagenic. However, the following issues need to be considered. On the one
-hand, all DHP were outside of the stricter applicability domain of MP2D lazar.
+hand, all DHP were outside of the stricter applicability domain of MP2D `lazar`.
This indicates that they are structurally very different from the training data
and might be out of the applicability domain of all models based on this
training set. In addition, DHP has two double bonds in its necine
base, making it highly reactive. DHP and other comparable molecules have a very
-short lifespan, and usually cannot be used in *in vitro* experiments.
+short lifespan *in vivo*, and usually cannot be used in *in vitro* experiments.
<!--
Furthermore, the probabilities for this substance groups needs to be considered, and not only the consolidated prediction. In the LAZAR model, all DHPs had probabilities for both outcomes (genotoxic and not genotoxic) mainly below 30%. Additionally, the probabilities for both outcomes were close together, often within 10% of each other. The fact that for both outcomes, the probabilities were low and close together, indicates a lower confidence in the prediction of the model for DHPs. 
-->
+<!--
PA N-oxides are easily conjugated for extraction, they are generally considered
as detoxification products, which are *in vivo* quickly renally eliminated
(@Chen2010).
+-->
Overall the low number of positive mutagenicity predictions was unexpected.
PAs are generally considered to be genotoxic, and the mode of action is also known.
@@ -812,6 +824,26 @@ appeared to have a low sensitivity. The pre-incubation phase for metabolic
activation of PAs by microsomal enzymes was the sensitivity-limiting step. This
could very well mean that the low sensitivity of the Ames test for PAs is also reflected in the investigated models.
+An *in vitro* screen of cellular PA effects (metabolic activation and mutagenic
+effects) in human and rodent hepatocytes (HepG2 and H-4-II-E) showed that
+results may also critically depend on the cellular model and cell culture
+conditions and may underestimate the effects of PAs (@Forsch2018).
+
+In summary, we found marked differences in the predicted genotoxic potential
+between the PA groups: the otonecines and macrocyclic diesters appeared most
+toxic, the platynecines and the mono- and diesters least toxic. These
+results are comparable with *in vitro* measurements in hepatic HepaRG cells
+(@Louisse2019), where relative potencies (RP) were determined: for otonecines
+and cyclic diesters RP = 1, for open diesters RP = 0.1 and for monoesters RP =
+0.01.
+
+Due to a lack of
+differential data, European authorities based their risk assessment in a
+worst-case approach on lasiocarpine, for which sufficient data on genotoxicity
+and carcinogenicity were available (@HMPC2014, @EMA2020). Our data further support a tiered risk assessment
+based on *in silico* and experimental data on the relative potency of
+individual PAs as already suggested by other authors (@Merz2016, @Rutz2020, @Louisse2019). 
+
<!--
non-conflicting CIDs
43040
@@ -894,6 +926,8 @@ in this context, because they present rationales for predictions (similar
compounds with experimental data) which can be accepted or rejected by
toxicologists and provide validated applicability domain estimations.
+Our data show that large differences in genotoxic potential exist between different pyrrolizidine subgroups. These results may allow risk assessments of pyrrolizidine contamination to be adjusted.
+
<!---
in a form that is understandable and criticiseable by toxicologists without a machine learning background.