
Prediction of the mutagenic potential of different pyrrolizidine
alkaloids using LAZAR, Random Forest, Support Vector Machines, and Deep
Learning

Authors

Verena Schöning, Christoph Helma, Philipp Boss, Jürgen Drewe

**Manuscript in preparation.**

Corresponding author:

Prof. Dr. Jürgen Drewe, MSc

Abstract
========

Pyrrolizidine alkaloids (PAs) are secondary plant metabolites of some
plant families, which protect against predators and are generally
considered genotoxic and mutagenic. This mutagenicity is also the point
of concern in the regulatory risk assessment of this substance group
([EFSA 2011](#_ENREF_36); [EMA 2014](#_ENREF_38),
[2016](#_ENREF_39)). Several investigations have already shown that the
mutagenic potential of PAs differs and largely depends on the structure.

Since only very few of over 600 known PAs are available for *in vitro*
or *in vivo* experiments, the mutagenicity of PAs in this study was
estimated using four different machine learning techniques: LAZAR,
Random Forest, Support Vector Machines, and Deep Learning. However, none
of the models was optimal for predicting the genotoxic potential of PAs,
either due to problems with the applicability domain or due to low
performance. Therefore, no estimation regarding the genotoxic potential
of single PAs could be made. An analysis of the genotoxic potential of
different structural groups showed promising results. For necine base
and necic acid, the results fitted well with literature for three
models. However, the prediction for dehydropyrrolizidine, the toxic
principle of PAs, was within expectation in only one model (the
TensorFlow-generated Deep Learning model), but not in the other four
models. This study shows the need to critically review and assess the
predictions obtained from machine learning approaches, not only by
internal cross-validation but also by external validation through
comparison with literature.

Introduction
============

Pyrrolizidine alkaloids (PAs) are secondary plant ingredients found in
many plant species as protection against predators ([Hartmann & Witte
1995](#_ENREF_59); [Langel et al. 2011](#_ENREF_76)). PAs are ester
alkaloids, which are composed of a necine base (two fused five-membered
rings joined by a nitrogen atom) and one or two necic acids (carboxylic
ester arms). The necine base can have different structures and thereby
divides PAs into several structural groups, e.g. otonecine, platynecine,
and retronecine. The structural groups of the necic acid are macrocyclic
diester, open-ring diester and monoester ([Langel et al.
2011](#_ENREF_76)).

PAs are mainly metabolised in the liver, which is at the same time the
main target organ of toxicity ([Bull & Dick 1959](#_ENREF_17); [Bull et
al. 1958](#_ENREF_18); [Butler et al. 1970](#_ENREF_20); [DeLeve et al.
1996](#_ENREF_33); [Jago 1971](#_ENREF_65); [Li et al.
2011](#_ENREF_78); [Neumann et al. 2015](#_ENREF_99)). There are three
principal metabolic pathways for 1,2-unsaturated PAs ([Chen et al.
2010](#_ENREF_26)): (i) Detoxification by hydrolysis: the ester bonds at
positions C7 and C9 are hydrolysed by non-specific esterases to release
necine base and necic acid, which are then subjected to further phase II
conjugation and excretion. (ii) Detoxification by *N*-oxidation of the
necine base (only possible for retronecine-type PAs): the nitrogen is
oxidised to form PA *N*-oxides, which can be conjugated by phase II
enzymes, e.g. with glutathione, and then excreted. PA *N*-oxides can be
converted back into the corresponding parent PA ([Wang et al.
2005](#_ENREF_134)). (iii) Metabolic activation or toxification: PAs are
metabolically activated (toxified) by oxidation (for retronecine-type
PAs) or oxidative *N*-demethylation (for otonecine-type PAs; [Lin
1998](#_ENREF_82)). This pathway is mainly catalysed by cytochrome P450
isoforms CYP2B and 3A ([Ruan et al. 2014b](#_ENREF_115)) and results in
the formation of dehydropyrrolizidines (DHP, also known as pyrrolic
esters or reactive pyrroles). DHPs are highly reactive and cause damage
in the cells where they are formed, usually hepatocytes. However, they
can also pass from the hepatocytes into the adjacent sinusoids and
damage the endothelial lining cells ([Gao et al. 2015](#_ENREF_48)),
predominantly by reaction with proteins, lipids and DNA. There is even
evidence that conjugation of DHP to glutathione, which would generally
be considered a detoxification step, could result in reactive
metabolites, which might also lead to DNA adduct formation ([Xia et al.
2015](#_ENREF_138)). Due to their ability to form DNA adducts, DNA
crosslinks and DNA breaks, 1,2-unsaturated PAs are generally considered
genotoxic and carcinogenic ([Chen et al. 2010](#_ENREF_26); [EFSA
2011](#_ENREF_36); [Fu et al. 2004](#_ENREF_45); [Li et al.
2011](#_ENREF_78); [Takanashi et al. 1980](#_ENREF_126); [Yan et al.
2008](#_ENREF_140); [Zhao et al. 2012](#_ENREF_148)). Still, there is no
evidence yet that PAs are carcinogenic in humans ([ANZFA
2001](#_ENREF_4); [EMA 2016](#_ENREF_39)). One general limitation of
studies with PAs is the number of different PAs investigated. Around 30
PAs are currently commercially available; therefore, all studies focus
on these PAs. This is also true for *in vitro* and *in vivo* tests on
mutagenicity and genotoxicity. To gain a wider perspective, in this
study over 600 different PAs were assessed for their mutagenic potential
using four different machine learning techniques.

Materials and Methods
=====================

Training dataset
----------------

For all methods, the same validated training dataset was used. The
training dataset was compiled from the following sources:

-   Kazius/Bursi Dataset (4337 compounds, [Kazius et al.
    2005](#_ENREF_71)):

> <http://cheminformatics.org/datasets/bursi/cas_4337.zip>

-   Hansen Dataset (6513 compounds, [Hansen et al. 2009](#_ENREF_57)):

> <http://doc.ml.tu-berlin.de/toxbenchmark/Mutagenicity_N6512.csv>

-   EFSA Dataset (695 compounds, [EFSA 2011](#_ENREF_36)):

> <https://data.europa.eu/euodp/data/storage/f/2017-0719T142131/GENOTOX%20data%20and%20dictionary.xls>

Mutagenicity classifications from the Kazius and Hansen datasets were
used without further processing. To achieve consistency between these
datasets, EFSA compounds were classified as mutagenic if at least one
positive result was found for the *Salmonella* strains TA98 or TA100.

Dataset merges were based on unique SMILES (*Simplified Molecular Input
Line Entry Specification*) strings of the compound structures.
Duplicated experimental data with the same outcome was merged into a
single value, because it is likely that it originated from the same
experiment. Contradictory results were kept as multiple measurements in
the database. The combined training dataset contains 8281 unique
structures.
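The merge rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the project repository: the function name `merge_by_smiles` and the example records are hypothetical, and the sketch assumes that the SMILES strings have already been canonicalised so that identical structures yield identical strings.

```python
from collections import defaultdict


def merge_by_smiles(records):
    """Merge (smiles, outcome) pairs from several sources.

    Identical outcomes for the same SMILES collapse into one value;
    contradictory outcomes are kept as multiple measurements.
    """
    merged = defaultdict(set)
    for smiles, outcome in records:
        merged[smiles].add(outcome)
    return {smiles: sorted(outcomes) for smiles, outcomes in merged.items()}


# Hypothetical records (True = mutagenic, False = non-mutagenic)
records = [
    ("c1ccccc1N", True),   # structure reported in one source
    ("c1ccccc1N", True),   # same structure, same outcome: merged
    ("CCO", False),
    ("CC=O", True),
    ("CC=O", False),       # contradictory result: both values kept
]
merged = merge_by_smiles(records)
```

In this sketch, a structure with contradictory results ends up with two entries in its outcome list, so downstream steps can decide how to treat it.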

Source code for all data download, extraction and merge operations is
publicly available from the git repository
<https://git.in-silico.ch/pyrrolizidine> under a GPL3 License.

Testing dataset
---------------

The testing dataset consisted of 602 different PAs. The compilation of
the PA dataset is described in detail in [Schöning et al.
(2017)](#_ENREF_119). The PAs were assigned to groups according to
structural features of the necine base and necic acid.

For the necine base, the following groups were assigned:

-   Retronecine-type (1,2-unsaturated necine base)

-   Otonecine-type (1,2-unsaturated necine base)

-   Platynecine-type (1,2-saturated necine base)

For the modification of the necine base, the following groups were assigned:

-   *N*-oxide-type

-   Tertiary-type (PAs which were neither of the *N*-oxide- nor of the
    DHP-type)

-   DHP-type (dehydropyrrolizidine, pyrrolic ester)

For the necic acid, the following groups were assigned:

-   Monoester-type

-   Open-ring diester-type

-   Macrocyclic diester-type

For the Random Forest (RF), Support Vector Machines (SVM), and Deep
Learning (DL) models, molecular descriptors of the PAs were calculated
using the program PaDEL-Descriptors (version 2.21) ([Yap
2011](#_ENREF_142), [2014](#_ENREF_143)). From these, the descriptors
that were actually used for the generation of the DL model were chosen.

LAZAR
-----

LAZAR (*lazy structure activity relationships*) is a modular framework
for read-across model development and validation. It follows a simple
basic workflow. For a given chemical structure, LAZAR:

-   searches in a database for similar structures (neighbours) with
    experimental data,

-   builds a local QSAR model with these neighbours and

-   uses this model to predict the unknown activity of the query
    compound.

This procedure resembles an automated version of read-across predictions
in toxicology; in machine learning terms, it would be classified as a
k-nearest-neighbour algorithm.

Apart from this basic workflow, LAZAR is completely modular and allows
the researcher to use any algorithm for similarity searches and local
QSAR (*Quantitative structure--activity relationship*) modelling.
Algorithms used within this study are described in the following
sections.

### Neighbour identification

Similarity calculations were based on MolPrint2D fingerprints ([Bender
et al. 2004](#_ENREF_8)) from the OpenBabel cheminformatics library
([O\'Boyle et al. 2011](#_ENREF_104)). The MolPrint2D fingerprint uses
atom environments as molecular representation, which basically resemble
the chemical concept of functional groups. For each atom in a molecule,
it represents the chemical environment using the atom types of connected
atoms.

MolPrint2D fingerprints are generated dynamically from chemical
structures and do not rely on predefined lists of fragments (such as
OpenBabel FP3, FP4 or MACCS fingerprints or lists of
toxicophores/toxicophobes). This has the advantage that they may capture
substructures of toxicological relevance that are not included in other
fingerprints.

From MolPrint2D fingerprints a feature vector with all atom environments
of a compound can be constructed that can be used to calculate chemical
similarities.

The chemical similarity between two compounds *a* and *b* is expressed
as the ratio of the number of atom environments common to both
structures, $|A \cap B|$, to the total number of atom environments,
$|A \cup B|$ (Jaccard/Tanimoto index).

$$sim = \frac{\left| A \cap B \right|}{\left| A \cup B \right|}$$
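As a minimal sketch of this similarity measure, the Jaccard/Tanimoto index can be computed on plain Python sets. The feature strings below are hypothetical placeholders and not the actual MolPrint2D encoding produced by OpenBabel; the function name `tanimoto` is likewise an assumption for illustration.

```python
def tanimoto(env_a, env_b):
    """Jaccard/Tanimoto index of two collections of atom environments."""
    a, b = set(env_a), set(env_b)
    union = a | b
    if not union:
        # Two empty feature sets: similarity defined as 0 in this sketch
        return 0.0
    return len(a & b) / len(union)


# Hypothetical atom-environment features of two compounds
a = {"C;2-O", "C;1-C", "O;1-C", "N;1-C"}
b = {"C;2-O", "C;1-C", "O;1-C", "Cl;1-C"}

sim = tanimoto(a, b)  # 3 shared environments / 5 environments in total
```

The corresponding Jaccard distance is simply `1 - tanimoto(a, b)`.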

Threshold selection is a trade-off between prediction accuracy (high
threshold) and the number of predictable compounds (low threshold). As
it is in many practical cases desirable to make predictions even in the
absence of closely related neighbours, we follow a tiered approach:

-   First a similarity threshold of 0.5 is used to collect neighbours,
    to create a local QSAR model and to make a prediction for the query
    compound.

-   If any of these steps fails, the procedure is repeated with a
    similarity threshold of 0.2 and the prediction is flagged with a
    warning that it might be out of the applicability domain of the
    training data.

-   Similarity thresholds of 0.5 and 0.2 are the default values chosen
    by the software developers and remained unchanged during the
    course of these experiments.

Compounds with the same structure as the query structure are
automatically eliminated from neighbours to obtain unbiased predictions
in the presence of duplicates.
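The tiered neighbour search can be sketched as follows. This is a simplified illustration and not the LAZAR implementation: it only treats "no neighbours found" as a failure (the real workflow also falls back when model building or prediction fails), it assumes that a similarity equal to the threshold counts as a neighbour, and it uses identical fingerprint sets as a stand-in for identical structures. All names and the toy fingerprints are hypothetical.

```python
def jaccard(a, b):
    """Jaccard/Tanimoto index of two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_neighbours(query, training, thresholds=(0.5, 0.2)):
    """Collect neighbours with a tiered similarity threshold.

    training: list of (fingerprint_set, label) pairs.
    Returns (neighbours, threshold_used, warning), where neighbours is a
    list of (similarity, label) pairs and warning is True whenever the
    first (strict) threshold did not yield any neighbour.
    """
    for tier, threshold in enumerate(thresholds):
        neighbours = []
        for fingerprint, label in training:
            if fingerprint == query:
                continue  # drop duplicates of the query structure
            sim = jaccard(query, fingerprint)
            if sim >= threshold:
                neighbours.append((sim, label))
        if neighbours:
            return neighbours, threshold, tier > 0
    return [], None, True


# Toy training data with hypothetical features and labels
training = [
    ({"a", "b", "c", "d"}, 1),
    ({"a", "x", "y", "z"}, 0),
    ({"a", "b", "c", "e"}, 1),
]

close = find_neighbours({"a", "b", "c", "d"}, training)
distant = find_neighbours({"a", "b", "q", "r"}, training)
none_found = find_neighbours({"m", "n"}, training)
```

Here `close` is resolved at the strict threshold without a warning, `distant` only at the relaxed threshold with a warning, and `none_found` yields no neighbours at all.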

### Local QSAR models and predictions

Only similar compounds (neighbours) above the threshold are used for
local QSAR models. In this investigation, we use a weighted majority
vote from the neighbours' experimental data for mutagenicity
classifications. Probabilities for both classes
(mutagenic/non-mutagenic) are calculated according to the following
formula, and the class with the higher probability is used as the
prediction outcome.

$$p_{c} = \frac{\sum \text{sim}_{n,c}}{\sum \text{sim}_{n}}$$

$p_{c}$ Probability of class c (e.g. mutagenic or non-mutagenic)\
$\sum \text{sim}_{n,c}$ Sum of similarities of neighbours with
class c\
$\sum \text{sim}_{n}$ Sum of similarities of all neighbours
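A small worked example may help to make the weighted majority vote concrete. The sketch below is an illustration under stated assumptions: the function name `class_probabilities` and the three neighbours with their similarities are invented for this example and do not come from the study data.

```python
def class_probabilities(neighbours):
    """Similarity-weighted class probabilities.

    neighbours: list of (similarity, label) pairs.
    Each class receives the sum of the similarities of its neighbours,
    divided by the sum of the similarities of all neighbours.
    """
    total = sum(sim for sim, _ in neighbours)
    if total <= 0:
        raise ValueError("at least one neighbour with similarity > 0 is required")
    probabilities = {}
    for sim, label in neighbours:
        probabilities[label] = probabilities.get(label, 0.0) + sim / total
    return probabilities


# Hypothetical neighbours of a query compound
neighbours = [
    (0.8, "mutagenic"),
    (0.6, "non-mutagenic"),
    (0.6, "mutagenic"),
]
probs = class_probabilities(neighbours)
prediction = max(probs, key=probs.get)
```

With these values, the mutagenic class receives (0.8 + 0.6) / 2.0 = 0.7 and the non-mutagenic class 0.6 / 2.0 = 0.3, so the query would be predicted as mutagenic.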

### Applicability domain

The applicability domain (AD) of LAZAR models is determined by the
structural diversity of the training data. If no similar compounds are
found in the training data no predictions will be generated. Warnings
are issued if the similarity threshold had to be lowered from 0.5 to 0.2
in order to enable predictions. Predictions without warnings can be
considered as close to the applicability domain and predictions with
warnings as more distant from the applicability domain. Quantitative
applicability domain information can be obtained from the similarities
of individual neighbours.

### Availability

-   LAZAR experiments for this manuscript:
    <https://git.in-silico.ch/pyrrolizidine> (source code, GPL3)

-   LAZAR framework: <https://git.in-silico.ch/lazar> (source code,
    GPL3)

-   LAZAR GUI: <https://git.in-silico.ch/lazar-gui> (source code, GPL3)

-   Public web interface: <https://lazar.in-silico.ch>

Random Forest, Support Vector Machines, and Deep Learning in R-project
----------------------------------------------------------------------

For comparison with LAZAR, three other models (Random Forest (RF),
Support Vector Machines (SVM), and Deep Learning (DL)) were evaluated.

For the generation of these models, molecular 1D and 2D descriptors of
the training dataset were calculated using PaDEL-Descriptors (version
2.21) ([Yap 2011](#_ENREF_142), [2014](#_ENREF_143)).

As the training dataset contained over 8280 instances, it was decided to
delete instances with missing values during data pre-processing.
Furthermore, substances with equivocal outcome were removed. The final
training dataset contained 8080 instances with known mutagenic
potential. The RF, SVM, and DL models were generated using the R
software (R-project for Statistical Computing,
<https://www.r-project.org/>; version 3.3.1); the specific R packages
used are identified for each step in the description below. During
feature selection, descriptors with near zero variance were removed
using the '*nearZeroVar*'-function (package 'caret'). A descriptor was
classified as having near zero variance if the percentage of the most
common value was more than 90% or if the frequency ratio of the most
common value to the second most common value was greater than 95:5 (e.g.
95 instances of the most common value and only 5 or fewer instances of
the second most common value). After that, highly correlated descriptors
were removed using the '*findCorrelation*'-function (package 'caret')
with a cut-off of 0.9. This resulted in a training dataset with 516
descriptors. These descriptors were scaled to the range between 0 and 1
using the '*preProcess*'-function (package 'caret'). The scaling routine
was saved in order to apply the same scaling to the testing dataset. As
these three steps did not consider the outcome, it was decided that they
did not need to be included in the cross-validation of the model. To
further reduce the number of features, a LASSO (*least absolute
shrinkage and selection operator*) regression was performed using the
'*glmnet*'-function (package '*glmnet*'). The reduced dataset was used
for the generation of the pre-trained models.
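The idea of saving the scaling routine and re-using it on the testing dataset can be illustrated with a plain-Python sketch of range (min-max) scaling. This is not the 'caret' implementation used in the study; the function names and the toy descriptor values are hypothetical.

```python
def fit_min_max(rows):
    """Learn per-descriptor (min, max) pairs from the training rows."""
    columns = list(zip(*rows))
    return [(min(column), max(column)) for column in columns]


def apply_min_max(rows, params):
    """Scale rows with previously learned (min, max) pairs."""
    scaled = []
    for row in rows:
        scaled.append([
            (value - low) / (high - low) if high > low else 0.0
            for value, (low, high) in zip(row, params)
        ])
    return scaled


# Hypothetical descriptor values: two descriptors per compound
train = [[1.0, 10.0], [3.0, 30.0], [5.0, 20.0]]
test = [[2.0, 40.0]]

params = fit_min_max(train)                 # learned on training data only
train_scaled = apply_min_max(train, params)
test_scaled = apply_min_max(test, params)   # same scaling re-applied
```

Because the parameters come from the training data alone, scaled test values can fall outside the range 0 to 1 (here 1.5 for the second descriptor), which is itself a hint that a test compound lies outside the descriptor range of the training data.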

For the RF model, the '*randomForest*'-function (package
'*randomForest*') was used. A forest of 1000 trees with a maximum of 200
terminal nodes per tree was grown for the prediction.

The '*svm*'-function (package 'e1071') with a *radial basis function
kernel* was used for the SVM model.

The DL model was generated using the '*h2o.deeplearning*'-function
(package '*h2o*'). The DL model contained four hidden layers with 70,
50, 50, and 10 neurons, respectively. Other hyperparameters were set as
follows: l1=1.0E-7, l2=1.0E-11, epsilon = 1.0E-10, rho = 0.8, and
quantile\_alpha = 0.5. For all other hyperparameters, the default values
were used. Weights and biases were determined in a first step with an
unsupervised DL model. These values were then used for the actual,
supervised DL model.

To validate these models, an internal cross-validation approach was
chosen. The training dataset was randomly split into training data,
which contained 95% of the data, and validation data, which contained 5%
of the data. A feature selection with LASSO on the training data was
performed, reducing the number of descriptors to approximately 100. This
step was repeated five times. Based on each of the five different
training data, the predictive models were trained and the performance
tested with the validation data. This step was repeated 10 times.
Furthermore, a y-randomisation using the RF model was performed. During
y-randomisation, the outcome (y-variable) is randomly permuted. The
theory is that after randomisation of the outcome, the model should not
be able to correlate the outcome with the properties (descriptor values)
of the substances. The performance of the model should therefore
correspond to a chance-level prediction with an accuracy of about 50%.
If this is true, it can be concluded that the correlation between actual
outcome and properties of the substances is real and not due to chance
([Rücker et al. 2007](#_ENREF_117)).
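The permutation step of y-randomisation can be sketched in plain Python. The snippet below is only an illustration of the principle and does not train any model: the function name `y_randomise`, the seed and the balanced toy outcome vector are assumptions made for this example.

```python
import random


def y_randomise(labels, seed=0):
    """Return a randomly permuted copy of the outcome labels."""
    rng = random.Random(seed)
    permuted = list(labels)
    rng.shuffle(permuted)
    return permuted


# Hypothetical balanced outcome vector (1 = mutagenic, 0 = non-mutagenic)
y = [1, 0] * 5000
y_perm = y_randomise(y, seed=42)

# Fraction of compounds that keep their original label after permutation
agreement = sum(a == b for a, b in zip(y, y_perm)) / len(y)
```

The permutation keeps the class frequencies but breaks the pairing between descriptors and outcome. For balanced classes, only about half of the compounds keep their original label, which is why a model trained on permuted outcomes is expected to perform at about 50% accuracy.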

![](./media/media/image1.png){width="6.26875in"
height="5.486111111111111in"}

Figure 1: Flowchart of the generation and validation of the models
generated in R-project

Deep Learning in TensorFlow
---------------------------

Alternatively, a DL model was established with the Python-based
TensorFlow framework (<https://www.tensorflow.org/>), using the
high-level API Keras (<https://www.tensorflow.org/guide/keras>) to build
the models.

Data pre-processing was done by rank transformation using the
'*QuantileTransformer*' procedure. A sequential model with four layers
was used: an input layer and two hidden layers (with 12, 8 and 8 nodes,
respectively) and one output layer. For the output layer, a sigmoidal
activation function was used, and for all other layers the ReLU
('*Rectified Linear Unit*') activation function. Additionally, an
L^2^-penalty of 0.001 was used for the input layer. For training of the
model, the ADAM algorithm was used to minimise the cross-entropy loss
using the default parameters of Keras. Training was performed for 100
epochs with a batch size of 64. The model was implemented with Python
3.6 and Keras. For training of the model, a 6-fold cross-validation was
used. Model performance was assessed using the ROC-AUC and the confusion
matrix.

Results
=======

LAZAR
-----

For 46 PAs, no prediction could be made: 26 PAs had no neighbours and 20
PAs had only one neighbour. For an additional 396 PAs, the similarity
threshold had to be reduced from 0.5 to 0.2 to obtain enough neighbours
for a prediction. This means that these substances might not be within
the applicability domain (AD). Therefore, only 160 of 602 PAs were well
within the stricter AD with the similarity threshold of 0.5, and 556 PAs
were within the AD with the similarity threshold of 0.2.

![](./media/media/image2.png){width="5.905511811023622in"
height="3.868241469816273in"}

Figure 2: Genotoxic potential of the different PA groups as predicted by
LAZAR, using the **similarity threshold of 0.5**.

*Genotoxic*: percentage of compounds per group that were predicted to be
genotoxic.\
*Not genotoxic*: percentage of compounds per group that were predicted
to be not genotoxic.\
*Outside AD*: percentage of compounds per group that were outside the
applicability domain (AD).

![](./media/media/image3.png){width="5.905511811023622in"
height="3.868241469816273in"}

Figure 3: Genotoxic potential of the different PA groups as predicted by
LAZAR, using the **similarity threshold of 0.2**.

*Genotoxic*: percentage of compounds per group that were predicted to be
genotoxic.\
*Not genotoxic*: percentage of compounds per group that were predicted
to be not genotoxic.\
*Outside AD*: percentage of compounds per group that were outside the
applicability domain (AD).

Interestingly, with both similarity thresholds (i.e. 0.2 and 0.5), the
majority of PAs in all groups except otonecine were predicted to be not
genotoxic.

The following rank order for genotoxicity probability can be deduced
from the results of both similarity thresholds:

-   Necine base: platynecine ≤ retronecine \<\< otonecine

-   Necic acid: monoester \< diester \< macrocyclic diester

-   Modification of necine base: *N*-oxide \< DHP \< tertiary PA

Random Forest, Support Vector Machines, and Deep Learning
---------------------------------------------------------

### Applicability domain

The AD of the training dataset and the PA dataset was evaluated using
the Jaccard distance. A Jaccard distance of '0' indicates that the
substances are similar, whereas a value of '1' shows that the substances
are different. The Jaccard distance was below 0.2 for all PAs relative
to the training dataset. Therefore, the PA dataset is within the AD of
the training dataset, and the models can be used to predict the
genotoxic potential of the PA dataset.

### y-randomisation

After y-randomisation of the outcome, the accuracy and the correct
classification rate (CCR) were around 50%, i.e. at chance level. This
shows that the relationship between outcome and predictors in the
original model is real and not due to chance.

### Random Forest

The validation showed that the RF model has an accuracy of 64%, a
sensitivity of 66% and a specificity of 63%. The confusion matrix of the
model, calculated for 8080 instances, is provided in Table 1.

Table 1: Confusion matrix of the RF model

  Measured genotoxicity   Predicted: ***PP***   Predicted: ***PN***   ***Total***
  ----------------------- --------------------- --------------------- -------------
  ***TP***                2274                  1163                  3437
  ***TN***                1736                  2907                  4643
  ***Total***             4010                  4070                  8080

PP: Predicted positive; PN: Predicted negative, TP: True positive, TN:
True negative
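The performance figures reported for the RF model can be reproduced from the confusion matrix as laid out in Table 1. The following sketch is only a worked arithmetic check: the function name `performance` is an assumption, and the table rows are read as the measured classes and the columns as the predicted classes.

```python
def performance(tp, fn, fp, tn):
    """Accuracy, sensitivity and specificity from a 2x2 confusion matrix."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Values of Table 1 (RF model), rows read as measured classes
rf = performance(tp=2274, fn=1163, fp=1736, tn=2907)

# Rounded to whole percentages, as reported in the text
rf_percent = {name: round(100 * value) for name, value in rf.items()}
```

With these numbers, the accuracy is 5181/8080, the sensitivity 2274/3437 and the specificity 2907/4643, which round to 64%, 66% and 63%, respectively.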

In general, the majority of PAs were considered to be not genotoxic by
the RF model (Figure 4).

![](./media/media/image4.png){width="6.063194444444444in"
height="3.8756944444444446in"}

Figure 4: Genotoxic potential of the different PA groups as predicted by
**RF model**

*Genotoxic*: percentage of compounds per group that were predicted to be
genotoxic.\
*Not genotoxic*: percentage of compounds per group that were predicted
to be not genotoxic.

From the results, the following rank orders of genotoxic potential could
be deduced:

-   Necine base: platynecine \< retronecine \< otonecine

-   Necic acid: monoester (= 0%) \< diester \< macrocyclic diester

-   Modification of necine base: *N*-oxide = dehydropyrrolizidine (0%)
    \< tertiary PA

### Support Vector Machines

The validation showed that the SVM model has an accuracy of 62%, a
sensitivity of 65% and a specificity of 60%. The confusion matrix of the
SVM model, calculated for 8080 instances, is provided in Table 2.

Table 2: Confusion matrix of the SVM model

  Measured genotoxicity   Predicted: ***PP***   Predicted: ***PN***   ***Total***
  ----------------------- --------------------- --------------------- -------------
  ***TP***                2057                  1107                  3164
  ***TN***                1953                  2963                  4916
  ***Total***             4010                  4070                  8080

PP: Predicted positive; PN: Predicted negative, TP: True positive, TN:
True negative

The SVM model also considered the majority of PAs to be not genotoxic
(Figure 5).

![](./media/media/image5.png){width="6.063194444444444in"
height="3.9694444444444446in"}

Figure 5: Genotoxic potential of the different PA groups as predicted by
**SVM model**

*Genotoxic*: percentage of compounds per group that were predicted to be
genotoxic.\
*Not genotoxic*: percentage of compounds per group that were predicted
to be not genotoxic.

From the results, the following rank orders of genotoxic potential could
be deduced:

-   Necine base: otonecine \< platynecine = retronecine

-   Necic acid: macrocyclic diester \< monoester = diester

-   Modification of necine base: dehydropyrrolizidine \< tertiary
    PA \< *N*-oxide 

### Deep Learning (R-project)

The validation showed that the DL model generated in R has an accuracy
of 59%, a sensitivity of 89% and a specificity of 30%. The confusion
matrix of the model, normalised to 8080 instances, is provided in Table
3.

Table 3: Confusion matrix of the DL model (R-project)

  Measured genotoxicity   Predicted: ***PP***   Predicted: ***PN***   ***Total***
  ----------------------- --------------------- --------------------- -------------
  ***TP***                3575                  435                   4010
  ***TN***                2853                  1217                  4070
  ***Total***             6428                  1652                  8080

PP: Predicted positive; PN: Predicted negative, TP: True positive, TN:
True negative

In contrast, the majority of PAs were considered to be genotoxic by the
DL model in R (Figure 6).

![](./media/media/image6.png){width="6.063194444444444in"
height="3.982638888888889in"}

Figure 6: Genotoxic potential of the different PA groups as predicted by
**DL model (R-project)**

*Genotoxic*: percentage of compounds per group that were predicted to be
genotoxic.\
*Not genotoxic*: percentage of compounds per group that were predicted
to be not genotoxic.

From the results, the following rank orders of genotoxic potential could
be proposed:

-   Necine base: platynecine \< retronecine \< otonecine

-   Necic acid: monoester \< diester \< macrocyclic diester

-   Modification of necine base: tertiary PA = dehydropyrrolizidine \<
    *N*-oxide.

DL model (TensorFlow)

The validation showed that the DL model generated in TensorFlow has an
accuracy of 68%, a sensitivity of 70% and a specificity of 46%. The
confusion matrix of the model, normalised to 8080 instances, is provided
in Table 4.

Table 4: Confusion matrix of the DL model (TensorFlow)

                          Predicted genotoxicity                         
  ----------------------- ------------------------ ---------- ---------- -------------
  Measured genotoxicity                            ***PP***   ***PN***   ***Total***
                          ***TP***                 2851       1227       4078
                          ***TN***                 1825       2177       4002
                          ***Total***              4676       3404       8080

PP: Predicted positive; PN: Predicted negative, TP: True positive, TN:
True negative

The ROC curves from the 6-fold validation are shown in Figure 7.

![](./media/media/image7.png){width="3.825in"
height="2.7327045056867894in"}

Figure 7: Six-fold cross-validation of the TensorFlow DL model shows an
average area under the ROC curve (ROC-AUC; measure of accuracy) of 68%.

In contrast to the DL model generated in R, the DL model generated in
TensorFlow predicted the majority of PAs as not genotoxic.

![](./media/media/image8.png){width="6.26875in"
height="3.6993055555555556in"}

Figure 8: Genotoxic potential of the different PA groups as predicted by
**DL model (TensorFlow)**

*Genotoxic*: percentage number of compounds per group, which was
predicted to be genotoxic.\
*Not genotoxic*: percentage number of compounds per group, which was
predicted to be not genotoxic

The following rank orders of genotoxic potential could be proposed based
on the results:

-   Necine base: platynecine \< otonecine \< retronecine 

-   Necic acid: monoester \< diester \< macrocyclic diester

-   Modification of necine base: tertiary PA \< *N*-oxide \<\<
    dehydropyrrolizidine.

In summary, the validation results of the four methods are presented in
the following table.

Table 5: Results of the cross-validation of the four models and after
y-randomisation

  ----------------------------------------------------------------------
                          Accuracy   CCR     Sensitivity   Specificity
  ----------------------- ---------- ------- ------------- -------------
  RF model                64.1%      64.4%   66.2%         62.6%

  SVM model               62.1%      62.6%   65.0%         60.3%

  DL model\               59.3%      59.5%   89.2%         29.9%
  (R-project)                                              

  DL model (TensorFlow)   68%        62.2%   69.9%         45.6%

  y-randomisation         50.5%      50.4%   50.3%         50.6%
  ----------------------------------------------------------------------

CCR: correct classification rate
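The CCR is read here as the balanced accuracy, that is, the mean of sensitivity and specificity. This reading is an assumption, since the definition is not restated in this section; it reproduces, for example, the RF row of Table 5:

$$\mathrm{CCR} = \frac{\text{sensitivity} + \text{specificity}}{2}, \qquad \mathrm{CCR}_{\mathrm{RF}} = \frac{66.2\% + 62.6\%}{2} = 64.4\%$$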

Discussion
==========

General model performance

Based on the cross-validation results for all models (LAZAR, RF, SVM,
DL (R-project) and DL (TensorFlow)), it can be stated that the
prediction results are not optimal, for several reasons. The accuracy
measured during cross-validation of the four models (RF, SVM, DL
(R-project) and DL (TensorFlow)) was partly low, with values between
59.3% and 68%; the R-generated DL model and the TensorFlow-generated DL
model showed the worst and the best performance, respectively. The
validation of the R-generated DL model revealed a high sensitivity
(89.2%) but an unacceptably low specificity of 29.9%, indicating a high
number of false positive estimates. The TensorFlow-generated DL model,
however, showed an acceptable but not optimal accuracy of 68%, a
sensitivity of 69.9% and a specificity of 45.6%. The low specificity
indicates that both DL models tend to predict too many instances as
positive (genotoxic) and therefore have a high false positive rate. At
least for the TensorFlow-generated DL model, this allows statements at
the group level, but the confidence in predictions for single PAs
appears to be insufficient.

Several factors have likely contributed to the low to moderate
performance of the methods used, as shown during cross-validation:

1.  The outcome in the training dataset was based on the results of AMES
    tests for genotoxicity ([ICH 2011](#_ENREF_63)), an *in vitro* test
    in different strains of the bacterium *Salmonella typhimurium*. In
    this test, mutagenicity is evaluated with and without prior
    metabolic activation of the test substance. Metabolic activation
    can result in the formation of genotoxic metabolites from
    non-genotoxic parent compounds. However, no distinction was made in
    the training dataset between substances that needed metabolic
    activation before being mutagenic and those that were mutagenic
    without metabolic activation. LAZAR is able to handle this
    'inaccuracy' in the training dataset well because of the way the
    algorithm works: LAZAR predicts the genotoxic potential based on
    neighbours with comparable structural features, considering both
    mutagenic and non-mutagenic neighbours. Based on the structural
    similarity, the probabilities for mutagenicity and for
    non-mutagenicity are calculated independently of each other (meaning
    that the probabilities do not necessarily add up to 100%). The class
    with the higher probability is then the overall outcome for the
    substance.

    In contrast, the other models need to be trained first to recognise
    the structural features that are responsible for genotoxicity.
    Therefore, the mixture of substances that are mutagenic with and
    without metabolic activation in the training dataset may have
    adversely affected the ability to separate the dataset into two
    distinct classes, which may explain the relatively low performance
    of these models.

2.  Machine learning algorithms try to find an optimised solution in a
    high-dimensional space (one dimension per predictor). Sometimes
    these methods do not find the global optimum but only local
    (suboptimal) solutions. One strategy for finding the global optimum
    is the systematic variation (grid search) of the hyperparameters of
    the methods, which may be very time-consuming, particularly for
    large datasets.
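The grid search mentioned in item 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in this study: `grid_search`, `toy_score` and the hyperparameter names `learning_rate` and `hidden_units` are hypothetical, and `toy_score` merely stands in for a cross-validated performance measure.

```python
from itertools import product


def grid_search(score_fn, grid):
    """Evaluate score_fn on every combination in grid and return the best one."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(learning_rate, hidden_units):
    # Hypothetical stand-in for a cross-validated accuracy; peaks at (0.01, 64).
    return 0.6 - abs(learning_rate - 0.01) - abs(hidden_units - 64) / 1000


grid = {"learning_rate": [0.001, 0.01, 0.1], "hidden_units": [16, 64, 256]}
best_params, best_score = grid_search(toy_score, grid)
print(best_params)  # {'learning_rate': 0.01, 'hidden_units': 64}
```

The number of evaluations is the product of the grid sizes (here 3 × 3 = 9), and each evaluation normally means retraining the model, which is why the approach becomes time-consuming for large datasets.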

Mutagenicity of PAs

Due to the low to moderate predictivity of all models, quantitative
statements on the genotoxicity of single PAs cannot be made with
sufficient confidence.

The predictions of the SVM model agreed neither with the other models
nor with the literature and are therefore not considered further in the
discussion.

Necic acid

The rank order of the necic acid is comparable in the four models
considered (LAZAR, RF and DL (R-project and TensorFlow)). PAs of the
monoester type had the lowest genotoxic potential, followed by PAs of
the open-ring diester type. PAs with macrocyclic diesters had the
highest genotoxic potential. This result fits well with the current
state of knowledge: in general, PAs with a macrocyclic diester as necic
acid are considered more toxic than those with an open-ring diester or
monoester ([EFSA 2011](#_ENREF_36); [Fu et al. 2004](#_ENREF_45); [Ruan
et al. 2014b](#_ENREF_115)).

Necine base

The rank order of the necine base is comparable in the LAZAR, RF and DL
(R-project) models, with platynecine being less genotoxic than or as
genotoxic as retronecine, and otonecine being the most genotoxic. In the
TensorFlow-generated DL model, platynecine also has the lowest genotoxic
probability, but is followed by otonecine and then by retronecine. These
results partly correspond to earlier published studies. Saturated PAs of
the platynecine type are generally accepted to be less toxic or
non-toxic and have been shown in *in vitro* experiments to form no DNA
adducts ([Xia et al. 2013](#_ENREF_139)). It is therefore striking that
1,2-unsaturated PAs of the retronecine type have an almost comparable
genotoxic potential in the LAZAR and DL (R-project) models. In the
literature, otonecine-type PAs were shown to be more toxic than those of
the retronecine type ([Li et al. 2013](#_ENREF_80)).

Modifications of necine base

The group-specific results of the TensorFlow-generated DL model appear
to reflect the expected relationship between the groups: a low
genotoxic potential of *N*-oxides and the highest potential of
dehydropyrrolizidines ([Chen et al. 2010](#_ENREF_26)).

In the LAZAR model, the genotoxic potential of dehydropyrrolizidines
(DHP) (using the extended AD) is comparable to that of tertiary PAs.
DHP is regarded as the toxic principle in the metabolism of PAs and is
known to produce protein and DNA adducts ([Chen et al.
2010](#_ENREF_26)). The LAZAR model did not meet this expectation: it
predicted the majority of DHPs as not genotoxic. However, the following
issues need to be considered. First, all DHPs were outside the stricter
AD of 0.5, which indicates that there might be a general problem with
the AD. In addition, DHP has two double bonds in its necine base, making
it highly reactive. DHP and comparable molecules have a very short
lifespan and usually cannot be used in *in vitro* experiments. This
might explain the absence of suitable neighbours in LAZAR.

Furthermore, the probabilities for this substance group need to be
considered, and not only the consolidated prediction. In the LAZAR
model, all DHPs had probabilities for both outcomes (genotoxic and not
genotoxic) mainly below 30%. Additionally, the probabilities for the two
outcomes were close together, often within 10% of each other. The fact
that the probabilities for both outcomes were low and close together
indicates a lower confidence in the prediction of the model for DHPs.
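The reasoning above can be made concrete with a small sketch. It only illustrates how a consolidated prediction could be derived from two independently calculated probabilities and flagged when both are low and close together; it is not LAZAR's implementation. The function name `consolidate`, the use of 30% and 10% as cut-offs, the handling of ties and the example probabilities are all assumptions made for illustration.

```python
def consolidate(p_genotoxic, p_not_genotoxic, max_prob=0.30, max_gap=0.10):
    """Pick the class with the higher probability and flag weak predictions."""
    if p_genotoxic > p_not_genotoxic:
        label = "genotoxic"
    elif p_genotoxic < p_not_genotoxic:
        label = "not genotoxic"
    else:
        label = "undetermined"  # assumption: the text does not cover ties
    # Weak prediction: both probabilities low AND close together (assumed cut-offs)
    low_confidence = (
        max(p_genotoxic, p_not_genotoxic) < max_prob
        and abs(p_genotoxic - p_not_genotoxic) < max_gap
    )
    return label, low_confidence


weak = consolidate(0.22, 0.27)    # hypothetical DHP-like case
clear = consolidate(0.70, 0.15)   # hypothetical well-supported case
print(weak, clear)
```

Because the two probabilities are calculated independently, they need not sum to 1, so the gap between them carries information that a single consolidated label hides.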

In the DL (R-project) and RF models, *N*-oxides have a far higher
genotoxic potential than tertiary PAs or dehydropyrrolizidines. As PA
*N*-oxides are easily conjugated for excretion, they are generally
considered detoxification products, which are rapidly eliminated renally
*in vivo* ([Chen et al. 2010](#_ENREF_26)). On the other hand,
*N*-oxides can also be converted back to the corresponding tertiary PA
([Wang et al. 2005](#_ENREF_134)). It may therefore be questioned
whether *N*-oxides themselves are generally less genotoxic than the
corresponding tertiary PAs. However, among the modifications of the
necine base, dehydropyrrolizidine, the toxic principle of PAs, should
have had the highest genotoxic potential. Taken together, the
predictions for the modifications of the necine base from the LAZAR, RF
and R-generated DL models cannot, in contrast to those of the TensorFlow
DL model, be considered reliable.

Overall, when comparing the prediction results for the PAs with current
published knowledge, it can be concluded that the performance of most
models was low to moderate. This might be attributed to the following
issues:

1.  In the LAZAR model, only 26.6% of the PAs were within the stricter
    AD. With the extended AD, 92.3% of the PAs could be included in the
    prediction. Even though the Jaccard distance between the training
    dataset and the PA dataset for the RF, SVM, and DL (R-project and
    TensorFlow) models was small, suggesting a high similarity, the
    LAZAR model indicated that PAs have only few local neighbours, which
    might adversely affect the prediction of the mutagenic potential of
    PAs.

2.  All above-mentioned models were used to predict the mutagenicity of
    PAs. PAs are generally considered to be genotoxic, and the mode of
    action is known. Therefore, the fact that some models predict the
    majority of PAs as not genotoxic seems contradictory. To understand
    this result, the basis of the models, the training dataset, has to
    be considered. The mutagenicity data in the training dataset are
    based on mutagenicity in bacteria. Some studies show mutagenicity of
    PAs in the AMES test ([Chen et al. 2010](#_ENREF_26)). [Rubiolo et
    al. (1992)](#_ENREF_116) also examined several PAs and several
    extracts of PA-containing plants in the AMES test. They found that
    the AMES test was indeed able to detect mutagenicity of PAs but, in
    general, appeared to have a low sensitivity. The pre-incubation
    phase for metabolic activation of PAs by microsomal enzymes was the
    sensitivity-limiting step. This limitation may well be reflected in
    the QSAR models.
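The Jaccard distance referred to in item 1 can be illustrated with a short sketch. It assumes binary structural fingerprints represented as sets of the indices of set bits; the function name `jaccard_distance` and the example bit indices are hypothetical and do not come from the study's data.

```python
def jaccard_distance(fp_a, fp_b):
    """Jaccard distance between two fingerprints given as sets of set-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / len(union)


# Hypothetical fingerprints sharing 3 of 5 distinct bits
d = jaccard_distance({1, 4, 7, 9}, {1, 4, 8, 9})
print(round(d, 2))  # 0.4
```

A distance of 0 means identical fingerprints and 1 means no shared bits, so a small average distance between two datasets suggests structural similarity without guaranteeing that each compound has close individual neighbours.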

Conclusions
===========

In this study, an attempt was made to predict the genotoxic potential of
PAs using five different machine learning techniques (LAZAR, RF, SVM, DL
(R-project and TensorFlow)). The results of all models agreed only
partly with the findings in the literature, with the best results
obtained with the TensorFlow DL model. Modelling therefore allows
statements on the relative risks of genotoxicity of the different PA
groups. Individual predictions for single PAs, however, do not appear
reliable on the basis of the current training dataset.

This study emphasises the importance of critical assessment of
predictions by QSAR models. This includes not only extensive literature
research to assess the plausibility of the predictions, but also a good
knowledge of the metabolism of the test substances and an understanding
of possible mechanisms of toxicity.

In further studies, additional machine learning techniques or a modified
(extended) training dataset should be used for an additional attempt to
predict the genotoxic potential of PAs.

References
==========

[]{#_ENREF_4 .anchor}

[]{#_ENREF_8 .anchor}

[]{#_ENREF_17 .anchor}

[]{#_ENREF_18 .anchor}

[]{#_ENREF_20 .anchor}

[]{#_ENREF_26 .anchor}

[]{#_ENREF_33 .anchor}

[]{#_ENREF_36 .anchor}

[]{#_ENREF_38 .anchor}

[]{#_ENREF_39 .anchor}

[]{#_ENREF_45 .anchor}

[]{#_ENREF_48 .anchor}

[]{#_ENREF_57 .anchor}

[]{#_ENREF_59 .anchor}

[]{#_ENREF_63 .anchor}

[]{#_ENREF_65 .anchor}

[]{#_ENREF_71 .anchor}

[]{#_ENREF_76 .anchor}

[]{#_ENREF_78 .anchor}

[]{#_ENREF_80 .anchor}

[]{#_ENREF_82 .anchor}

[]{#_ENREF_99 .anchor}

[]{#_ENREF_104 .anchor}

<https://openbabel.org/docs/dev/Fingerprints/intro.html>

[]{#_ENREF_115 .anchor}

[]{#_ENREF_116 .anchor}

[]{#_ENREF_117 .anchor}

[]{#_ENREF_119 .anchor}

[]{#_ENREF_126 .anchor}

[]{#_ENREF_134 .anchor}

[]{#_ENREF_138 .anchor}

[]{#_ENREF_139 .anchor}

[]{#_ENREF_140 .anchor}

[]{#_ENREF_142 .anchor}

[]{#_ENREF_143
.anchor}<http://www.yapcwsoft.com/dd/padeldescriptor/Descriptors.xls>

[]{#_ENREF_148 .anchor}

Aguer C, Gambarotta D, Mailloux RJ, Moffat C, Dent R, et al. 2011.
Galactose enhances oxidative metabolism and reveals mitochondrial
dysfunction in human primary muscle cells. PLoS One 6:e28536

Ahmed SN, Siddiqi ZA. 2006. Antiepileptic drugs and liver disease.
Seizure 15:156-64

Aleo MD, Luo Y, Swiss R, Bonin PD, Potter DM, Will Y. 2014. Human
drug-induced liver injury severity is highly associated with dual
inhibition of liver mitochondrial function and bile salt export pump.
Hepatology (Baltimore, Md) 60:1015-22

ANZFA. 2001. Pyrrolizidine alkaloids in food. A Toxicological Review and
Risk Assessment. ed. Authority, ANZF, pp. 1-16

Armstrong SJ, Zuckerman AJ, Bird RG. 1972. Induction of morphological
changes in human embryo liver cells by the pyrrolizidine alkaloid
lasiocarpine. British journal of experimental pathology 53:145-9

Barysz M, Jashari G, Lall RS, Srivastava AK, Trinajstic N. 1983. On the
distance matrix of molecules containing heteroatoms. In *Chemical
Applications of Topology and Graph Theory*, pp. 222-30. Amsterdam, The
Netherlands: Elsevier

Basak SC, Harriss DK, Magnuson VR. Comparative Study of Lipophilicity
*versus* Topological Molecular Descriptors in Biological Correlations.
Journal of Pharmaceutical Sciences 73:429-37

Bender A, Mussa HY, Glen RC, Reiling S. 2004. Molecular similarity
searching using atom environments, information-based feature selection,
and a naive Bayesian classifier. J Chem Inf Comput Sci 44:170-8

Benichou C, Danan G, Flahault A. 1993. Causality assessment of adverse
reactions to drugs\--II. An original model for validation of drug
causality assessment methods: case reports with positive rechallenge. J
Clin Epidemiol 46:1331-6

Bergmeir C, Benítez JM. 2012. Neural Networks in R Using the Stuttgart
Neural Network Simulator: RSNNS. Journal of Statistical Software 46:1-26

Bishop-Bailey D, Thomson S, Askari A, Faulkner A, Wheeler-Jones C. 2014.
Lipid-metabolizing CYPs in the regulation and dysregulation of
metabolism. Annu Rev Nutr 34:261-79

Blower PE, Cross KP. 2006. Decision Tree Methods in Pharmaceutical
Research. Current topics in medicinal chemistry 6:31-9

Boelsterli UA, Lee KK. 2014. Mechanisms of isoniazid-induced
idiosyncratic liver injury: emerging role of mitochondrial stress.
Journal of gastroenterology and hepatology 29:678-87

Bramer M. 2013. Principles of Data Mining. p. 444: Springer-Verlag

Breiman L. 2001. Random Forests. Machine Learning 45:5-32

Breiman L. 2003. Manual-Setting Up, Using, And Understanding Random
Forests V4.0.1-33

Bull LB, Dick AT. 1959. The chronic pathological effects on the liver of
the rat of the pyrrolizidine alkaloids heliotrine, lasiocarpine and
their N-oxides. J Path Bact 78:483-502

Bull LB, Dick AT, McKenzie JS. 1958. The acute toxic effects of
heliotrine and lasiocarpine, and their N-oxides, on the rat. J Path Bact
75:17-25

Burden FR. 1989. Molecular identification number for
substructure searches. Journal of Chemical Information and Computer
Sciences 29:225-7

Butler WH, Mattocks AR, Barnes JM. 1970. Lesions in the liver and lungs
of rats given pyrrole derivatives of pyrrolizidine alkaloids. J Path
100:169-75

Chai J, He Y, Cai SY, Jiang Z, Wang H, et al. 2012. Elevated hepatic
multidrug resistance-associated protein 3/ATP-binding cassette subfamily
C 3 expression in human obstructive cholestasis is mediated through
tumor necrosis factor alpha and c-Jun NH2-terminal
kinase/stress-activated protein kinase-signaling pathway. Hepatology
55:1485-94

Chalhoub WM, Sliman KD, Arumuganathan M, Lewis JH. 2014. Drug-induced
liver injury: what was new in 2013? Expert Opin Drug Metab Toxicol
10:959-80

Chawla NV, Bowyer KW, Hall LO. 2002. SMOTE: Synthetic Minority
Over-sampling Technique. Journal of Artificial Intelligence Research
16:321-57

Chen M, Borlak J, Tong W. 2013. High lipophilicity and high daily dose
of oral medications are associated with significant risk for
drug-induced liver injury. Hepatology (Baltimore, Md) 58:388-96

Chen M, Suzuki A, Thakkar S, Yu K, Hu C, Tong W. 2016. DILIrank: the
largest reference drug list ranked by the risk for developing
drug-induced liver injury in humans. Drug Discov Today 21:648-53

Chen T, Mei N, Fu PP. 2010. Genotoxicity of pyrrolizidine alkaloids. J
Appl Toxicol 30:183-96

Crabtree HG. 1928. The carbohydrate metabolism of certain pathological
overgrowths. Biochem J 22:1289-98

Daly AK, Donaldson PT, Bhatnagar P, Shen Y, Pe\'er I, et al. 2009.
HLA-B\*5701 genotype is a major determinant of drug-induced liver injury
due to flucloxacillin. Nature genetics 41:816-9

Danan G, Benichou C. 1993. Causality assessment of adverse reactions to
drugs\--I. A novel method based on the conclusions of international
consensus meetings: application to drug-induced liver injuries. J Clin
Epidemiol 46:1323-30

Dar AC, Shokat KM. 2011. The evolution of protein kinase inhibitors from
antagonists to agonists of cellular signaling. Annu Rev Biochem
80:769-95

de Wildt SN, Kearns GL, Leeder JS, van den Anker JN. 1999. Cytochrome
P450 3A: ontogeny and drug disposition. Clin Pharmacokinet 37:485-505

DeLeve LD, Ito Y, Bethea NW, McCuskey MK, Wang X, McCuskey RS. 2003.
Embolization by sinusoidal lining cells obstructs the microcirculation
in rat sinusoidal obstruction syndrome. Am J Physiol Gastrointest Liver
Physiol 284:G1045-G52

DeLeve LD, Wang X, Kuhlenkamp JF, Kaplowitz N. 1996. Toxicity of
Azathioprine and Monocrotaline in Murine Sinusoidal Endothelial Cells
and Hepatocytes: The Role of Glutathione and Relevance to Hepatic
Venoocclusive Disease. Hepatology 23:589-99

Dong H, Haining RL, Thummel KE, Rettie AE, Nelson SD. 2000. Involvement
of human cytochrome P450 2D6 in the bioactivation of acetaminophen. Drug
Metab Dispos 28:1397-400

Doostdar H, Grant MH, Melvin WT, Wolf CR, Burke MD. 1993. The effects of
inducing agents on cytochrome P450 and UDP-glucuronyltransferase
activities in human HEPG2 hepatoma cells. Biochemical pharmacology
46:629-35

EFSA. 2011. Scientific Opinion on Pyrrolizidine alkaloids in food and
feed. EFSA Journal 9:1-134

Ekins S, Williams AJ, Xu JJ. 2010. A predictive ligand-based Bayesian
model for human drug-induced liver injury. Drug Metab. Dispos.
38:2302-8

EMA. 2014. EMA/HMPC/893108/2011: Public statement on the use of
herbal medicinal products containing toxic, unsaturated pyrrolizidine
alkaloids (PAs). 1-24

EMA. 2016. EMA/HMPC/328782/2016: Public statement on contamination of
herbal medicinal products/traditional herbal medicinal products with
pyrrolizidine alkaloids. 1-11

Fashe MM, Juvonen RO, Petsalo A, Vepsalainen J, Pasanen M,
Rahnasto-Rilla M. 2015. In silico prediction of the site of oxidation by
cytochrome P450 3A4 that leads to the formation of the toxic metabolites
of pyrrolizidine alkaloids. Chem Res Toxicol 28:702-10

Field RA, Stegelmeier BL, Colegate SM, Brown AW, Green BT. 2015. An in
vitro comparison of the cytotoxic potential of selected
dehydropyrrolizidine alkaloids and some N-oxides. Toxicon 97:36-45

Fleming I. 2014. The pharmacology of the cytochrome P450
epoxygenase/soluble epoxide hydrolase axis in the vasculature and
cardiovascular disease. Pharmacol Rev 66:1106-40

Fonti V. 2017. *Feature Selection using LASSO*. Research paper. VU
Amsterdam. 26 pp.

Fu PP, Chou MW, Churchwell M, Wang Y, Zhao Y, et al. 2010.
High-Performance Liquid Chromatography Electrospray Ionization Tandem
Mass Spectrometry for the Detection and Quantitation of Pyrrolizidine
Alkaloid-Derived DNA Adducts in Vitro and in Vivo. Chem Res Toxicol
23:637-52

Fu PP, Xia Q, Lin G, Chou MW. 2004. Pyrrolizidine
alkaloids\--genotoxicity, metabolism enzymes, metabolic activation, and
mechanisms. Drug Metab Rev 36:1-55

Galeotti N, Vivoli E, Bilia AR, Vincieri FF, Ghelardini C. 2010. St.
John\'s wort reduces neuropathic pain through a hypericin-mediated
inhibition of the protein kinase Cgamma and epsilon activity. Biochem
Pharmacol 79:1327-36

Ganesan S, Tekwani BL, Sahu R, Tripathi LM, Walker LA. 2009. Cytochrome
P(450)-dependent toxic effects of primaquine on human erythrocytes.
Toxicol Appl Pharmacol 241:14-22

Gao H, Ruan JQ, Chen J, Li N, Ke CQ, et al. 2015. Blood pyrrole-protein
adducts as a diagnostic and prognostic index in pyrrolizidine
alkaloid-hepatic sinusoidal obstruction syndrome. Drug Des Devel Ther
9:4861-8

Gitlin N. 1980. Salicylate hepatotoxicity: the potential role of
hypoalbuminemia. J Clin Gastroenterol 2:281-5

Gordon GJ, Coleman WB, Grisham JW. 2000. Bax-mediated apoptosis in the
livers of rats after partial hepatectomy in the retrorsine model of
hepatocellular injury. Hepatology 32:312-20

Gradhand U, Lang T, Schaeffeler E, Glaeser H, Tegude H, et al. 2008.
Variability in human hepatic MRP4 expression: influence of cholestasis
and genotype. Pharmacogenomics J 8:42-52

Gramatica P, Corradi M, Consonni V. 2000. Modelling and prediction of
soil sorption coefficients of non-ionic organic pesticides by molecular
descriptors. Chemosphere 41:763-77

Greene N, Fisk L, Naven RT, Note RR, Patel ML, Pelletier DJ. 2010.
Developing structure-activity relationships for the prediction of
hepatotoxicity. Chemical Research in Toxicology 23:1215-22

Guo YX, Xu XF, Zhang QZ, Li C, Deng Y, et al. 2015. The inhibition of
hepatic bile acids transporters Ntcp and Bsep is involved in the
pathogenesis of isoniazid/rifampicin-induced hepatotoxicity. Toxicology
mechanisms and methods 25:382-7

Hall LH, Kier LB. 1995. Electrotopological State Indices for Atom Types:
A Novel Combination of Electronic, Topological, and Valence State
Information. Journal of Chemical Information and Computer Sciences
35:1039-45

Hammann F, Schoning V, Drewe J. 2018. Prediction of clinically relevant
drug-induced liver injury from structure using machine learning. J Appl
Toxicol

Hansen K,
Mika S, Schroeter T, Sutter A, ter Laak A, et al. 2009. Benchmark data
set for in silico prediction of Ames mutagenicity. J Chem Inf Model
49:2077-81

Hartmann T, Ehmke A, Eilert U, von Borstel K, Theuring C. 1989. Sites of
synthesis, translocation and accumulation of pyrrolizidine alkaloid
N-oxides in Senecio vulgaris L. Planta 177:98-107

Hartmann T, Witte L. 1995. Chemistry, Biology and Chemoecology of the
Pyrrolizidine Alkaloids. In *Alkaloids: Chemical and Biological
Perspectives*, ed. Pelletier, pp. 155-233. Pergamon, London, New York

Hessel S, Gottschalk C, Schumann D, These A, Preiss-Weigert A, Lampen A.
2014. Structure-activity relationship in the passage of different
pyrrolizidine alkaloids through the gastrointestinal barrier: ABCB1
excretes heliotrine and echimidine. Mol Nutr Food Res 58:995-1004

Hunt CM, Westerkam WR, Stave GM. 1992. Effect of age and gender on the
activity of human hepatic CYP3A. Biochemical pharmacology 44:275-83

Ibanez L, Perez E, Vidal X, Laporte JR, Grup d\'Estudi Multicenteric
d\'Hepatotoxicitat Aguda de B. 2002. Prospective surveillance of acute
serious liver disease unrelated to infectious, obstructive, or metabolic
diseases: epidemiological and clinical features, and exposure to drugs.
J Hepatol 37:592-600

ICH. 2011. Guidance on genotoxicity testing and data interpretation for
pharmaceuticals intended for human use S2(R1). p. 29

Iyer VV, Yang H, Ierapetritou MG, Roth CM. 2010. Effects of glucose and
insulin on HepG2-C3A cell metabolism. Biotechnol Bioeng 107:347-56

Jago MV. 1971. Factors affecting the chronic hepatotoxicity of
pyrrolizidine alkaloids. The Journal of Pathology 105:1-11

Jeon JY, Sparreboom A, Baker SD. 2017. Kinase Inhibitors: The Reality
Behind the Success. Clin Pharmacol Ther 102:726-30

Jeong W, Doroshow JH, Kummar S. 2013. United States Food and Drug
Administration approved oral kinase inhibitors for the treatment of
malignancies. Curr Probl Cancer 37:110-44

Ji L, Chen Y, Liu T, Wang Z. 2008. Involvement of Bcl-xL degradation and
mitochondrial-mediated apoptotic pathway in pyrrolizidine
alkaloids-induced apoptosis in hepatocytes. Toxicol Appl Pharmacol
231:393-400

Jornil J, Nielsen TS, Rosendal I, Ahlner J, Zackrisson AL, et al. 2013.
A poor metabolizer of both CYP2C19 and CYP2D6 identified by mechanistic
pharmacokinetic simulation in a fatal drug poisoning case involving
venlafaxine. Forensic Sci Int 226:e26-31

Kalthoff S, Ehmer U, Freiberg N, Manns MP, Strassburg CP. 2010.
Interaction between oxidative stress sensor Nrf2 and
xenobiotic-activated aryl hydrocarbon receptor in the regulation of the
human phase II detoxifying UDP-glucuronosyltransferase 1A10. J Biol Chem
285:5993-6002

Kazius J, McGuire R, Bursi R. 2005. Derivation and validation of
toxicophores for mutagenicity prediction. J Med Chem 48:312-20

Khan D, Khan AU. 2016. Descriptors and their selection methods in QSAR
analysis: paradigm for drug design. Drug Discov Today 21:1291-302

Kim HY, Stermitz FR, Molyneux RJ, Wilson DW, Taylor D, Coulombe RA, Jr.
1993. Structural influences on pyrrolizidine alkaloid-induced
cytopathology. Toxicol Appl Pharmacol 122:61-9

Kock K, Ferslew BC, Netterberg I, Yang K, Urban TJ, et al. 2014. Risk
factors for development of cholestatic drug-induced liver injury:
inhibition of hepatic basolateral bile acid transporters multidrug
resistance-associated proteins 3 and 4. Drug Metab Dispos 42:665-74

Lammert C, Einarsson S, Saha C, Niklasson A, Bjornsson E, Chalasani N.
2008. Relationship between daily dose of oral medications and
idiosyncratic drug-induced liver injury: search for signals. Hepatology
47:2003-9

Langel D, Ober D, Pelser PB. 2011. The evolution of
pyrrolizidine alkaloid biosynthesis and diversity in the Senecioneae.
Phytochemistry Reviews 10:3-74

Lasser KE, Allen PD, Woolhandler SJ, Himmelstein DU, Wolfe SM, Bor DH.
2002. Timing of new black box warnings and withdrawals for prescription
medications. JAMA 287:2215-20

Li N, Xia Q, Ruan J, Fu PP, Lin G. 2011. Hepatotoxicity and
Tumorigenicity Induced by Metabolic Activation of Pyrrolizidine
Alkaloids in Herbs. Current Drug Metabolism 12

Li X, Cameron MD. 2012. Potential role of a quetiapine metabolite in
quetiapine-induced neutropenia and agranulocytosis. Chem Res Toxicol
25:1004-11

Li YH, Kan WL, Li N, Lin G. 2013. Assessment of pyrrolizidine
alkaloid-induced toxicity in an in vitro screening model. J
Ethnopharmacol 150:560-7

Lima A, Bernardes M, Azevedo R, Medeiros R, Seabra V. 2015.
Pharmacogenomics of Methotrexate Membrane Transport Pathway: Can
Clinical Response to Methotrexate in Rheumatoid Arthritis Be Predicted?
Int J Mol Sci 16:13760-80

Lin G. 1998. Microsomal Formation of a Pyrrolic Alcohol Glutathione
Conjugate of Clivorine. Firm Evidence for the Formation of a Pyrrolic
Metabolite of an Otonecine-Type Pyrrolizidine Alkaloid. Drug Metabolism
and Disposition 26:181-4

Lindigkeit R, Biller A, Buch M, Schiebel H-M, Boppré M, Hartmann T.
1997. The two faces of pyrrolizidine alkaloids: the role of the tertiary
amine and its N-oxide in chemical defense of insects with acquired plant
alkaloids. Eur J Biochem 245

Makhlouf HA, Helmy A, Fawzy E, El-Attar M, Rashed HA. 2008. A
prospective study of antituberculous drug-induced hepatotoxicity in an
area endemic for liver diseases. Hepatol Int 2:353-60

Marin-Hernandez A, Rodriguez-Enriquez S, Vital-Gonzalez PA,
Flores-Rodriguez FL, Macias-Silva M, et al. 2006. Determining and
understanding the control of glycolysis in fast-growth tumor cells. Flux
control by an over-expressed but strongly product-inhibited hexokinase.
FEBS J 273:1975-88

Marroquin LD, Hynes J, Dykens JA, Jamieson JD, Will Y. 2007.
Circumventing the Crabtree effect: replacing media glucose with
galactose increases susceptibility of HepG2 cells to mitochondrial
toxicants. Toxicol Sci 97:539-47

Mattocks AR. 1986. *Chemistry and Toxicology of Pyrrolizidine
Alkaloids*: Academic Press

Meharena HS, Chang P, Keshwani MM, Oruganty K, Nene AK, et al. 2013.
Deciphering the structural basis of eukaryotic protein kinase
regulation. PLoS Biol 11:e1001680

Merz KH, Schrenk D. 2016. Interim relative potency factors for the
toxicological risk assessment of pyrrolizidine alkaloids in food and
herbal medicines. Toxicol Lett 263:44-57

Miners JO, Birkett DJ. 1998. Cytochrome P4502C9: an enzyme of major
importance in human drug metabolism. British Journal of Clinical
Pharmacology 45:525-38

Mingard C, Paech F, Bouitbir J, Krahenbuhl S. 2018. Mechanisms of
toxicity associated with six tyrosine kinase inhibitors in human
hepatocyte cell lines. J Appl Toxicol 38:418-31

Mingatto FE, Dorta DJ, dos Santos AB, Carvalho I, da Silva CH, et al.
2007. Dehydromonocrotaline inhibits mitochondrial complex I. A potential
mechanism accounting for hepatotoxicity of monocrotaline. Toxicon
50:724-30

Mitchell JB. 2014. Machine learning methods in
chemoinformatics. Wiley Interdiscip Rev Comput Mol Sci 4:468-81Morgan
RE, Trauner M, van Staden CJ, Lee PH, Ramachandran B, et al. 2010.
Interference with bile salt export pump function is a susceptibility
factor for human liver injury in drug development. Toxicol Sci
118:485-500Muegge I, Mukherjee P. 2016. An overview of molecular
fingerprint similarity search in virtual screening. Expert Opin Drug
Discov 11:137-48Najibi A, Heidari R, Zarifi J, Jamshidzadeh A,
Firoozabadi N, Niknahad H. 2016. Evaluating the Role of Drug Metabolism
and Reactive Intermediates in Trazodone-Induced Cytotoxicity toward
Freshly-Isolated Rat Hepatocytes. Drug Res (Stuttg) 66:592-6Nantasenamat
C, Isarankura-Na-Ayudhya C, Naenna T, Prachayasittikul V. 2009. A
Practical Overview of Quantitative Structure-Activity Relationship.
EXCLI Journal 8:74-88

National Cancer Institute. 2006. Common Terminology
Criteria for Adverse Events v3.0 (CTCAE). ed. Program, CTE

Neumann MG,
Cohen LB, Opris M, Nanau R, Jeong H. 2015. Hepatotoxicity of
Pyrrolizidine Alkaloids. J Pharm Pharm Sci 18:825-43

Newby D, Freitas AA,
Ghafourian T. 2015. Decision trees to characterise the roles of
permeability and solubility on the prediction of oral absorption. Eur J
Med Chem 90:751-65

Niederer C, Behra R, Harder A, Schwarzenbach RP,
Escher BI. 2004. Mechanistic approaches for evaluating the toxicity of
reactive organochlorines and epoxides in green algae. Environmental
Toxicology and Chemistry 23:697-704

NTP. 1978. Bioassay of lasiocarpine
for possible carcinogenicity. pp. 1-82

NTP. 2003. Toxicology and
Carcinogenesis Studies of Riddelliine (CAS No. 23246-96-0) in F344/N
Rats And B6C3F~1~ Mice (Gavage Studies). ed. Health, NIo

O\'Boyle NM,
Banck M, James CA, Morley C, Vandermeersch T, Hutchison GR. 2011. Open
Babel: An open chemical toolbox. J Cheminform 3:33

Open Babel community.
2011. *Molecular fingerprints and similarity searching --- Open Babel
v2.3.1 documentation. Openbabel.org*. , December 31, 2018

Paech F,
Bouitbir J, Krahenbuhl S. 2017. Hepatocellular Toxicity Associated with
Tyrosine Kinase Inhibitors: Mitochondrial Damage and Inhibition of
Glycolysis. Front Pharmacol 8:367

Parkinson A, Mudra DR, Johnson C, Dwyer
A, Carroll KM. 2004. The effects of gender, age, ethnicity, and liver
cirrhosis on cytochrome P450 enzyme activity in human liver microsomes
and inducibility in cultured human hepatocytes. Toxicol Appl Pharmacol
199:193-209

Pellinen P, Honkakoski P, Stenback F, Niemitz M, Alhava E, et
al. 1994. Cocaine N-demethylation and the metabolism-related
hepatotoxicity can be prevented by cytochrome P450 3A inhibitors. Eur J
Pharmacol 270:35-43

Regev A, Seeff LB, Merz M, Ormarsdottir S, Aithal GP,
et al. 2014. Causality assessment for suspected DILI during clinical
phases of drug development. Drug Saf 37 Suppl 1:S47-56

Rendic S. 2002.
Summary of information on human CYP enzymes: human P450 metabolism data.
Drug Metab Rev 34:83-448

Reuben A, Koch DG, Lee WM, Acute Liver Failure
Study G. 2010. Drug-induced acute liver failure: results of a U.S.
multicenter, prospective study. Hepatology 52:2065-76

Rodrigues AC. 2010.
Efflux and uptake transporters as determinants of statin response.
Expert Opin Drug Metab Toxicol 6:621-32

Roskoski R, Jr. 2015. A
historical overview of protein kinases and their targeted small molecule
inhibitors. Pharmacol Res 100:1-23

Ruan J, Liao C, Ye Y, Lin G. 2014a.
Lack of metabolic activation and predominant formation of an excreted
metabolite of nontoxic platynecine-type pyrrolizidine alkaloids. Chem
Res Toxicol 27:7-16

Ruan J, Yang M, Fu P, Ye Y, Lin G. 2014b. Metabolic
activation of pyrrolizidine alkaloids: insights into the structural and
enzymatic basis. Chem Res Toxicol 27:1030-9

Rubiolo P, Pieters L, Calomme
M, Bicchi C, Vlietinck A, Vanden Berghe D. 1992. Mutagenicity of
pyrrolizidine alkaloids in the Salmonella typhimurium/mammalian
microsome system. Mutat Res 281:143-7

Rücker C, Rücker G, Meringer M.
2007. y-Randomization and Its Variants in QSPR/QSAR. J. Chem. Inf.
Model. 47:2345-57

Schoental R, Head MA. 1957. Progression of liver
lesions produced in rats by temporary treatment with pyrrolizidine
(senecio) alkaloids, and the effects of betaine and high casein diet. Br
J Cancer 11:535-44

Schöning V, Hammann F, Peinl M, Drewe J. 2017.
Editor\'s Highlight: Identification of Any Structure-Specific
Hepatotoxic Potential of Different Pyrrolizidine Alkaloids Using Random
Forests and Artificial Neural Networks. Toxicol Sci 160:361-70

Shah RR,
Morganroth J, Shah DR. 2013. Hepatotoxicity of tyrosine kinase
inhibitors: clinical and regulatory perspectives. Drug Saf
36:491-503

Spjuth O, Alvarsson J, Berg A, Eklund M, Kuhn S, et al. 2009.
Bioclipse 2: A scriptable integration platform for the life sciences.
BMC Bioinformatics 10:1-5

Spjuth O, Helmus T, Willighagen EL, Kuhn S,
Eklund M, et al. 2007. Bioclipse: an open source workbench for chemo-
and bioinformatics. BMC Bioinformatics 8:1-10

Srinivas N, Sandeep KS,
Anusha Y, Devendra BN. 2014. In Vitro Cytotoxic Evaluation and
Detoxification of Monocrotaline (Mct) Alkaloid: An In Silico Approach.
International Invention Journal of Biochemistry and Bioinformatics
2:20-9

Stine JG, Chalasani NP. 2017. Drug Hepatotoxicity: Environmental
Factors. Clin Liver Dis 21:103-13

Stine JG, Lewis JH. 2011. Drug-induced
liver injury: a summary of recent advances. Expert Opin Drug Metab
Toxicol 7:875-90

Takanashi H, Umeda M, Hirono I. 1980. Chromosomal
aberrations and mutations in cultured mammalian cells induced by
pyrrolizidine alkaloids. Mutation Research 78:67-77

Takeda M, Okamoto I,
Nakagawa K. 2015. Pooled safety analysis of EGFR-TKI treatment for EGFR
mutation-positive non-small cell lung cancer. Lung Cancer 88:74-9

Tamta
H, Pawar RS, Wamer WG, Grundel E, Krynitsky AJ, Rader JI. 2012.
Comparison of metabolism-mediated effects of pyrrolizidine alkaloids in
a HepG2/C3A cell-S9 co-incubation system and quantification of their
glutathione conjugates. Xenobiotica 42:1038-48

Teh LK, Bertilsson L.
2012. Pharmacogenomics of CYP2D6: molecular genetics, interethnic
differences and clinical importance. Drug Metab Pharmacokinet
27:55-67

Teo YL, Ho HK, Chan A. 2013. Risk of tyrosine kinase
inhibitors-induced hepatotoxicity in cancer patients: a meta-analysis.
Cancer Treat Rev 39:199-206

Teo YL, Ho HK, Chan A. 2015. Formation of
reactive metabolites and management of tyrosine kinase inhibitor-induced
hepatotoxicity: a literature review. Expert Opin Drug Metab Toxicol
11:231-42

Thompson RA, Isin EM, Ogese MO, Mettetal JT, Williams DP. 2016.
Reactive Metabolites: Current and Emerging Risk and Hazard Assessments.
Chem Res Toxicol 29:505-33

Walker K, Ginsberg G, Hattis D, Johns DO,
Guyton KZ, Sonawane B. 2009. Genetic polymorphism in N-Acetyltransferase
(NAT): Population distribution of NAT1 and NAT2 activity. Journal of
toxicology and environmental health. Part B, Critical reviews
12:440-72

Wang YP, Yan J, Fu PP, Chou MW. 2005. Human liver microsomal
reduction of pyrrolizidine alkaloid N-oxides to form the corresponding
carcinogenic parent alkaloid. Toxicol Lett 155:411-20

Weininger D. 1988.
SMILES, a chemical language and information system. 1. Introduction to
methodology and encoding rules. J Chem Inf Comput Sci 28:31-6

Westerink
WM, Schoonen WG. 2007. Phase II enzyme levels in HepG2 cells and
cryopreserved primary human hepatocytes and their induction in HepG2
cells. Toxicol In Vitro 21:1592-602

Wu P, Nielsen TE, Clausen MH. 2015.
FDA-approved small-molecule kinase inhibitors. Trends Pharmacol Sci
36:422-39

Xia Q, Ma L, He X, Cai L, Fu PP. 2015. 7-glutathione pyrrole
adduct: a potential DNA reactive metabolite of pyrrolizidine alkaloids.
Chem Res Toxicol 28:615-20

Xia Q, Zhao Y, Von Tungeln LS, Doerge DR, Lin
G, et al. 2013. Pyrrolizidine alkaloid-derived DNA adducts as a common
biological biomarker of pyrrolizidine alkaloid-induced tumorigenicity.
Chem Res Toxicol 26:1384-96

Yan J, Xia Q, Chou MW, Fu P. 2008. Metabolic
activation of retronecine and retronecine N-oxide -- formation of
DHP-derived DNA adducts. Toxicology and Industrial Health 24

Yang X, Li
W, Sun Y, Guo X, Huang W, et al. 2017. Comparative Study of
Hepatotoxicity of Pyrrolizidine Alkaloids Retrorsine and Monocrotaline.
Chem Res Toxicol 30:532-9

Yap CW. 2011. PaDEL-descriptor: an open source
software to calculate molecular descriptors and fingerprints. Journal of
computational chemistry 32:1466-74

Yap CW. 2014. *Descriptors*. ,
27.10.2016

Yu K, Geng X, Chen M, Zhang J, Wang B, et al. 2014a. High
daily dose and being a substrate of cytochrome P450 enzymes are two
important predictors of drug-induced liver injury. Drug Metab Dispos
42:744-50

Yu K, Geng X, Chen M, Zhang J, Wang B, et al. 2014b. High daily
dose and being a substrate of cytochrome P450 enzymes are two important
predictors of drug-induced liver injury. Drug Metab. Dispos.
42:744-50

Zanger UM, Turpeinen M, Klein K, Schwab M. 2008. Functional
pharmacogenetics/genomics of human cytochromes P450 involved in drug
biotransformation. Anal Bioanal Chem 392:1093-108

Zhang J, Sheng Y, Shi
L, Zheng Z, Chen M, et al. 2017. Quercetin and baicalein suppress
monocrotaline-induced hepatic sinusoidal obstruction syndrome in rats.
Eur J Pharmacol 795:160-8

Zhao Y, Xia Q, Gamboa da Costa G, Yu H, Cai L,
Fu PP. 2012. Full structure assignments of pyrrolizidine alkaloid DNA
adducts and mechanism of tumor initiation. Chem Res Toxicol
25:1985-96

Zheng Z, Shi L, Sheng Y, Zhang J, Lu B, Ji L. 2016.
Chlorogenic acid suppresses monocrotaline-induced sinusoidal obstruction
syndrome: The potential contribution of NFkappaB, Egr1, Nrf2, MAPKs and
PI3K signals. Environ Toxicol Pharmacol 46:80-9

Zhu XW, Xin YJ, Ge HL.
2015. Recursive Random Forests Enable Better Predictive Performance and
Model Interpretation than Variable Selection by LASSO. J Chem Inf Model
55:736-46