Predictability of drug-induced liver injury by machine learning

Bibliographic Details
Published in	Biology direct Vol. 15; no. 1; pp. 3 - 10
Main Authors	Chierici, Marco; Francescatto, Margherita; Bussola, Nicole; Jurman, Giuseppe; Furlanello, Cesare
Format	Journal Article
Language	English
Published	London: BioMed Central, 13.02.2020 (publisher also listed as BioMed Central Ltd; Springer Nature B.V; BMC)
Subjects	Accuracy; Adverse drug reactions; Algorithms; Bioinformatics; Biomedical and Life Sciences; Biotechnology; Cancer; Classification; CMap; Correlation coefficient; Correlation coefficients; Data analysis; Deep learning; Diagnosis; DILI; Drug development; Experiments; Gene expression; Health risk assessment; Hepatotoxicity; Information management; Learning algorithms; Life Sciences; Liver; Liver diseases; Machine learning; Microarray; Multilayers; Pharmacovigilance; Proceedings of the Critical Assessment of Massive Data Analysis (CAMDA) Satellite Meeting to ISMB 2018; Product safety; Risk factors; Tumor cell lines; Italy; United States
ISSN	1745-6150
DOI	10.1186/s13062-020-0259-4

Summary:	Background: Drug-induced liver injury (DILI) is a major concern in drug development, as hepatotoxicity may not be apparent at early stages but can lead to life-threatening consequences. The ability to predict DILI from in vitro data would be a crucial advantage. In 2018, the Critical Assessment of Massive Data Analysis (CAMDA) group proposed the CMap Drug Safety challenge, focusing on DILI prediction. Methods and results: The challenge data included Affymetrix GeneChip expression profiles for the two cancer cell lines MCF7 and PC3 treated with 276 drug compounds and empty vehicles. Binary DILI labeling and a recommended train/test split for the development of predictive classification approaches were also provided. We devised three deep learning architectures for DILI prediction on the challenge data and compared them to random forest and multi-layer perceptron classifiers. On a subset of the data and for some of the models, we additionally tested several strategies for balancing the two DILI classes and for identifying alternative informative train/test splits. All the models were trained with the MAQC data analysis protocol (DAP), i.e., 10×5 cross-validation (5-fold cross-validation repeated 10 times) over the training set. In all the experiments, the classification performance in both cross-validation and external validation gave Matthews correlation coefficient (MCC) values below 0.2. We observed minimal differences between the two cell lines. Notably, deep learning approaches did not improve classification performance. Discussion: We extensively tested multiple machine learning approaches for the DILI classification task, obtaining poor to mediocre performance. The results suggest that the CMap expression data on the two cell lines MCF7 and PC3 are not sufficient for accurate DILI label prediction. Reviewers: This article was reviewed by Maciej Kandula and Paweł P. Labaj.
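
The evaluation scheme named in the summary (repeated cross-validation scored with MCC) can be illustrated with a minimal sketch. This is not the authors' code: the function names `mcc` and `repeated_stratified_folds`, the stratified fold assignment, the toy labels, and the majority-class baseline are all assumptions made for illustration. The sketch uses only the Python standard library.

```python
import math
import random
from collections import defaultdict


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0 when a marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def repeated_stratified_folds(labels, n_repeats=10, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs for n_repeats x n_folds CV.

    Stratification is an assumption of this sketch: each class is
    shuffled and dealt round-robin so folds keep the class ratio.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    for _ in range(n_repeats):
        folds = [[] for _ in range(n_folds)]
        for members in by_class.values():
            shuffled = members[:]
            rng.shuffle(shuffled)
            for pos, i in enumerate(shuffled):
                folds[pos % n_folds].append(i)
        for k in range(n_folds):
            test = sorted(folds[k])
            train = sorted(i for j, f in enumerate(folds) if j != k for i in f)
            yield train, test


# Toy labels (hypothetical): 20 positives, 30 negatives.
labels = [1] * 20 + [0] * 30
scores = []
for train, test in repeated_stratified_folds(labels):
    # Baseline "model": always predict the majority class of the training fold.
    n_pos = sum(labels[i] for i in train)
    majority = 1 if n_pos > len(train) - n_pos else 0
    scores.append(mcc([labels[i] for i in test], [majority] * len(test)))

print(len(scores), sum(scores) / len(scores))
```

A constant predictor scores an MCC of 0 in every fold, which is the reference point for reading the reported values below 0.2: MCC ranges from -1 to 1, and 0 corresponds to chance-level prediction even when the two classes are unbalanced.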