Computational Chemistry, Contributed Talk (15min)
CC-014

Predicting chemical toxicity through machine learning

C. Schür1, L. Gasser2, J. Wu3, F. Perez-Cruz2, K. Schirmer1*, M. Baity-Jesi3*
1Department of Environmental Toxicology, Eawag, Swiss Federal Institute of Aquatic Science and Technology, 2Swiss Data Science Center, 3Department of Systems Analysis, Integrated Assessment and Modelling, Eawag, Swiss Federal Institute of Aquatic Science and Technology

The number of registered chemicals on the global market is growing rapidly. Meanwhile, regulating these compounds requires extensive animal testing to ensure their safety for both the public and the environment. Beyond its ethical implications, animal testing is labor-, cost-, and time-intensive. New approach methods can offer a remedy by reducing, if not ultimately replacing, conventional animal tests. Increases in computational power and in the accessibility of machine learning methods enable the use of in silico methods for ecotoxicological questions such as the hazard assessment of chemicals. Our research aims to accurately predict the toxicity of chemicals to a wide range of taxa beyond model organisms and to explore the potential and limitations of this approach. To this end, we assembled an extensive dataset of ecotoxicity data covering over 1000 species across three trophic levels (fish, crustaceans, and algae). This core dataset was enhanced by adding species-specific and phylogenetic information, and we compared several ways to represent chemical structure and properties (molecular fingerprints, mol2vec). We explore the importance of adding certain feature groups and put their individual relevance in the context of biology to better understand the drivers of toxicity and to identify which supplementary experiments could help reduce cost and harm.
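As a rough illustration of what combining and switching off feature groups can look like, the following minimal sketch concatenates a chemical representation with a taxonomic group encoding. It is not the pipeline used in this work. The function names (toy_fingerprint, one_hot_taxon, build_features), the 64-bit vector length, and the substring hashing scheme are all hypothetical; the hashed substrings are only a stand-in for a real molecular fingerprint and carry no chemical meaning.

```python
import hashlib

# Hypothetical taxonomic groups, mirroring the three groups named in the abstract.
TAXA = ("fish", "crustacean", "algae")


def toy_fingerprint(smiles, n_bits=64, max_len=3):
    """Hash each substring of up to max_len characters into a bit vector.

    Assumption: this is a toy placeholder for a real molecular fingerprint.
    It treats the SMILES string as plain text and ignores chemistry.
    """
    bits = [0] * n_bits
    for size in range(1, max_len + 1):
        for start in range(len(smiles) - size + 1):
            fragment = smiles[start:start + size]
            digest = hashlib.md5(fragment.encode("utf-8")).digest()
            # Use the first four bytes of the digest to pick a bit position.
            bits[int.from_bytes(digest[:4], "big") % n_bits] = 1
    return bits


def one_hot_taxon(taxon):
    """Encode the taxonomic group as a one-hot vector over TAXA."""
    return [1 if taxon == known else 0 for known in TAXA]


def build_features(smiles, taxon, use_chemistry=True, use_taxon=True):
    """Concatenate the enabled feature groups into one flat feature vector."""
    features = []
    if use_chemistry:
        features += toy_fingerprint(smiles)
    if use_taxon:
        features += one_hot_taxon(taxon)
    return features


# Full feature vector versus one with the chemistry group switched off.
full = build_features("CCO", "fish")
taxon_only = build_features("CCO", "fish", use_chemistry=False)
print(len(full), len(taxon_only))
```

Training the same model on vectors built with different groups enabled, and comparing the prediction error, is one simple way to gauge how much each feature group contributes.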