
Collaborative filtering for species-chemicals toxicology prediction

This PhD project aims to advance the use of machine learning models, particularly Graph Neural Networks (GNNs) and Large Language Models (LLMs), for predicting the toxicity of pesticides and fertilisers in agriculture.

Project Outline

Graph Neural Networks [1] have recently been shown to offer an alternative to in vivo testing of fertilisers and pesticides: they model chemical-species interactions by capturing LC50 concentrations and their effect on individual species [2]. In agriculture, knowing which pesticides [4] are lethal, poisonous or otherwise dangerous to various species, and how they affect species growth and human consumption, could improve productivity. However, the first model [1] is restricted to aquatic species, and it represents its broad set of chemicals simply by fingerprints and its species by word embeddings. Large Language Models (LLMs) and Foundation Models (FMs) open up an exciting opportunity to leverage large-scale knowledge graphs that connect graph-based digital signatures of pesticides and their textual context with digitally modeled animal physiology and its textual context. Relationships with other species in the food chain can also be modeled. Leveraging hyper-edge collaborative models [3] and recent advances in transformers, this PhD project will go beyond the early proof of concept and provide machine learning models that quantify toxicity and uncertainty and can therefore recommend, with high accuracy, a small set of chemicals for in vivo testing. We expect publications in top-tier ML venues (NeurIPS, ICML, ICLR, AAAI), with the possibility of domain-specific publications (Nature etc.) should the model lead to practical high-level discoveries.

[1] Graph neural networks-enhanced relation prediction for ecotoxicology (GRAPE), Journal of Hazardous Materials 2024
[2] Creation of a Curated Aquatic Toxicology Database: EnviroTox, Environmental Toxicology and Chemistry 2019
[3] Mitigating the Popularity Bias in Graph-based Collaborative Filtering, NeurIPS 2023
[4] Systematic approaches to machine learning models for predicting pesticide toxicity, Heliyon 2024
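
To make the collaborative-filtering framing concrete, here is a minimal sketch, not the GRAPE model [1] and not the method this project will develop. It treats chemicals as "users" and species as "items", and fits a plain biased matrix factorisation to a few observed log10(LC50) values so that untested chemical-species pairs can be scored. All numbers are made-up toy values, and the names `fit_mf`, `observations` and `ranked` are hypothetical, introduced only for illustration.

```python
import random

def fit_mf(observations, n_chem, n_spec, rank=2, lr=0.05, reg=0.01,
           epochs=500, seed=0):
    """Fit y ~ mu + b_chem + b_spec + <p_chem, q_spec> by SGD on observed pairs."""
    rng = random.Random(seed)
    mu = sum(y for _, _, y in observations) / len(observations)
    P = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_chem)]
    Q = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_spec)]
    b_chem = [0.0] * n_chem
    b_spec = [0.0] * n_spec

    def predict(c, s):
        dot = sum(P[c][k] * Q[s][k] for k in range(rank))
        return mu + b_chem[c] + b_spec[s] + dot

    obs = list(observations)
    for _ in range(epochs):
        rng.shuffle(obs)
        for c, s, y in obs:
            err = y - predict(c, s)
            # Gradient steps with L2 regularisation on biases and factors.
            b_chem[c] += lr * (err - reg * b_chem[c])
            b_spec[s] += lr * (err - reg * b_spec[s])
            for k in range(rank):
                pc, qs = P[c][k], Q[s][k]
                P[c][k] += lr * (err * qs - reg * pc)
                Q[s][k] += lr * (err * pc - reg * qs)
    return predict

# Toy data (ASSUMPTION: invented values, not taken from EnviroTox or any database).
# Each tuple is (chemical index, species index, log10 of LC50).
observations = [
    (0, 0, -0.3), (0, 1, -0.8), (0, 2, -0.4),
    (1, 0, 0.7), (1, 2, 0.6),
    (2, 1, 0.7), (2, 2, 1.1),
    (3, 0, 1.7), (3, 1, 1.2),
]
n_chem, n_spec = 4, 3
predict = fit_mf(observations, n_chem, n_spec)

# Compare the fit against a baseline that always predicts the mean.
mean_y = sum(y for _, _, y in observations) / len(observations)
baseline_mse = sum((y - mean_y) ** 2 for _, _, y in observations) / len(observations)
train_mse = sum((y - predict(c, s)) ** 2 for c, s, y in observations) / len(observations)

# Score untested pairs; a lower LC50 means the chemical is more toxic.
seen = {(c, s) for c, s, _ in observations}
untested = [(c, s) for c in range(n_chem) for s in range(n_spec) if (c, s) not in seen]
ranked = sorted(untested, key=lambda pair: predict(*pair))

print(f"baseline MSE {baseline_mse:.3f}, train MSE {train_mse:.3f}")
for c, s in ranked:
    print(f"chemical {c}, species {s}: predicted log10(LC50) = {predict(c, s):.2f}")
```

The sketch deliberately omits what the project targets: learned chemical and species representations, hyper-edge relations, and uncertainty estimates. It only shows how sparse toxicity measurements can be posed as a recommendation problem.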

The student will:
- get familiar with state-of-the-art (SOTA) collaborative filtering
- learn domain knowledge for pesticide-species modeling (how to represent relevant biological and chemical traits)
- learn the latest LLMs/FMs and develop a methodology for applying them to collaborative filtering
- learn and develop new collaborative models that can leverage complex relation graphs modeling multiple sources of information
- become an expert on digital pesticide/fertiliser modeling
- learn PyTorch and other key deep learning packages

To register an expression of interest, click here. You will need to outline why you have selected the research project and how your skills, experience and/or knowledge meet the project requirements.