Advancing Untargeted Metabolomics through a Probabilistic and Context-Aware Annotation Pipeline
About the Project
This project aims to develop an AI-powered, "context-aware" pipeline to automate metabolite annotation in untargeted metabolomics data. By combining Large Language Models (LLMs) with Bayesian statistics, this project will help transform complex metabolomics data into biological breakthroughs.
Metabolite annotation is one of the most pressing challenges in untargeted metabolomics data analysis. Current annotation tools often rely on simple mass-matching against static databases, leading to high false-positive rates. This project builds upon the Integrated Probabilistic Annotation (IPA) framework (Del Carratore et al., 2019; Del Carratore et al., 2023) to move beyond simple matching toward a "context-aware" system.
The PhD candidate will lead three key objectives:
- AI-Driven Database Curation: You will utilize Large Language Models (LLMs) to mine scientific literature and existing repositories to create a "context-aware" database. This database will encode vital metadata like retention times and biological likelihood to filter out false positives.
- Platform Expansion: You will extend the computational framework to integrate data from emerging analytical platforms beyond LC-MS, including GC-MS, Ion Mobility-MS, and MALDI.
- Software Engineering & GUI: To ensure global community adoption, you will develop a user-friendly Graphical User Interface (GUI), empowering non-bioinformaticians to utilize these advanced probabilistic methods.
Training and Collaboration
You will be embedded in Dr Del Carratore's lab, which focuses on Bioinformatics and Computational Biology. You will also collaborate closely with two world-class research facilities at the University of Liverpool, benefiting from a unique dual academic setting:
- Computational Biology Facility (CBF): You will work within the CBF to develop high-quality code and AI models, gaining expertise in software engineering and LLM implementation.
- Centre for Metabolomics Research (CMR): You will have direct access to state-of-the-art analytical platforms and the data they produce, allowing you to generate and validate experimental data. Prof. Warwick Dunn will provide mentorship on analytical chemistry aspects and user requirements.
Project Structure
The 3.5-year PhD is designed to transition you from a trainee to an independent leader in computational biology:
- Year 1: Foundation and Advanced Training. Your first year focuses on mastering the computational skillsets required for the project, including bioinformatics, Bayesian statistics, and AI/LLM implementation. You will begin the initial development of the "context-aware" database by mining existing repositories.
- Years 2-3: Implementation and Engagement. During this period, you will move into independent research, expanding the IPA framework to new analytical technologies like Ion Mobility-MS. You will also lead the development of the software GUI and present your findings at major international conferences, such as the annual meeting of the Metabolomics Society.
- Final Phase: Thesis and Independent Research. The final six months are dedicated to completing your independent research, finalizing the open-source software for community release on GitHub, and writing your doctoral thesis.
This degree is designed for ambitious graduates holding a first-class or high 2:1 honors degree (or an equivalent international qualification) in a quantitative or life science discipline, such as Bioinformatics, Computer Science, Engineering, Biochemistry, or Systems Biology. The ideal candidate will possess a strong foundation in computational programming—particularly in Python—and a passionate interest in applying AI, Large Language Models (LLMs), and Bayesian statistics to solve complex biological challenges.
To express your interest in this project, please email your CV and a cover letter outlining your suitability to the primary supervisor, Dr. Francesco Del Carratore, at Francesco.del-carratore@liverpool.ac.uk.
Following an initial review of these informal applications, shortlisted candidates will be invited for an interview and guided through the formal submission process via the University of Liverpool Application Portal.
Funding Notes
This project is fully funded, covering tuition fees, bench fees and living expenses.




