
PhD Studentship in Monitoring and Increasing LLM Safety

University of Cambridge

Cambridge


Closing date: 30 July 2026

Location: Cambridge, University of Cambridge

Type: PhD, Fully-funded Studentship

Salary: Fully-funded (fees + maintenance)

Visa Sponsorship

Required Qualifications

First-class degree in Engineering or related
Software development experience
LLM research experience

Research Areas

LLM Safety
Mechanistic Interpretability
Chain-of-Thought (CoT) Faithfulness
Encoded Reasoning

PhD Studentship in Monitoring and Increasing LLM Safety

LLMs are becoming more capable, and society increasingly relies on them, so it is important to ensure that LLMs are safe. In this PhD you will use a variety of approaches, such as white-box mechanistic interpretability and black-box behavioural research, to evaluate the safety of LLMs, monitor their behaviour at inference time, and devise strategies for reducing risk from LLMs. Initially, this PhD will focus on increasing CoT faithfulness and mitigating encoded reasoning.

This PhD is funded by Coefficient Giving, whose focus areas are described at https://coefficientgiving.org/tais-rfp-research-areas/#6-encoded-reasoning-in-cot-and-inter-model-communication

The first 1.5 years of this PhD are already scoped out and will be spent investigating and carrying out either Project 1 or Project 2 (described below). Once these projects have been completed to the highest standard, you, your supervisor, and Coefficient Giving will together decide how to proceed and what to investigate next.

Project 1: Test for straightforward meaning of CoT and mitigate deceptive behaviour via "perturbation methods".

First apply a CoT perturbation method (e.g. paraphrasing intermediate outputs), then compare the final outputs produced from the perturbed CoT with baseline final outputs. Performance deterioration after perturbation indicates that the model was using the words in the CoT in a non-straightforward way. If you find such deterioration, the next step is to investigate the underlying cause (for example using mechanistic interpretability), e.g. the model using a secret code or prompt-hacking itself.
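The perturb-and-compare loop above can be sketched as follows. This is a minimal toy sketch, not the project's actual method: generate_cot, paraphrase, and continue_from_cot are hypothetical placeholders for real LLM calls, and the toy "model" is deliberately built to rely on exact CoT wording so that the accuracy drop is visible.

```python
# Toy stand-ins for real LLM calls: generate_cot produces a chain of thought,
# paraphrase perturbs it, and continue_from_cot produces a final answer from a
# (possibly perturbed) CoT. All three are hypothetical placeholders.
def generate_cot(question):
    return f"reasoning about {question}"

def paraphrase(cot):
    # Perturbation: reword the CoT while preserving its surface meaning.
    return cot.replace("reasoning about", "thinking through")

def continue_from_cot(question, cot):
    # Toy "encoded reasoning" model: it only answers correctly when its own
    # exact wording survives, standing in for a hidden code in the CoT.
    return "correct" if cot.startswith("reasoning about") else "wrong"

def accuracy(questions, perturb):
    correct = 0
    for q in questions:
        cot = generate_cot(q)
        if perturb:
            cot = paraphrase(cot)
        if continue_from_cot(q, cot) == "correct":
            correct += 1
    return correct / len(questions)

questions = ["q1", "q2", "q3"]
baseline = accuracy(questions, perturb=False)   # 1.0 for this toy model
perturbed = accuracy(questions, perturb=True)   # 0.0: paraphrasing broke the "code"
drop = baseline - perturbed
# A large drop suggests the model uses CoT words in a non-straightforward way.
```

In a real study the stubs would be replaced by model API calls and the drop measured over a benchmark, with statistical controls for the perturbation's own effect on difficulty.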

Project 2: Train for transparency using a human predictor

Use a human (or AI imitating human behavior, e.g. an LLM) to evaluate whether the final model outputs (and counterfactual outputs) can be predicted based on the CoT. The accuracy of this human predictor is a measure of reasoning transparency and can be used as reward during training.
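As a minimal sketch of that reward signal, under the assumption that a hypothetical predict_output function stands in for the human (or LLM) predictor, the transparency reward is simply the predictor's accuracy over (CoT, final output) pairs:

```python
# Sketch of predictor accuracy as a transparency reward. predict_output is a
# hypothetical stand-in for the human/LLM predictor: it guesses the final
# answer from the CoT alone.
def predict_output(cot):
    # Toy predictor: assume the answer is the last word of the CoT.
    return cot.split()[-1]

def transparency_reward(examples):
    # examples: list of (cot, final_output) pairs produced by the model.
    hits = sum(1 for cot, out in examples if predict_output(cot) == out)
    return hits / len(examples)

examples = [
    ("the sum of 2 and 2 is 4", "4"),        # predictable from the CoT
    ("after some thought the answer is 7", "7"),
    ("unrelated musings", "42"),             # opaque: CoT does not reveal it
]
reward = transparency_reward(examples)  # fraction of outputs predictable from the CoT
```

In training, this scalar would be used as (part of) the reward, pushing the model toward CoTs from which its final outputs, and counterfactual outputs, are predictable.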

Qualifications required: Applicants should have (or expect to obtain by the start date) at least a first-class degree in Engineering or a related subject.

Ideally applicants have some experience with either software development projects or research on LLMs.

This is a fully-funded studentship (fees and maintenance) to cover a home or overseas candidate.

To apply for this studentship, please upload your two-page CV and research proposal via this form: https://forms.gle/Cm3MWPsWta73J2Gp7. Form responses will be evaluated on a rolling basis.

Please note that any offer of funding will be conditional on securing a place as a PhD student. Candidates will need to apply separately for admission through the University's Graduate Admissions application portal; this can be done before or after applying for this funding opportunity. The applicant portal can be accessed via: www.graduate.study.cam.ac.uk/courses/directory/egegpdpeg. University Postgraduate Admissions closing dates are 14 May for October start and 30 July for January start, although it is advisable to apply earlier than this. Please note that there is an application fee of £20 to apply via the Postgraduate Application Portal.

The University actively supports equality, diversity and inclusion and encourages applications from all sections of society.

Key information

Department/location: Department of Engineering

Salary: Fully-funded (fees and maintenance)

Reference: NM49585

Category: Studentships

Date published: 30 April 2026

Closing date: 30 July 2026

Frequently Asked Questions

🎓What qualifications are required for this PhD studentship?

Applicants must have (or expect to obtain by the start date) a first-class degree in Engineering or a related subject. Ideally, candidates have experience with software development projects or research on LLMs.

📝How do I apply for this LLM safety PhD studentship?

Upload your two-page CV and research proposal via the application form at https://forms.gle/Cm3MWPsWta73J2Gp7. Applications are evaluated on a rolling basis. A PhD place must be secured separately through the University of Cambridge Postgraduate Application Portal (£20 application fee).

🔬What are the research projects in this PhD?

The first 1.5 years focus on either Project 1 (testing CoT faithfulness via perturbation methods such as paraphrasing, and investigating deceptive behaviour using mechanistic interpretability) or Project 2 (training for transparency with a human or AI predictor that evaluates whether final outputs can be predicted from the CoT). The studentship is funded by Coefficient Giving, with an initial focus on encoded reasoning.

🌍Is this PhD studentship open to international students?

Yes, this is a fully-funded studentship covering fees and maintenance for both home and overseas candidates. Confirm visa sponsorship details via University admissions.

📅What are the application deadlines and start dates?

Funding applications close 30 July 2026 and are evaluated on a rolling basis. University admissions deadlines are 14 May for an October start and 30 July for a January start; applying earlier is advisable. Reference: NM49585.

🚀What happens after the initial PhD projects?

After the first 1.5 years, once Project 1 or 2 has been completed to the highest standard, you will decide together with your supervisor and Coefficient Giving what to investigate next in LLM safety, such as inference-time monitoring or risk reduction. Focus areas align with Coefficient Giving's TAIS RFP.