Mitigating LLM-Generated Disinformation via Machine Unlearning: Toward Safer Generative AI
About the Project
False and harmful narratives propagated by large language models threaten the integrity of information in elections, public health, and crisis response. This PhD addresses that risk by creating principled, auditable ways to remove or neutralise targeted misinformation behaviours while preserving overall model utility. The work moves beyond after-the-fact filtering by changing what the model retains, delivering durable, regulator-aligned mitigations suitable for open-weight and proprietary systems.
Objectives
- Define high-risk narrative taxonomies and target selection criteria across key domains.
- Establish auditable workflows for precise knowledge removal or attenuation with minimal collateral impact.
- Ensure the durability of mitigations under model updates and adversarial attempts to restore harmful content.
- Quantify utility retention and safety gains with reproducible, decision-grade metrics.
- Produce governance artefacts (risk registers, edit logs, audit trails) aligned to EU AI Act/GDPR.
- Translate findings into actionable guidance for developers, safety teams, and regulators.
Expected Outcomes
- A safety toolkit enabling precise, auditable updates/removals with reproducible pipelines.
- An evaluation suite and red-teaming protocol measuring effectiveness, side-effects, and long-term persistence.
- Public releases: code, benchmark reports, and policy guidance; datasets where permissible.
Impact
The project equips AI developers, trust-and-safety teams, and regulators with practical, evidence-based controls to reduce real-world harm from LLM-generated disinformation.
Academic qualifications
First degree (minimum 2:1 classification) in Computer Science, Machine Learning, or Artificial Intelligence.
English language requirement
IELTS score must be at least 6.5 (with not less than 6.0 in each of the four components). Other, equivalent qualifications will be accepted.
Essential attributes:
- Fundamental knowledge of Generative AI and Natural Language Processing
- Experience in fundamental machine learning
- Competence in programming and critical analysis
- Knowledge of the security and privacy of machine learning
- Good written and oral communication skills
- Strong motivation, with evidence of independent research skills relevant to the project
- Good time management
Desirable attributes:
- Programming experience in Python and Machine Learning frameworks (e.g., TensorFlow or Keras)
- Good knowledge of deep learning, natural language processing, etc.
- Experience in Generative AI
APPLICATION CHECKLIST
- Completed application form
- CV
- 2 academic references, using the Postgraduate Educational Reference Form
- Research project outline of 2 pages (list of references excluded). The outline may provide details about:
  - Background and motivation of the project. The motivation, which explains the importance of the project, should also be supported by relevant literature. You can also discuss the applications you expect for the project results.
  - Research questions or objectives.
  - Methodology: types of data to be used, approach to data collection, and data analysis methods.
  - List of references.
  - The outline must be created solely by the applicant. Supervisors can only offer general discussions about the project idea without providing any additional support.
- Statement no longer than 1 page describing your motivations and fit with the project.
- Evidence of proficiency in English (if appropriate)
To be considered, the application must use the advertised title as the project title.
For informal enquiries about this PhD project, please contact z.tan@napier.ac.uk
PhD Start Date: October 2026
Funding Notes
International applicants should note that visa application costs and the NHS health surcharge are additional costs to be taken into consideration, and successful applicants will need to cover these expenses themselves.