Tackling Contextual Bias in AI Models for Online Safety
About the Project
Background
Artificial Intelligence (AI) systems are increasingly used to detect and moderate online harms such as cyberbullying, misinformation, and disinformation. However, many current models suffer from contextual bias: they misinterpret content when it is removed from its social, cultural, or multimodal context. For example, AI models may:
- Misclassify sarcasm or satire as harmful misinformation.
- Fail to recognise subtle, repeated bullying when viewing a single message in isolation.
- Fail to understand how different elements, such as text and imagery, interact to convey meaning.
These limitations not only reduce accuracy but also raise ethical concerns about fairness, inclusivity, and explainability. Developing context-aware AI systems is therefore critical for safeguarding digital platforms, ensuring trustworthy moderation, and protecting vulnerable users.
Research Aims
This PhD project aims to investigate and mitigate contextual bias in AI models used for detecting online harms. The research will focus on:
- Defining and operationalising contextual bias in the domain of online harms (linguistic, multimodal, cultural, and platform-specific).
- Developing context-aware AI methods that incorporate social, emotional, and interactional cues beyond surface-level content.
- Evaluating fairness and explainability in online harm detection systems, ensuring robustness across platforms and populations.
- Applying the methods to real-world challenges, with a primary focus on cyberbullying detection and misinformation/disinformation identification.
Unlike approaches that focus solely on technical performance, this work prioritises safety, inclusion, and accountability: it will design AI systems that explain their decisions in ways people can understand, reduce unfair impacts on diverse communities, and support responsible use in real-world settings.