Calibrated Multimodal Verification to Prevent Hallucination in Robot Planning and Manipulation

Cardiff, United Kingdom

About the Project

Project Description

Large multimodal models are increasingly used as high-level “cognitive” modules in robotics, translating natural-language instructions into plans, explanations, and intermediate goal representations. Their fluency and broad knowledge make them powerful for generalisation, but they can also hallucinate: their outputs may not match what the robot can verify in the current scene. If a plan assumes an object that is not present, omits an obstacle, or changes an object property, the robot may grasp empty space, collide, or follow an infeasible trajectory. The risk is sharper when goal states are generated as images, because silent changes such as added or removed objects can slip into planning unnoticed.

This PhD project aims to reduce hallucinations by enforcing actionable consistency between language, vision, state, and other modalities. The first aim is to make hallucination measurable in robotics by linking cross-modal inconsistencies to plan executability and task outcomes. The second aim is to build an object-conservative generation pipeline for subgoals and goal representations, with a strict requirement that object identity and count are preserved, and that key attributes and spatial relations remain valid. The third aim is to develop a benchmark and evaluation pipeline that support reproducible comparisons across methods and tasks.

The method starts from an object-centric world model derived from perception, including instance segmentation, tracking, and a scene graph whose contents serve as verifiable facts. Planning and goal generation run in a loop of proposal, verification, and repair. A model proposes a plan or goal state. A multimodal verifier checks object conservation, attribute consistency, and relation constraints against the scene graph while maintaining calibrated uncertainty. If the output contradicts perception, the system applies constrained decoding, or repairs the output through regeneration or constrained editing. If the user instruction is underspecified, the verifier exposes the missing information and the system asks a minimal clarification question rather than guessing, so ambiguity is handled differently from hallucination.
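
The proposal, verification, and repair loop described above can be sketched as follows. This is only an illustrative sketch, not the project's implementation: the names `SceneObject`, `conservation_violations`, and `propose_verify_repair` are hypothetical, the scene graph is reduced to a flat list of object categories, and only the object-conservation check is shown (attribute checks, relation checks, and calibrated uncertainty are omitted).

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneObject:
    """Hypothetical, minimal stand-in for a scene-graph node."""
    instance_id: str
    category: str


def conservation_violations(scene, proposal):
    """Compare per-category object counts of a proposal against the scene.

    Returns (added, dropped): categories the proposal has in excess of the
    perceived scene, and categories it is missing.
    """
    scene_counts = Counter(obj.category for obj in scene)
    proposal_counts = Counter(obj.category for obj in proposal)
    added = proposal_counts - scene_counts    # Counter subtraction keeps positives only
    dropped = scene_counts - proposal_counts
    return dict(added), dict(dropped)


def propose_verify_repair(scene, propose, repair, max_repairs=3):
    """Propose a goal state, verify it, and repair it a bounded number of times.

    `propose` and `repair` are caller-supplied callables (assumptions here;
    in practice they would wrap a generative model). Returns a verified
    candidate, or None if no candidate passes within the repair budget.
    """
    candidate = propose(scene)
    for attempt in range(max_repairs + 1):
        added, dropped = conservation_violations(scene, candidate)
        if not added and not dropped:
            return candidate
        if attempt < max_repairs:
            candidate = repair(scene, candidate, added, dropped)
    return None
```

Returning `None` instead of an unverified candidate keeps a failed repair from reaching execution; a fuller system might ask the user a clarification question at that point.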

We will study the method on object manipulation and indoor navigation, using metrics that capture object addition and removal, attribute and relation consistency, and task success. The work will also produce a small benchmark of controlled failure cases and an evaluation pipeline to support reproducible comparisons. Experiments will include at least three representative tasks and comparisons with commonly used generation and planning baselines.
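
As a rough illustration of how such metrics might be aggregated over a benchmark, the sketch below computes per-episode object add and drop rates from instance identifiers and averages them alongside task success. The metric definitions, the `Episode` record, and the function names are assumptions made for this example; the advert does not define them.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Episode:
    """Hypothetical record of one evaluated task episode."""
    scene_ids: frozenset   # object instances perceived in the scene
    goal_ids: frozenset    # object instances present in the generated goal
    success: bool          # whether the task was completed


def object_add_rate(ep):
    """Fraction of goal objects that do not exist in the perceived scene."""
    if not ep.goal_ids:
        return 0.0
    return len(ep.goal_ids - ep.scene_ids) / len(ep.goal_ids)


def object_drop_rate(ep):
    """Fraction of scene objects that are missing from the generated goal."""
    if not ep.scene_ids:
        return 0.0
    return len(ep.scene_ids - ep.goal_ids) / len(ep.scene_ids)


def summarise(episodes):
    """Average the per-episode metrics over a non-empty list of episodes."""
    return {
        "add_rate": mean(object_add_rate(e) for e in episodes),
        "drop_rate": mean(object_drop_rate(e) for e in episodes),
        "task_success": mean(1.0 if e.success else 0.0 for e in episodes),
    }
```

Reporting add and drop rates separately distinguishes hallucinated objects from omitted ones, which may lead to different failure modes during execution.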

Keywords: multimodal models, hallucination, grounding, robotics, multimodal planning, scene graph, diffusion models, constrained generation, verification, object manipulation.

How to Apply

This project accepts applications from self-funded candidates all year round.

Mode of Study: Full-time or part-time

Please submit your application via Computer Science and Informatics - Study - Cardiff University

In the funding field of your application, indicate “I am applying for a self-funded PhD in Computer Science and Informatics”, and specify the project title and supervisors of this project in the text box provided.

Academic criteria: A 2:1 Honours undergraduate degree or a master's degree, in computing or a related subject. Applicants with appropriate professional experience are also considered. Degree-level mathematics (or equivalent) is required for research in some project areas.

Applicants must demonstrate English language proficiency. Students who do not have English as a first language must prove this by obtaining an IELTS score of at least 6.5 overall, with a minimum of 6.0 in each skills component. A full list of accepted qualifications is available here: https://www.cardiff.ac.uk/study/international/english-language-requirements/postgraduate

If you are interested, please contact Dr Yi Zhou (zhouy131@cardiff.ac.uk), sending your CV in the first instance. The application process requires you to develop an individual research proposal jointly with the supervision team, building on the information provided in this advert.

Once you have developed the proposal with support from the supervisors, please submit your application following the instructions provided below.

Please submit your application via Computer Science and Informatics - Study - Cardiff University

To be considered, candidates must submit the following information:

  • In the ‘Research Proposal’ section of the application, enter the name of the project you are applying to and upload your individual research proposal. Your research proposal should not exceed 2000 words, including references and bibliography.
  • A personal statement (as part of the university application form, or as a separate attachment, if you prefer).
  • A CV. Guidance on CVs for a PhD position can be found on the FindAPhD website.
  • Qualification certificates and transcripts (originals and English translations, if applicable).
  • Two references, which should be academic. Please note that you need to provide the reference documents as part of your application.
  • Proof of English language (if applicable).

Interview: If the application meets all of the entrance requirements listed above, you will be invited to an interview.
