Agentic AI for Autonomous Things: Vision-Language-Action Models, Embodied Intelligence and Real-World Decision-Making
About the Project
Artificial intelligence is moving beyond passive prediction towards systems that can reason, plan, interact and act. This shift is driving rapid progress in agentic AI: AI systems that can interpret goals, make decisions, use tools, coordinate with humans and operate in changing real-world environments. Recent advances in large language models, vision-language models and vision-language-action models have created new opportunities to develop intelligent agents that do not simply classify images or generate text, but can understand context, plan actions and support autonomous decision-making.
This PhD project will explore the foundations and applications of agentic AI, with a particular focus on autonomous systems, embodied intelligence and autonomy of things. Possible application areas include robotics, intelligent sensing, autonomous inspection, environmental monitoring, smart laboratories, precision agriculture, digital assistants, scientific discovery and AI-driven decision support.
The central research question is how we can design agentic AI systems that are reliable, interpretable, adaptive and useful in the real world. Current AI agents remain limited by hallucination, weak grounding, poor long-horizon planning, brittle tool use, limited physical understanding and difficulty transferring from simulated or text-based settings to complex real-world environments. Addressing these challenges requires fundamental research across machine learning, computer vision, natural language processing, robotics, multimodal learning and human-AI interaction.
There is a wide range of possible research directions, including:
- Vision-language-action models for embodied AI: Developing models that connect visual perception, language understanding and action generation for robots, autonomous vehicles, drones or intelligent devices.
- Agentic AI for autonomous systems: Designing AI agents that can plan, reason, monitor progress, recover from errors and adapt their behaviour in dynamic environments.
- Tool-using and workflow-aware AI agents: Creating systems that can use external tools, retrieve information, call software functions, interact with data pipelines and support complex human workflows.
- Multimodal foundation models for real-world perception: Combining images, video, language, sensor streams and structured data to improve situational awareness and decision-making.
- Reliable and trustworthy agentic AI: Investigating uncertainty estimation, evaluation, safety constraints, explainability, memory, verification and human oversight in AI agents.
- Agentic AI for robotics and autonomy of things: Exploring how intelligent agents can control or assist physical systems such as mobile robots, drones, laboratory instruments, smart sensors or autonomous inspection platforms.
- Self-improving and lifelong AI agents: Studying how agents can learn from feedback, adapt to new environments and improve performance over time without requiring extensive manual annotation.
- Human-agent collaboration: Designing AI agents that can communicate with users, clarify goals, explain decisions and operate as effective collaborators rather than opaque automation tools.
The project may involve developing new algorithms, building agentic AI pipelines, evaluating foundation models, creating benchmark tasks, integrating AI agents with robotic or sensing platforms, or applying agentic systems to real-world scientific and industrial problems. Depending on the student's interests, the work may be highly theoretical, highly applied or a combination of both.
Our resources
Durham University provides an excellent environment for AI, computer vision, robotics and autonomous systems research. The University hosts the UK regional supercomputer, Bede, with 128 NVIDIA V100 GPUs, and the Department hosts an NVIDIA CUDA Centre to support GPU-intensive research.
Our laboratories include LiDAR, RADAR, EEG systems, drones, cameras, embedded computing devices and robotic platforms. The student may also have opportunities to work with advanced systems such as the Unitree G1 humanoid robot, the Unitree Go2 quadruped robot, unmanned ground vehicles, aerial robots and other autonomous platforms.
Supervision
You will be supervised by Dr Amir Atapour-Abarghouei, Associate Professor in Machine Learning and Computer Vision in the Department of Computer Science at Durham University. His research spans computer vision, deep learning, robotic perception, neuromorphic computing, efficient AI and autonomous systems, with a focus on developing intelligent visual systems that can operate under real-world constraints.
Dr Atapour-Abarghouei has published in top-tier conferences and journals, including CVPR, ICCV, ECCV, ICML, IEEE Transactions on Image Processing, and IEEE Transactions on Multimedia. His work has been widely cited, demonstrating a strong impact on the fields of AI-driven vision, robotics, and machine learning. He has been involved in multiple national and international research projects, including EU-funded initiatives and collaborations with industry leaders in autonomous robotics, AI-driven perception, and deep learning efficiency.
During the PhD, you will receive comprehensive research training, including:
- Regular one-to-one meetings to guide your research direction, ensure steady progress, and refine your problem-solving skills.
- Support in academic writing and publishing, helping you target high-impact conferences and journals.
- Collaboration opportunities within a dynamic and interdisciplinary research group, allowing you to work with experts in AI, robotics, and computational intelligence.
- Access to cutting-edge computational resources, including Durham University’s GPU cluster and state-of-the-art robotic platforms.
- Opportunities to engage with industry and external collaborators, facilitating real-world applications and potential career pathways beyond academia.
- Support in developing an independent research profile and building a strong academic or industrial career pathway.
Dr Atapour-Abarghouei has supervised numerous undergraduate, MSc, and PhD students, many of whom have gone on to pursue successful careers in research, academia, and industry. His approach to supervision emphasises independent thinking, problem-driven research, and a supportive learning environment to help students develop into well-rounded researchers with expertise in cutting-edge AI and robotics technologies.
Durham University
Durham University is one of the UK’s leading research-intensive universities and a member of the Russell Group. Durham is a historic university city in North East England, known for its collegiate system, strong academic community and distinctive environment. The city offers an excellent quality of life, with a compact campus, rich cultural heritage and comparatively affordable living costs.
Entry requirements
Applicants should have:
- A relevant undergraduate or master’s degree in computer science, artificial intelligence, engineering, mathematics, physics or a related discipline
- Strong programming skills
- Interest in machine learning, computer vision, robotics, NLP, multimodal AI or autonomous systems
- Motivation to conduct independent research and publish in high-quality academic venues
- Ability to meet Durham University’s English language requirements (https://www.dur.ac.uk/study/international/entry-requirements/english-language-requirements/).
How to apply
Please send an email with your CV, transcripts and any supporting documents to Dr Amir Atapour-Abarghouei at amir.atapour-abarghouei@durham.ac.uk
Funding Notes
This is a self-funded PhD position and applications are welcome all year round.
Students of exceptional quality may consider applying for scholarships (including CSC and Durham University studentships), which open in November every year. These funding opportunities are extremely competitive.