MBZUAI's DP-Fusion Ushers in a New Era of Secure AI Inference
In the rapidly evolving landscape of artificial intelligence, ensuring data privacy has become a paramount concern, especially as large language models (LLMs) increasingly interact with sensitive information. Researchers at Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) in Abu Dhabi have introduced DP-Fusion, a groundbreaking token-level differentially private inference method that safeguards user data while maintaining high model performance. This innovation not only addresses critical vulnerabilities in LLM outputs but also positions the United Arab Emirates as a frontrunner in trustworthy AI development.
DP-Fusion arrives at a pivotal moment for the UAE's higher education sector, where MBZUAI, the world's first dedicated graduate university for AI, established in 2019, continues to drive national ambitions under the UAE AI Strategy 2031. By blending advanced privacy mechanisms with practical utility, this research exemplifies how UAE universities are tackling global challenges head-on.
Understanding Differential Privacy in the Context of AI
Differential privacy (DP) is a mathematical framework that quantifies privacy risk by ensuring that the output of an algorithm changes only negligibly whether or not any single individual's data is included in the input dataset. Formally, a mechanism M satisfies (ε, δ)-differential privacy if, for any two adjacent datasets D and D′ differing in one record and for any set of outputs S, Pr[M(D) ∈ S] ≤ e^ε · Pr[M(D′) ∈ S] + δ.
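To make the guarantee concrete, here is a minimal Python sketch of the classic Laplace mechanism for a counting query, which has sensitivity 1. This illustrates pure (ε, 0)-DP in general; it is not the mechanism DP-Fusion itself uses. The density-ratio check at the end verifies the e^ε bound numerically for two adjacent datasets.

```python
import math
import random

def laplace_mechanism(true_count: float, epsilon: float) -> float:
    """Release a count with Laplace noise of scale 1/epsilon.

    A counting query has sensitivity 1 (adding or removing one record
    changes the result by at most 1), so noise of scale 1/epsilon
    satisfies pure (epsilon, 0)-differential privacy.
    """
    u = random.random() - 0.5  # uniform on (-0.5, 0.5)
    noise = -(1.0 / epsilon) * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)
    return true_count + noise

def laplace_pdf(x: float, mu: float, scale: float) -> float:
    """Density of the Laplace distribution centred at mu."""
    return math.exp(-abs(x - mu) / scale) / (2.0 * scale)

# Adjacent datasets: true counts 42 vs 43 (they differ in one record).
# The ratio of output densities must stay within [e^-eps, e^eps].
eps = 0.5
for x in (40.0, 42.5, 45.0):
    ratio = laplace_pdf(x, 42.0, 1.0 / eps) / laplace_pdf(x, 43.0, 1.0 / eps)
    assert math.exp(-eps) - 1e-12 <= ratio <= math.exp(eps) + 1e-12
```

The bounded density ratio is exactly what the definition above demands: no single output value can reveal much about whether the record was present.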
In AI applications, particularly LLMs like GPT or Qwen, privacy risks escalate during inference when users input prompts containing personally identifiable information (PII) such as names, dates, or medical records. Traditional DP methods applied at training time, like DP-SGD, are insufficient for inference scenarios common in agentic AI systems—where models query external tools or databases. DP-Fusion bridges this gap by providing provable guarantees at the inference stage.
The Privacy Paradox in LLM Inference
LLMs excel at generating coherent text but can inadvertently leak sensitive context through paraphrasing or pattern memorization. For instance, prompting an LLM with a legal document containing client names might yield outputs from which attackers recover PII via token recovery attacks or perplexity-based guessing. Existing defenses like DP-Prompt (which noises the entire prompt) or DP-Decoding (which samples from noised logits) suffer from poor utility: strong privacy (low ε) yields gibberish outputs, while weak privacy fails against sophisticated attacks.
DP-Fusion resolves this paradox by operating at the token level, focusing protection on labeled sensitive tokens rather than the whole input. This granular approach yields 6x lower perplexity than baselines, making outputs readable and useful even under stringent privacy budgets.
How DP-Fusion Works: A Step-by-Step Breakdown
DP-Fusion's elegance lies in its post-hoc, training-free design. Here's the process:
- Token Labeling: Use named entity recognition (NER) to tag sensitive tokens into privacy groups (e.g., PERSON, DATE, ORG). MBZUAI's in-house NER module achieves high accuracy on diverse entities.
- Baseline Generation: Run the LLM on a public version of the input with all sensitive tokens masked, producing a baseline logit distribution P_public.
- Private Runs: For each privacy group g, generate a private logit distribution P_g by including only that group's tokens.
- Fusion and Noising: Sample each next token by blending the distributions: weights governed by the privacy parameters α (global) and β_g (group-specific) determine whether to draw from P_public or P_g, and Gaussian noise calibrated to ε is applied. The resulting output distribution stays ε-close to the public baseline, bounding the influence of the sensitive tokens.
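The fusion idea can be sketched numerically. The code below is a simplified conceptual illustration, not the paper's algorithm or API (the names `fuse`, `max_log_ratio`, and `fuse_with_budget` are hypothetical, and the real method also handles multiple groups and noising): it mixes a public and a private next-token distribution and searches, by bisection, for the largest mixture weight that keeps the fused distribution within a log-probability ratio ε of the public baseline.

```python
import math

def fuse(p_public, p_private, lam):
    """Convex mixture of public and private next-token distributions."""
    return [(1.0 - lam) * a + lam * b for a, b in zip(p_public, p_private)]

def max_log_ratio(p, q):
    """Worst-case absolute log-probability ratio between p and q."""
    return max(abs(math.log(a / b)) for a, b in zip(p, q))

def fuse_with_budget(p_public, p_private, epsilon, iters=60):
    """Find (by bisection) the largest mixture weight such that the fused
    distribution stays within log-ratio epsilon of the public baseline,
    bounding how much the sensitive tokens can shift the output."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if max_log_ratio(fuse(p_public, p_private, mid), p_public) <= epsilon:
            lo = mid
        else:
            hi = mid
    return fuse(p_public, p_private, lo), lo

# Toy next-token distributions over a 3-word vocabulary.
p_pub = [0.70, 0.20, 0.10]   # all sensitive tokens masked
p_grp = [0.10, 0.20, 0.70]   # one privacy group's tokens revealed
fused, lam = fuse_with_budget(p_pub, p_grp, epsilon=0.1)
assert max_log_ratio(fused, p_pub) <= 0.1
```

With a tight budget (ε = 0.1) only a small private weight survives, which captures the trade-off in the step above: the private distribution may influence sampling, but never enough for an observer of the output to distinguish it sharply from the public baseline.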
This fusion mechanism ensures mathematical privacy while preserving semantic flow. A live demo at documentprivacy.com showcases real-time document sanitization.
Rigorous Experiments Validate Superior Performance
DP-Fusion was evaluated on the TAB-ECHR dataset (European Court of Human Rights cases annotated for PII) using Qwen2.5-7B-Instruct. On utility, it achieved perplexity of 1.42-1.46, well below the baselines, with LLM-as-judge win rates confirming the naturalness of its outputs. On privacy, token recovery attacks succeeded at only 26-29% accuracy, near the 20% random baseline and far below the 80%+ recovery seen against non-private outputs.
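To see how a recovery rate like 26-29% against a 20% random baseline is measured, here is a simplified, hypothetical evaluation harness (the names and setup are illustrative, not the paper's code): the adversary repeatedly guesses which of several candidate entities was hidden in a sanitized document, and accuracy is averaged over trials. In the actual attack the adversary scores candidates using the model's output; here a no-signal random guesser stands in, to show the measurement itself.

```python
import random

def attack_accuracy(guess_fn, candidates, trials, rng):
    """Fraction of trials in which the adversary recovers the hidden entity."""
    hits = 0
    for _ in range(trials):
        secret = rng.choice(candidates)   # the entity masked in the document
        if guess_fn(candidates, rng) == secret:
            hits += 1
    return hits / trials

def random_guess(candidates, rng):
    """Adversary with no signal: accuracy ~ 1/len(candidates)."""
    return rng.choice(candidates)

rng = random.Random(0)
candidates = ["PERSON_A", "PERSON_B", "PERSON_C", "PERSON_D", "PERSON_E"]
acc = attack_accuracy(random_guess, candidates, trials=5000, rng=rng)
# With 5 candidates the no-signal baseline sits near 0.20; an attack
# accuracy close to this value indicates the output leaks little PII.
assert 0.17 < acc < 0.23
```

An attack accuracy that stays near the random baseline, as DP-Fusion's 26-29% does, is the empirical signature of the formal guarantee: the sanitized output carries little usable signal about the masked entities.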
At ε=0.1 (strong privacy), DP-Fusion outperformed DP-Decoding by generating coherent paraphrases, e.g., masking "John Doe, born 01/01/1980" yields sanitized yet informative text. Code available on GitHub enables replication.
The Team Driving UAE's AI Privacy Frontier
Lead author Rushil Thareja, a PhD candidate in NLP at MBZUAI, spearheaded development alongside Assistant Professors of Machine Learning Nils Lukas and Praneeth Vepakomma, and NLP Department Chair Professor Preslav Nakov. Their interdisciplinary expertise, spanning theoretical DP proofs to practical NER, fuels this work. Presented at ICLR 2026 in Rio, the paper (arXiv:2507.04531) has sparked global interest.
MBZUAI's faculty, drawn globally, embody UAE's vision: 100% PhD-holding, with alumni at Google DeepMind and Meta. Thareja notes, "DP-Fusion formalizes safeguards essential for real-world AI trust."
Implications for Agentic AI and Beyond
In agentic systems—LLMs orchestrating tools like databases or APIs—DP-Fusion prevents cascade leaks, e.g., a healthcare agent querying patient records outputs sanitized summaries. It mitigates prompt injection (0% success at low ε) and jailbreaks, vital for UAE sectors like finance (ADGM regulations) and health (UAE Genomics).
For the UAE, this work aligns with Federal Law No. 45/2021 on Personal Data Protection, the country's GDPR-style privacy law, positioning MBZUAI as an ethical AI hub.
UAE's Thriving AI Ecosystem and MBZUAI's Role
The UAE plans to invest AED 112 billion in AI by 2031, with MBZUAI at the center through partnerships with G42 and IBM, the Falcon LLM, and a dedicated AI Campus. Ranked the top AI university in the Middle East and Africa (QS 2026), it graduates AI specialists as AI is projected to contribute up to 40% of UAE GDP by 2031.
DP-Fusion exemplifies the UAE's shift from oil toward AI leadership, fostering secure innovation in smart cities and healthcare.
Future Outlook: Scaling Privacy in UAE AI Research
Future work includes integration with multimodal LLMs and federated learning, as well as deployment in UAE government applications. A PyPI library (dp-fusion-lib) accelerates adoption, and for students, MBZUAI's MSc and PhD programs offer hands-on training in trustworthy AI.
As UAE universities such as Khalifa University and NYU Abu Dhabi advance in parallel, DP-Fusion sets a benchmark for privacy-preserving AI, ensuring ethical growth.
Career Opportunities in UAE's AI Privacy Field
- Research roles at MBZUAI focusing on DP mechanisms.
- Industry roles at G42 and Bayzat applying inference privacy.
- Academia: faculty positions in NLP/ML.
The UAE's visa reforms attract global talent, with salaries of AED 30k-60k per month for AI experts.