Valdi: Value Diffusion World Models for MPC

Christopher Lindenberg, Kashyap Chitta · July 2, 2026

Summary

Valdi introduces Value Diffusion World Models, combining end-to-end online training for Model Predictive Control (MPC) with a latent diffusion dynamics model. Preliminary experiments show that Valdi, using a single diffusion step, matches deterministic MLP baselines in the CarRacing environment, highlighting a trade-off between predictive multimodality and control performance.

World models can enable Model Predictive Control (MPC), but only if their dynamics predictions are both fast enough for online use and expressive enough to represent uncertain futures. Diffusion models are a natural fit for modeling uncertainty and multimodality, yet their iterative inference typically makes them too slow for low-latency latent planning.

Value Diffusion World Models (Valdi) address this gap by combining end-to-end online training for MPC with a latent diffusion dynamics model. In preliminary experiments in the CarRacing environment, Valdi used a single diffusion step during both training and inference and performed comparably to a deterministic MLP baseline. The study also reports a trade-off in this setup between the model's ability to represent multiple possible futures (predictive multimodality) and its control performance.
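To make the idea of a single diffusion step concrete, here is a minimal sketch in plain Python. It is not the authors' architecture: the `denoiser` function, its linear form, and the `weights` tuple are hypothetical stand-ins for a learned network. It only illustrates that a one-call sampler maps fresh noise plus the current latent and action to a next-latent sample, and that setting the noise weight to zero collapses it to a deterministic predictor.

```python
import random

def denoiser(noise, latent, action, weights):
    """Hypothetical stand-in for a learned denoising network.

    Maps a noise vector, the current latent, and a scalar action
    to a prediction of the next latent using a toy linear rule.
    """
    w_z, w_a, w_n = weights
    return [w_z * z + w_a * action + w_n * n for z, n in zip(latent, noise)]

def sample_next_latent(latent, action, weights, rng):
    """Draw one next-latent sample with exactly one denoiser call."""
    # Start from Gaussian noise, as a diffusion sampler would.
    noise = [rng.gauss(0.0, 1.0) for _ in latent]
    return denoiser(noise, latent, action, weights)

rng = random.Random(0)
latent = [1.0, -2.0]

# Stochastic model: the noise weight is non-zero, so samples differ.
samples = [sample_next_latent(latent, 1.0, (0.9, 0.5, 0.1), rng) for _ in range(3)]

# Deterministic baseline: the noise weight is zero, so every sample is identical.
baseline = [sample_next_latent(latent, 1.0, (0.9, 0.5, 0.0), rng) for _ in range(3)]

print(samples)
print(baseline)
```

In this toy setting, both variants cost one function call per predicted step, which is the property that makes a single-step sampler comparable in latency to a deterministic model.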

Why it matters

For practitioners in robotics, autonomous systems, and reinforcement learning, Valdi points toward control systems whose world models represent uncertainty explicitly without giving up real-time planning. The results so far are preliminary and limited to CarRacing, so gains in safety or efficiency remain a possibility rather than a demonstrated outcome.

How to implement this in your domain

  1. Explore integrating Value Diffusion World Models (Valdi) into existing Model Predictive Control (MPC) frameworks for robotics or autonomous systems.
  2. Investigate the trade-off between predictive multimodality and control performance when designing diffusion-based world models.
  3. Benchmark Valdi's single-step diffusion inference against traditional deterministic dynamics models for real-time control applications.
  4. Adapt Valdi's online training methodology for specific control tasks requiring rapid model updates and uncertainty handling.
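One way to set up the benchmarking step above is to write the planner so that the dynamics model is a plug-in function. The sketch below uses random-shooting MPC on a toy one-dimensional task. The planner, the cost function, and both step functions are illustrative assumptions, not Valdi's method; the point is only that a deterministic model and a one-call stochastic sampler can share one interface and be compared on the same number of model calls.

```python
import random

def deterministic_step(state, action, rng):
    """Toy deterministic dynamics model (hypothetical baseline)."""
    return state + action

def stochastic_step(state, action, rng):
    """Toy one-call stochastic model: noise in, next-state sample out."""
    return state + action + 0.05 * rng.gauss(0.0, 1.0)

def plan(state, step_fn, rng, horizon=5, candidates=64):
    """Random-shooting MPC.

    Samples `candidates` action sequences, rolls each out with `step_fn`,
    and returns the first action of the cheapest sequence together with
    the number of model calls spent on planning.
    """
    best_cost, best_action, calls = float("inf"), 0.0, 0
    for _ in range(candidates):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        s, cost = state, 0.0
        for a in actions:
            s = step_fn(s, a, rng)
            calls += 1
            cost += s * s  # toy cost: squared distance from the origin
        if cost < best_cost:
            best_cost, best_action = cost, actions[0]
    return best_action, calls

rng = random.Random(0)
det_action, det_calls = plan(3.0, deterministic_step, rng)
sto_action, sto_calls = plan(3.0, stochastic_step, rng)
print(det_action, det_calls)
print(sto_action, sto_calls)
```

Here the planning budget is `candidates * horizon` model calls per control step for either model. A sampler that needed K denoising steps per prediction would multiply that budget by K, which is why the number of diffusion steps matters for real-time use.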

Who benefits

Robotics, Autonomous Vehicles, Industrial Automation, Gaming, Logistics

Key takeaways

  • Valdi aims to make diffusion models viable for low-latency Model Predictive Control.
  • It combines end-to-end online training for MPC with a latent diffusion dynamics model.
  • With a single diffusion step in training and inference, it performed comparably to a deterministic MLP baseline in preliminary CarRacing experiments.
  • In this setup, there is a trade-off between predictive multimodality and control performance.

Original post by Christopher Lindenberg, Kashyap Chitta

"arXiv:2607.00917v1 Announce Type: new Abstract: World models can enable Model Predictive Control (MPC), but this requires dynamics prediction that is both fast enough for online use and expressive enough to represent uncertain futures. Diffusion models offer a natural mechanism f…"
