Open Access

ARTICLE

A Trajectory-Guided Diffusion Model for Consistent and Realistic Video Synthesis in Autonomous Driving

Beike Yu, Dafang Wang*

School of Mechatronics Engineering, Harbin Institute of Technology, Weihai, China

* Corresponding Author: Dafang Wang.

(This article belongs to the Special Issue: Recent Advances in Signal Processing and Computer Vision)

Computer Modeling in Engineering & Sciences 2026, 146(1), 35. https://doi.org/10.32604/cmes.2026.076439

Abstract

Scalable simulation leveraging real-world data plays an essential role in advancing autonomous driving, owing to its efficiency and applicability in both training and evaluating algorithms. Consequently, there has been increasing attention on generating highly realistic and consistent driving videos, particularly those involving viewpoint changes guided by the control commands or trajectories of the ego vehicle. However, current reconstruction approaches, such as Neural Radiance Fields and 3D Gaussian Splatting, frequently suffer from limited generalization and require substantial input data. Meanwhile, 2D generative models, though capable of producing previously unseen scenes, still leave room for improvement in coherence and visual realism. To overcome these challenges, we introduce GenScene, a world model that synthesizes front-view driving videos conditioned on trajectories. A new temporal module is presented to improve video consistency by extracting a global context for each frame, computing inter-frame relationships from these global representations, and fusing the frame contexts accordingly. Moreover, we propose a novel attention mechanism that relates the pixels of each frame to the pixels within the corresponding window of the initial frame. Extensive experiments show that our approach surpasses various state-of-the-art models in driving video generation and that the introduced modules contribute significantly to model performance. This work establishes a new paradigm for goal-oriented video synthesis in autonomous driving, facilitating on-demand simulation to expedite algorithm development.
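
The abstract describes these two components only at a high level. The snippet below is a minimal, hypothetical PyTorch sketch of how a per-frame global-context temporal module and a first-frame window cross-attention could be wired together; the class names (GlobalContextTemporalBlock, FirstFrameWindowAttention), tensor layout, window size, and head counts are illustrative assumptions and are not taken from the GenScene implementation.

```python
# Hypothetical sketch only: module names, shapes, and hyperparameters are
# illustrative assumptions, not the GenScene implementation.
import torch
import torch.nn as nn


class GlobalContextTemporalBlock(nn.Module):
    """Pool each frame to a global token, relate frames via attention over
    those tokens, and fuse the result back into every frame's features."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.fuse = nn.Linear(channels, channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, frames, channels, height, width)
        b, t, c, h, w = x.shape
        ctx = self.norm(x.mean(dim=(-2, -1)))       # (b, t, c): global context per frame
        rel, _ = self.attn(ctx, ctx, ctx)           # frame-to-frame relationships
        fused = self.fuse(rel).view(b, t, c, 1, 1)  # broadcastable fused context
        return x + fused                            # residual fusion into each frame


class FirstFrameWindowAttention(nn.Module):
    """Cross-attend from each frame's pixels to the pixels inside the
    corresponding spatial window of the initial frame."""

    def __init__(self, channels: int, window: int = 8, num_heads: int = 4):
        super().__init__()
        self.window = window
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, frames, channels, height, width); H and W divisible by window
        b, t, c, h, w = x.shape
        ws = self.window

        def to_windows(frames: torch.Tensor) -> torch.Tensor:
            # (b, t, c, h, w) -> (b * t * num_windows, ws * ws, c)
            f = frames.reshape(b, t, c, h // ws, ws, w // ws, ws)
            return f.permute(0, 1, 3, 5, 4, 6, 2).reshape(-1, ws * ws, c)

        q = to_windows(x)                                 # queries: pixels of every frame
        kv = to_windows(x[:, :1].expand(b, t, c, h, w))   # keys/values: initial frame
        out, _ = self.attn(q, kv, kv)
        out = out.reshape(b, t, h // ws, w // ws, ws, ws, c)
        out = out.permute(0, 1, 6, 2, 4, 3, 5).reshape(b, t, c, h, w)
        return x + out


if __name__ == "__main__":
    feats = torch.randn(2, 8, 64, 32, 32)           # (batch, frames, channels, H, W)
    feats = GlobalContextTemporalBlock(64)(feats)
    feats = FirstFrameWindowAttention(64)(feats)
    print(feats.shape)                              # torch.Size([2, 8, 64, 32, 32])
```

Read together, the first block promotes temporal coherence through a cheap per-frame summary exchanged across frames, while the second anchors each frame's local appearance to the initial frame, which matches the consistency and realism goals stated above.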

Keywords

Video generation; autonomous vehicle; diffusion model; trajectory

Cite This Article

APA Style
Yu, B., & Wang, D. (2026). A Trajectory-Guided Diffusion Model for Consistent and Realistic Video Synthesis in Autonomous Driving. Computer Modeling in Engineering & Sciences, 146(1), 35. https://doi.org/10.32604/cmes.2026.076439
Vancouver Style
Yu B, Wang D. A Trajectory-Guided Diffusion Model for Consistent and Realistic Video Synthesis in Autonomous Driving. Comput Model Eng Sci. 2026;146(1):35. https://doi.org/10.32604/cmes.2026.076439
IEEE Style
B. Yu and D. Wang, “A Trajectory-Guided Diffusion Model for Consistent and Realistic Video Synthesis in Autonomous Driving,” Comput. Model. Eng. Sci., vol. 146, no. 1, pp. 35, 2026. https://doi.org/10.32604/cmes.2026.076439



Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.