
Open Access

ARTICLE

A Trajectory-Guided Diffusion Model for Consistent and Realistic Video Synthesis in Autonomous Driving

Beike Yu, Dafang Wang*
School of Mechatronics Engineering, Harbin Institute of Technology, Weihai, China
* Corresponding Author: Dafang Wang
(This article belongs to the Special Issue: Recent Advances in Signal Processing and Computer Vision)

Computer Modeling in Engineering & Sciences. https://doi.org/10.32604/cmes.2026.076439

Received 20 November 2025; Accepted 06 January 2026; Published online 19 January 2026

Abstract

Scalable simulation leveraging real-world data plays an essential role in advancing autonomous driving, owing to its efficiency and applicability in both training and evaluating algorithms. Consequently, increasing attention has been paid to generating highly realistic and consistent driving videos, particularly those involving viewpoint changes guided by the control commands or trajectories of ego vehicles. However, current reconstruction approaches, such as Neural Radiance Fields and 3D Gaussian Splatting, often generalize poorly and require substantial input data. Meanwhile, 2D generative models, though capable of producing unseen scenes, still fall short in temporal coherence and visual realism. To overcome these challenges, we introduce GenScene, a world model that synthesizes front-view driving videos conditioned on trajectories. A new temporal module improves video consistency by extracting a global context representation for each frame, computing inter-frame relationships from these global representations, and fusing the frame contexts accordingly. Moreover, we propose a novel attention mechanism that relates the pixels within each frame to the pixels in the corresponding window of the initial frame. Extensive experiments show that our approach surpasses several state-of-the-art models in driving video generation and that the introduced modules contribute significantly to its performance. This work establishes a new paradigm for goal-oriented video synthesis in autonomous driving, facilitating on-demand simulation to expedite algorithm development.
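The abstract describes the two proposed modules only at a high level. As a rough illustration of one plausible reading, the PyTorch sketch below pools each frame into a global context vector, relates frames by attending over those vectors, and fuses the result back per pixel; it then shows pixel-to-window attention against the initial frame. Everything here is an assumption for illustration: the names `GlobalContextTemporalMixer` and `first_frame_window_attention`, the mean-pooling choice, the head count, and the tensor layout are hypothetical, not the authors' implementation.

```python
# Illustrative sketch only; NOT the paper's code. Assumes a latent-video
# tensor of shape (batch, frames, channels, height, width).
import torch
import torch.nn as nn
import torch.nn.functional as F


class GlobalContextTemporalMixer(nn.Module):
    """Pool each frame to a global vector, relate the frames by attending
    over those vectors, then fuse the attended context back into each frame."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        # channels must be divisible by num_heads
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, c, h, w = x.shape
        g = x.mean(dim=(3, 4))                        # (b, t, c): global context per frame
        g, _ = self.attn(g, g, g)                     # inter-frame relations on global vectors
        g = g[..., None, None].expand(b, t, c, h, w)  # broadcast fused context to every pixel
        y = self.fuse(torch.cat([x, g], dim=2).flatten(0, 1))
        return y.view(b, t, c, h, w)


def first_frame_window_attention(x: torch.Tensor, window: int = 7) -> torch.Tensor:
    """Each pixel of every frame attends to the (window x window) neighborhood
    around the same spatial location in the initial frame."""
    b, t, c, h, w = x.shape
    pad = window // 2
    # Unfold frame 0 into one local window per spatial location.
    win = F.unfold(x[:, 0], kernel_size=window, padding=pad)  # (b, c*window^2, h*w)
    win = win.view(b, c, window * window, h * w)              # (b, c, k, p)
    q = x.permute(0, 2, 1, 3, 4).reshape(b, c, t, h * w)      # (b, c, t, p)
    logits = torch.einsum("bctp,bckp->btpk", q, win) / (c ** 0.5)
    attn = logits.softmax(dim=-1)                             # weights over the first-frame window
    out = torch.einsum("btpk,bckp->bctp", attn, win)
    return out.reshape(b, c, t, h, w).permute(0, 2, 1, 3, 4)


# Toy usage on a random latent clip of 8 frames.
x = torch.randn(2, 8, 64, 16, 16)
y = GlobalContextTemporalMixer(64)(x)      # temporally fused features
z = first_frame_window_attention(x)        # features anchored to frame 0
```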

Keywords

Video generation; autonomous vehicle; diffusion model; trajectory