Open Access
ARTICLE
Real-Time 3D Scene Perception in Dynamic Urban Environments via Street Detection Gaussians
1 School of Advanced Technology, Xi’an Jiaotong-Liverpool University, Suzhou, 215123, China
2 Thrust of Artificial Intelligence, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, 511400, China
3 Department of Electrical Engineering and Electronics, University of Liverpool, Liverpool, L69 7ZX, UK
4 The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511400, China
5 Institute of Deep Perception Technology, JITRI, Wuxi, 214000, China
6 Department of Electrical and Computer Engineering, Inha University, Incheon, 402751, Republic of Korea
* Corresponding Author: Yan Li. Email:
Computers, Materials & Continua 2026, 87(1), 57 https://doi.org/10.32604/cmc.2025.072544
Received 29 August 2025; Accepted 02 December 2025; Issue published 10 February 2026
Abstract
As a cornerstone for applications such as autonomous driving, 3D urban perception is a burgeoning field of study. Enhancing the performance and robustness of these perception systems is crucial for ensuring the safety of next-generation autonomous vehicles. In this work, we introduce a novel neural scene representation called Street Detection Gaussians (SDGs), which redefines urban 3D perception through an integrated architecture unifying reconstruction and detection. At its core lies the dynamic Gaussian representation, where time-conditioned parameterization enables simultaneous modeling of static environments and dynamic objects through physically constrained Gaussian evolution. The framework’s radar-enhanced perception module learns cross-modal correlations between sparse radar data and dense visual features, resulting in a 22% reduction in occlusion errors compared to vision-only systems. A differentiable rendering pipeline back-propagates semantic detection losses throughout the entire 3D reconstruction process, enabling the optimization of both geometric and semantic fidelity. Evaluated on the Waymo Open Dataset and the KITTI Dataset, the system achieves real-time performance (135 Frames Per Second (FPS)), photorealistic quality (Peak Signal-to-Noise Ratio (PSNR) 34.9 dB), and state-of-the-art detection accuracy (78.1% Mean Average Precision (mAP)), demonstrating a

Keywords
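To make the "time-conditioned parameterization" concrete, the following is a minimal, hypothetical sketch of a dynamic Gaussian primitive whose mean evolves under a physically constrained (constant-velocity) motion prior. It is not the authors' implementation; all names (`DynamicGaussian`, `mu0`, `velocity`, `log_scale`) and the choice of an axis-aligned covariance are illustrative assumptions for exposition only.

```python
import numpy as np

class DynamicGaussian:
    """Illustrative time-conditioned 3D Gaussian (not the paper's code)."""

    def __init__(self, mu0, velocity, log_scale, opacity):
        self.mu0 = np.asarray(mu0, dtype=float)          # mean position at t = 0
        self.velocity = np.asarray(velocity, dtype=float)  # linear motion prior
        self.log_scale = np.asarray(log_scale, dtype=float)  # per-axis extent (log)
        self.opacity = float(opacity)

    def mean(self, t):
        # Physically constrained evolution: the mean follows a
        # constant-velocity trajectory rather than arbitrary motion.
        return self.mu0 + t * self.velocity

    def density(self, x, t):
        # Unnormalized, opacity-weighted Gaussian density at point x, time t,
        # assuming an axis-aligned covariance diag(exp(log_scale))^2.
        d = (np.asarray(x, dtype=float) - self.mean(t)) / np.exp(self.log_scale)
        return self.opacity * np.exp(-0.5 * np.dot(d, d))

# A Gaussian moving at 1 m/s along x: density peaks wherever the
# query point tracks the time-shifted mean.
g = DynamicGaussian(mu0=[0, 0, 0], velocity=[1, 0, 0],
                    log_scale=[0, 0, 0], opacity=1.0)
print(g.density([1, 0, 0], t=1.0))
```

In a full system, the static background would use Gaussians with zero velocity while dynamic objects carry learned motion parameters, so one representation covers both cases.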
Copyright © 2026 The Author(s). Published by Tech Science Press. This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.