Zero-Shot Vision-Based Robust 3D Map Reconstruction and Obstacle Detection in Geometry-Deficient Room-Scale Environments
Taehoon Kim, Sehun Lee, Junho Ahn*
Computer Science Department, Kyonggi University, Suwon, 16227, Republic of Korea
* Corresponding Author: Junho Ahn. Email:
Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.071597
Received 08 August 2025; Accepted 23 October 2025; Published online 20 November 2025
Abstract
As large, room-scale environments become increasingly common, their spatial complexity grows because of variable, unstructured elements. Demand for room-scale service robots is surging accordingly, yet most technologies remain corridor-centric, and autonomous navigation in expansive rooms becomes unstable even around static obstacles. Existing approaches face several structural limitations: the labor-intensive requirement for large-scale object annotation and continual retraining, the vulnerability of vanishing-point and line-based methods when geometric cues are insufficient, the high cost of LiDAR, and 3D perception errors caused by limited wall cues and dense interior clutter. To address these challenges, we propose a zero-shot vision-based algorithm for robust 3D map reconstruction in geometry-deficient room-scale environments. The algorithm operates in three layers: Layer 1 performs dimension-wise boundary detection; Layer 2 estimates vanishing points, refines the precise perspective space, and extracts a floor mask; and Layer 3 conducts 3D spatial mapping and obstacle recognition. The proposed method was experimentally validated across various geometry-deficient room-scale environments, including lobbies, seminar rooms, conference rooms, cafeterias, and museums, where it reliably reconstructed 3D maps and accurately recognized obstacles. Experimental results show that the proposed algorithm achieved an F1 score of 0.959 in precise perspective space detection and 0.965 in floor mask extraction. For obstacle recognition and classification, it obtained F1 scores of 0.980 in obstacle-free areas, 0.913 in solid obstacle environments, and 0.939 in skeleton-type sparse obstacle environments, confirming its high precision and reliability in geometry-deficient room-scale environments.
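For readers unfamiliar with the metric reported above, the F1 score is the harmonic mean of precision and recall. The following is a minimal sketch of the standard definition applied to binary labels; the function name `f1_score` and the flat 0/1 label representation are illustrative assumptions and are not taken from the evaluation protocol of this work.

```python
def f1_score(predicted, actual):
    """Standard F1 score for two equal-length sequences of 0/1 labels.

    Hypothetical helper for illustration only: 1 marks a positive label
    (for example, a pixel or region judged to be floor), 0 a negative one.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    fn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 1)

    # F1 = 2*TP / (2*TP + FP + FN), which equals the harmonic mean of
    # precision TP/(TP+FP) and recall TP/(TP+FN).
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Toy labels, not data from the experiments.
predicted = [1, 1, 1, 0, 0, 1]
actual = [1, 1, 0, 0, 1, 1]
score = f1_score(predicted, actual)
```

With these toy labels there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75 and so is the resulting score.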
Keywords
Spatial AI; zero-shot learning; geometric deficiency; 3D map reconstruction; room-scale environment; sparse obstacle; precise classification