TY  - EJOU
AU  - Ouardirhi, Zainab
AU  - Zbakh, Mostapha
AU  - Mahmoudi, Sidi Ahmed

TI  - Bridging 2D and 3D Object Detection: Advances in Occlusion Handling through Depth Estimation
T2  - Computer Modeling in Engineering & Sciences

PY  - 2025
VL  - 143
IS  - 3
SN  - 1526-1506

AB  - Object detection in occluded environments remains a core challenge in computer vision (CV), especially in domains such as autonomous driving and robotics. While Convolutional Neural Network (CNN)-based two-dimensional (2D) and three-dimensional (3D) object detection methods have made significant progress, they often fall short under severe occlusion due to depth ambiguities in 2D imagery and the high cost and deployment limitations of 3D sensors such as Light Detection and Ranging (LiDAR). This paper presents a comparative review of recent 2D and 3D detection models, focusing on their occlusion-handling capabilities and the impact of sensor modalities such as stereo vision, Time-of-Flight (ToF) cameras, and LiDAR. In this context, we introduce FuDensityNet, our multimodal occlusion-aware detection framework that combines Red-Green-Blue (RGB) images and LiDAR data to enhance detection performance. As a forward-looking direction, we propose a monocular depth-estimation extension to FuDensityNet, aimed at replacing expensive 3D sensors with a more scalable CNN-based pipeline. Although this enhancement is not experimentally evaluated in this manuscript, we describe its conceptual design and potential for future implementation.
KW  - Object detection; occlusion handling; multimodal fusion; monocular; 3D sensors; depth estimation

DO  - 10.32604/cmes.2025.064283