
Open Access

ARTICLE

LCDM-Mono: Lightweight Conditional Diffusion Model for Self-Supervised Monocular Depth Estimation

Hao Li1,2, Zhoujingzi Qiu1,2, Jianxiao Zou1,2, Haojie Wu1, Shicai Fan1,2,*
1 University of Electronic Science and Technology of China, Chengdu, China
2 Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen, China
* Corresponding Author: Shicai Fan.
(This article belongs to the Special Issue: Advances in Intelligent Video Object Tracking and Scene Understanding)

Computers, Materials & Continua https://doi.org/10.32604/cmc.2026.076784

Received 26 November 2025; Accepted 04 February 2026; Published online 28 February 2026

Abstract

Self-supervised monocular depth estimation has attracted considerable attention due to its ability to learn without ground-truth depth annotations and its strong scalability. However, existing approaches still suffer from inaccurate object boundaries and limited inference efficiency. To address these issues, we present a Lightweight Conditional Diffusion Model for Monocular Depth Estimation (LCDM-Mono). The proposed framework integrates an efficient diffusion inference strategy with a knowledge distillation scheme, enabling the model to generate high-quality depth maps with only two sampling steps during inference. This design substantially reduces computational overhead and ensures real-time performance on resource-constrained platforms. In addition, we introduce a surface normal-based distillation loss to transfer geometric priors from the teacher network to the student network, enhancing its ability to recover local 3D structures and boundary details. Extensive experiments demonstrate that LCDM-Mono achieves a well-balanced trade-off between accuracy and efficiency. On the Jetson Orin NX platform, it achieves real-time inference at approximately 28 FPS, validating its practical deployability and effectiveness.
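The abstract does not give the form of the surface normal-based distillation loss, so the following is only a minimal sketch of one common way such a loss can be written, not the authors' implementation. It assumes that normals are approximated from depth by finite differences without camera intrinsics and that teacher and student normals are compared by cosine distance. The function names `normals_from_depth` and `normal_distillation_loss` are hypothetical.

```python
import numpy as np

def normals_from_depth(depth):
    """Approximate unit surface normals from an (H, W) depth map.

    Assumption: image-space finite differences stand in for true 3D
    gradients, so camera intrinsics are ignored in this sketch.
    """
    # np.gradient returns derivatives along rows (y) first, then columns (x).
    dz_dy, dz_dx = np.gradient(depth.astype(np.float64))
    n = np.stack([-dz_dx, -dz_dy, np.ones_like(dz_dx)], axis=-1)
    # The z component is 1, so the norm is at least 1 and never zero.
    return n / np.linalg.norm(n, axis=-1, keepdims=True)

def normal_distillation_loss(student_depth, teacher_depth):
    """Mean cosine distance between student and teacher surface normals."""
    n_s = normals_from_depth(student_depth)
    n_t = normals_from_depth(teacher_depth)
    cos = np.sum(n_s * n_t, axis=-1)  # per-pixel cosine similarity
    return float(np.mean(1.0 - cos))

if __name__ == "__main__":
    teacher = np.tile(np.arange(8.0), (6, 1))  # plane tilted along x
    student = np.ones((6, 8))                  # fronto-parallel plane
    print(normal_distillation_loss(student, teacher))
```

A loss of this kind penalizes differences in local surface orientation rather than in absolute depth, which is one plausible reason a normal-based term would help with boundary detail and local 3D structure.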

Keywords

Monocular depth estimation; self-supervised learning; diffusion model; knowledge distillation