LCDM-Mono: Lightweight Conditional Diffusion Model for Self-Supervised Monocular Depth Estimation

Hao Li; Zhoujingzi Qiu; Jianxiao Zou; Haojie Wu; Shicai Fan

doi:10.32604/cmc.2026.076784

Open Access

ARTICLE

Hao Li^1,2, Zhoujingzi Qiu^1,2, Jianxiao Zou^1,2, Haojie Wu^1, Shicai Fan^1,2,*

1 University of Electronic Science and Technology of China, Chengdu, China
2 Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen, China

* Corresponding Author: Shicai Fan.

(This article belongs to the Special Issue: Advances in Intelligent Video Object Tracking and Scene Understanding)

Computers, Materials & Continua 2026, 87(3), 55 https://doi.org/10.32604/cmc.2026.076784

Received 26 November 2025; Accepted 04 February 2026; Issue published 09 April 2026

Abstract

Self-supervised monocular depth estimation has attracted considerable attention due to its ability to learn without ground-truth depth annotations and its strong scalability. However, existing approaches still suffer from inaccurate object boundaries and limited inference efficiency. To address these issues, we present a Lightweight Conditional Diffusion Model for Monocular Depth Estimation (LCDM-Mono). The proposed framework integrates an efficient diffusion inference strategy with a knowledge distillation scheme, enabling the model to generate high-quality depth maps with only two sampling steps during inference. This design substantially reduces computational overhead and ensures real-time performance on resource-constrained platforms. In addition, we introduce a surface normal-based distillation loss to transfer geometric priors from the teacher network to the student network, enhancing its ability to recover local 3D structures and boundary details. Extensive experiments demonstrate that LCDM-Mono achieves a well-balanced trade-off between accuracy and efficiency. On the Jetson Orin NX platform, it achieves real-time inference at approximately 28 FPS, validating its practical deployability and effectiveness.
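The surface normal-based distillation idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss formula, so the code below is an assumption rather than the authors' method: it derives approximate normals from depth by finite differences (ignoring camera intrinsics) and penalizes the cosine dissimilarity between student and teacher normals. The function names `normals_from_depth` and `normal_distillation_loss` are hypothetical.

```python
import numpy as np


def normals_from_depth(depth: np.ndarray) -> np.ndarray:
    """Approximate unit surface normals for a depth map of shape (H, W).

    Assumption: normals are taken from image-space depth gradients only;
    camera intrinsics are ignored for simplicity.
    """
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    dz_dy, dz_dx = np.gradient(depth.astype(np.float64))
    # Unnormalized normal of the surface z = depth(x, y).
    n = np.stack([-dz_dx, -dz_dy, np.ones_like(dz_dx)], axis=-1)
    # The z component is 1, so the norm is at least 1 and division is safe.
    return n / np.linalg.norm(n, axis=-1, keepdims=True)


def normal_distillation_loss(student_depth: np.ndarray,
                             teacher_depth: np.ndarray) -> float:
    """Mean (1 - cosine similarity) between student and teacher normals."""
    n_student = normals_from_depth(student_depth)
    n_teacher = normals_from_depth(teacher_depth)
    cosine = np.sum(n_student * n_teacher, axis=-1)
    return float(np.mean(1.0 - cosine))
```

In this sketch the loss is zero when the student depth has the same local orientation as the teacher depth everywhere, and it grows as the two surfaces tilt apart, which is one way a teacher's geometric priors could be passed to a student.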

Keywords

Monocular depth estimation; self-supervised learning; diffusion model; knowledge distillation

Cite This Article

APA Style

Li, H., Qiu, Z., Zou, J., Wu, H., Fan, S. (2026). LCDM-Mono: Lightweight Conditional Diffusion Model for Self-Supervised Monocular Depth Estimation. Computers, Materials & Continua, 87(3), 55. https://doi.org/10.32604/cmc.2026.076784

Vancouver Style

Li H, Qiu Z, Zou J, Wu H, Fan S. LCDM-Mono: Lightweight Conditional Diffusion Model for Self-Supervised Monocular Depth Estimation. Comput Mater Contin. 2026;87(3):55. https://doi.org/10.32604/cmc.2026.076784

IEEE Style

H. Li, Z. Qiu, J. Zou, H. Wu, and S. Fan, “LCDM-Mono: Lightweight Conditional Diffusion Model for Self-Supervised Monocular Depth Estimation,” Comput. Mater. Contin., vol. 87, no. 3, pp. 55, 2026. https://doi.org/10.32604/cmc.2026.076784


Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
