ARTICLE

DeepEchoNet: A Lightweight Architecture for Low Resolution Monocular Depth Estimation

Giulio Caporro1, Paolo Russo2,*

1 Department of Computer, Control and Management Engineering, Sapienza University of Rome, Rome, Italy
2 Department of Civil, Computer Science and Aeronautical Technologies Engineering, Roma Tre University, Rome, Italy

* Corresponding Author: Paolo Russo. Email: email

(This article belongs to the Special Issue: Advances in Efficient Vision Transformers: Architectures, Optimization, and Applications)

Computers, Materials & Continua 2026, 88(1), 14 https://doi.org/10.32604/cmc.2026.079331

Abstract

Monocular depth estimation (MDE) has become a practical alternative to active range sensing in many indoor scenarios, enabled by supervised deep learning models that predict dense depth maps from a single RGB image. However, most modern MDE systems assume mid-to-high-resolution inputs and non-trivial compute budgets, which limits their direct applicability in embedded and bandwidth-constrained settings. This paper studies low-resolution MDE, focusing on 96×96 inputs, where geometric cues are strongly degraded and naively downsizing high-resolution architectures often leads to unstable training and poor accuracy. We propose DeepEchoNet, a lightweight hybrid CNN-transformer model designed to operate natively at 96×96 resolution. The design combines two components: a MobileViT-inspired encoder built from MobileNetV2-style inverted residual blocks and lightweight transformer blocks, and a guided decoder that selectively fuses multi-scale skip features through efficient recalibration modules and separable convolutions. We further adopt a low-resolution-aware training objective, together with a joint RGB–depth augmentation pipeline that follows a strong-to-weak schedule, to improve robustness while preserving coarse geometric consistency.
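As a rough illustration of why separable convolutions suit a lightweight decoder, the sketch below compares the weight count of a dense k×k convolution with that of a depthwise k×k convolution followed by a 1×1 pointwise convolution. The function names and the channel widths (64 and 128) are assumptions chosen for illustration only; they are not taken from DeepEchoNet.

```python
def standard_conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a dense k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def separable_conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a depthwise k x k conv plus a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = k * k * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out  # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical channel widths, not values reported by the paper.
c_in, c_out = 64, 128
dense = standard_conv_params(c_in, c_out)
separable = separable_conv_params(c_in, c_out)
print(f"dense: {dense}, separable: {separable}, ratio: {dense / separable:.2f}")
```

For a 3×3 kernel the saving approaches a factor of nine as the output width grows, since the pointwise term dominates the separable count.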

Keywords

Monocular depth estimation; lightweight neural networks; mobile vision transformers; encoder-decoder architectures; edge deployment; low resolution

Cite This Article

APA Style
Caporro, G., Russo, P. (2026). DeepEchoNet: A Lightweight Architecture for Low Resolution Monocular Depth Estimation. Computers, Materials & Continua, 88(1), 14. https://doi.org/10.32604/cmc.2026.079331
Vancouver Style
Caporro G, Russo P. DeepEchoNet: A Lightweight Architecture for Low Resolution Monocular Depth Estimation. Comput Mater Contin. 2026;88(1):14. https://doi.org/10.32604/cmc.2026.079331
IEEE Style
G. Caporro and P. Russo, “DeepEchoNet: A Lightweight Architecture for Low Resolution Monocular Depth Estimation,” Comput. Mater. Contin., vol. 88, no. 1, pp. 14, 2026. https://doi.org/10.32604/cmc.2026.079331



Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.