Open Access

ARTICLE

Semantic-Guided Stereo Matching Network Based on Parallax Attention Mechanism and SegFormer

Zeyuan Chen, Yafei Xie, Jinkun Li, Song Wang, Yingqiang Ding*
School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou, 450001, China
* Corresponding Author: Yingqiang Ding

Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.073846

Received 27 September 2025; Accepted 25 November 2025; Published online 18 December 2025

Abstract

Stereo matching is a pivotal task in computer vision that enables depth estimation from stereo image pairs, yet it remains challenging in regions with reflections, repetitive textures, or fine structures. In this paper, we propose a Semantic-Guided Parallax Attention Stereo Matching Network (SGPASMnet) that can be trained in an unsupervised manner, building upon the Parallax Attention Stereo Matching Network (PASMnet). Our approach leverages unsupervised learning to address the scarcity of ground truth disparity in stereo matching datasets, which facilitates training on diverse scene-specific datasets and enhances generalization. SGPASMnet incorporates two novel components: a Cross-Scale Feature Interaction (CSFI) block and semantic feature augmentation using a pre-trained semantic segmentation model, SegFormer, embedded into the parallax attention mechanism. The CSFI block fuses multi-scale features, integrating coarse and fine details to improve disparity estimation accuracy. Semantic features extracted by SegFormer enrich the parallax attention mechanism with high-level scene context, significantly improving performance in ambiguous regions. Our model unifies these enhancements within a single architecture comprising semantic feature extraction, an hourglass network, a semantic-guided cascaded parallax attention module, an output module, and a disparity refinement network. Evaluations on the KITTI2015 dataset demonstrate that our unsupervised method achieves a lower error rate than the original PASMnet, highlighting the effectiveness of our enhancements in handling complex scenes. Because it is trained without requiring ground truth disparity, SGPASMnet offers a scalable and robust solution for accurate stereo matching, with superior generalization across varied real-world applications.

Keywords

Stereo matching; parallax attention; unsupervised learning; convolutional neural network; stereo correspondence
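
As a rough illustration of the parallax attention idea that SGPASMnet builds on, the sketch below computes, for each image row, an attention map between left-image and right-image features along the horizontal (epipolar) direction and reads a disparity off it as an expectation. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function name `parallax_attention_disparity`, the `(H, W, C)` feature layout, the scaled dot-product similarity, and the soft-argmax disparity readout are all illustrative choices, and the CSFI block, SegFormer features, and learned layers described in the abstract are omitted.

```python
import numpy as np


def parallax_attention_disparity(feat_left, feat_right):
    """Toy parallax attention over rectified stereo features.

    feat_left, feat_right: float arrays of shape (H, W, C) (assumed layout).
    Returns (disparity, attention) with shapes (H, W) and (H, W, W).
    """
    h, w, c = feat_left.shape

    # scores[i, j, k]: similarity between left pixel (i, j) and right
    # pixel (i, k). Only pixels on the same row are compared.
    scores = np.einsum("ijc,ikc->ijk", feat_left, feat_right) / np.sqrt(c)

    # Numerically stable softmax over candidate right-image columns k.
    scores = scores - scores.max(axis=-1, keepdims=True)
    attention = np.exp(scores)
    attention = attention / attention.sum(axis=-1, keepdims=True)

    # Expected matching column in the right image, then disparity = j - k.
    columns = np.arange(w, dtype=float)
    expected_column = attention @ columns            # shape (H, W)
    disparity = columns[None, :] - expected_column   # shape (H, W)
    return disparity, attention


if __name__ == "__main__":
    # Synthetic check: the left features are the right features shifted
    # by 2 pixels, so interior pixels should get a disparity close to 2.
    width, shift = 8, 2
    right = 10.0 * np.tile(np.eye(width), (4, 1, 1))  # (4, 8, 8), one-hot per column
    left = np.roll(right, shift, axis=1)
    disparity, _ = parallax_attention_disparity(left, right)
    print(disparity[:, shift:])
```

In this toy setup the attention is nearly one-hot, so the expectation collapses to the true shift. With real learned features the map is softer, which is where additional context such as the semantic features described above would be expected to help in ambiguous regions.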