Open Access

ARTICLE

Encoder-Guided Latent Space Search Based on Generative Networks for Stereo Disparity Estimation in Surgical Imaging

Guangyu Xu1,2, Siyuan Xu3, Siyu Lu4,*, Yuxin Liu1, Bo Yang1, Junmin Lyu5, Wenfeng Zheng1,*

1 School of Automation, University of Electronic Science and Technology of China, Chengdu, 611731, China
2 School of the Environment, The University of Queensland, Brisbane, QLD 4072, Australia
3 Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA
4 Department of Geography, Texas A&M University, College Station, TX 77843, USA
5 School of Artificial Intelligence, Guangzhou Huashang University, Guangzhou, 511300, China

* Corresponding Authors: Siyu Lu; Wenfeng Zheng

(This article belongs to the Special Issue: Recent Advances in Signal Processing and Computer Vision)

Computer Modeling in Engineering & Sciences 2025, 145(3), 4037-4053. https://doi.org/10.32604/cmes.2025.074901

Abstract

Robust stereo disparity estimation plays a critical role in minimally invasive surgery (MIS), where dynamic soft tissues, specular reflections, and data scarcity pose major challenges to traditional end-to-end deep learning and deformable model-based methods. In this paper, we propose a novel disparity estimation framework that leverages a pretrained StyleGAN generator to represent the disparity manifold of MIS scenes and reformulates stereo matching as a latent-space optimization problem. Specifically, given a stereo pair, we search for the optimal latent vector in the intermediate latent space of StyleGAN such that the photometric reconstruction loss between the stereo images is minimized, while regularizing the latent code to remain within the generator’s high-confidence region. Unlike existing encoder-based embedding methods, our approach directly exploits the geometry of the learned latent space and enforces both photometric consistency and a manifold prior during inference, without additional training or supervision. Extensive experiments on stereo-endoscopic videos demonstrate that our method achieves high-fidelity and robust disparity estimation under varying lighting, occlusion, and tissue dynamics, outperforming Thin Plate Spline (TPS)-based and linear-representation baselines. This work bridges generative modeling and 3D perception by enabling efficient, training-free disparity recovery from pretrained generative models with reduced inference latency.
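The latent-space search summarized above can be illustrated with a minimal PyTorch sketch. This is not the authors' released code: the generator `G` (mapping an intermediate latent code to a disparity map), the encoder `E` (providing the initial code), and the warping and regularization choices are illustrative assumptions consistent with the abstract's description of photometric consistency plus a latent-space prior.

```python
# Hedged sketch of encoder-guided latent-space search for stereo disparity.
# Assumptions: G(w) -> disparity map (B, 1, H, W); E(left, right) -> initial latent w.
import torch
import torch.nn.functional as F

def warp_right_to_left(right: torch.Tensor, disparity: torch.Tensor) -> torch.Tensor:
    """Warp the right image into the left view using a horizontal disparity map
    (convention: x_left = x_right + d, so the right image is sampled at x_left - d)."""
    b, _, h, w = right.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=right.device, dtype=right.dtype),
        torch.arange(w, device=right.device, dtype=right.dtype),
        indexing="ij",
    )
    xs = xs.unsqueeze(0) - disparity.squeeze(1)          # shift columns by disparity
    ys = ys.unsqueeze(0).expand(b, -1, -1)
    # Normalize pixel coordinates to [-1, 1] for grid_sample
    grid = torch.stack((2 * xs / (w - 1) - 1, 2 * ys / (h - 1) - 1), dim=-1)
    return F.grid_sample(right, grid, align_corners=True, padding_mode="border")

def search_latent(G, E, left, right, steps=200, lr=0.05, lambda_reg=1e-3):
    """Optimize the latent code so the generated disparity reconstructs the left view."""
    w = E(left, right).detach().clone().requires_grad_(True)  # encoder-guided initialization
    w_init = w.detach().clone()                                # proxy for the high-confidence region
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        disparity = G(w)                                       # generated disparity map
        recon = warp_right_to_left(right, disparity)
        photo_loss = F.l1_loss(recon, left)                    # photometric consistency
        reg_loss = lambda_reg * (w - w_init).pow(2).mean()     # keep w near the manifold prior
        loss = photo_loss + reg_loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return G(w).detach(), w.detach()
```

No additional training is required at inference time: only the latent code `w` is updated, while the pretrained generator stays frozen, which mirrors the training-free recovery claimed in the abstract.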

Keywords

Medical image analysis; generative modeling; endoscopic 3D reconstruction; disparity estimation; surgical navigation

Cite This Article

APA Style
Xu, G., Xu, S., Lu, S., Liu, Y., Yang, B. et al. (2025). Encoder-Guided Latent Space Search Based on Generative Networks for Stereo Disparity Estimation in Surgical Imaging. Computer Modeling in Engineering & Sciences, 145(3), 4037–4053. https://doi.org/10.32604/cmes.2025.074901
Vancouver Style
Xu G, Xu S, Lu S, Liu Y, Yang B, Lyu J, et al. Encoder-Guided Latent Space Search Based on Generative Networks for Stereo Disparity Estimation in Surgical Imaging. Comput Model Eng Sci. 2025;145(3):4037–4053. https://doi.org/10.32604/cmes.2025.074901
IEEE Style
G. Xu et al., “Encoder-Guided Latent Space Search Based on Generative Networks for Stereo Disparity Estimation in Surgical Imaging,” Comput. Model. Eng. Sci., vol. 145, no. 3, pp. 4037–4053, 2025. https://doi.org/10.32604/cmes.2025.074901



Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.