
Open Access

ARTICLE

A Semantic-Guided State-Space Learning Framework for Low-Light Image Enhancement

Xi Cai, Xiaoqiang Wang, Huiying Zhao, Guang Han*
Hebei Key Laboratory of Marine Perception Network and Data Processing, School of Computer and Communication Engineering, Northeastern University at Qinhuangdao, Qinhuangdao, 066004, China
* Corresponding Author: Guang Han.
(This article belongs to the Special Issue: Development and Application of Deep Learning and Image Processing)

Computers, Materials & Continua. https://doi.org/10.32604/cmc.2026.075756

Received 07 November 2025; Accepted 19 December 2025; Published online 19 January 2026

Abstract

Low-light image enhancement (LLIE) remains challenging due to underexposure, color distortion, and the noise amplified during illumination correction. Existing deep learning-based methods typically apply uniform enhancement across the entire image, which overlooks scene semantics and often leads to texture degradation or unnatural color reproduction. To overcome these limitations, we propose a Semantic-Guided Visual Mamba Network (SGVMNet) that unifies semantic reasoning, state-space modeling, and mixture-of-experts routing for adaptive illumination correction. SGVMNet comprises three key components: (1) a Semantic Modulation Module (SMM) that extracts scene-aware semantic priors from pretrained multimodal models, namely Large Language and Vision Assistant (LLaVA) and Contrastive Language–Image Pretraining (CLIP), and injects them hierarchically into the feature stream; (2) a Mixture-of-Experts State-Space Feature Enhancement Module (MoE-SSMFEM) that dynamically selects informative channels and activates specialized state-space experts for efficient global–local illumination modeling; and (3) a Text-Guided Mixture Mamba Block (TGMB) that fuses semantic priors and visual features through bidirectional state propagation. Experimental results on the low-light (LOL) datasets demonstrate that SGVMNet outperforms state-of-the-art methods in both quantitative and qualitative evaluations while maintaining low computational complexity and fast inference. On LOLv2-Syn, SGVMNet achieves 26.512 dB PSNR and 0.935 SSIM, outperforming RetinexFormer by 0.61 dB; on LOLv1, it attains 26.50 dB PSNR and 0.863 SSIM. Experiments on multiple unpaired real-world datasets further validate the superiority of SGVMNet, showing that the model not only exhibits strong cross-scene generalization but also effectively preserves semantic consistency and visual naturalness.
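
For readers who want a concrete picture of the mixture-of-experts routing mentioned above, the sketch below shows one plausible way to gate a bank of per-image experts on pooled features. It is a minimal illustration under stated assumptions, not the authors' MoE-SSMFEM: the expert internals (a depthwise-separable convolution standing in for a true selective state-space/Mamba block), the top-k gating design, and all names (SimpleExpert, MoERouter) are hypothetical.

```python
# Hypothetical sketch of top-k mixture-of-experts routing over image features.
# The experts here are simple convolutions standing in for state-space blocks;
# names and design choices are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleExpert(nn.Module):
    """Stand-in expert: a depthwise-separable conv instead of a real Mamba block."""
    def __init__(self, channels: int):
        super().__init__()
        self.depthwise = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.pointwise = nn.Conv2d(channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pointwise(F.gelu(self.depthwise(x)))


class MoERouter(nn.Module):
    """Routes each image to its top-k experts using a gate on pooled features."""
    def __init__(self, channels: int, num_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList([SimpleExpert(channels) for _ in range(num_experts)])
        self.gate = nn.Linear(channels, num_experts)
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Global average pooling gives per-image expert scores.
        pooled = x.mean(dim=(2, 3))                       # (B, C)
        scores = self.gate(pooled)                        # (B, num_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)                 # renormalize over selected experts

        out = torch.zeros_like(x)
        for b in range(x.size(0)):
            for k in range(self.top_k):
                e = indices[b, k].item()
                out[b] = out[b] + weights[b, k] * self.experts[e](x[b:b + 1])[0]
        return out


if __name__ == "__main__":
    block = MoERouter(channels=32)
    feats = torch.randn(2, 32, 64, 64)
    print(block(feats).shape)  # torch.Size([2, 32, 64, 64])
```

The sketch captures only the gating pattern; in the paper's module the experts are state-space blocks and routing additionally involves channel selection and semantic guidance.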

Keywords

Noise interference; attention mechanism; Vision Mamba; semantic modulation; low-light image enhancement