Open Access
ARTICLE
A Semantic-Guided State-Space Learning Framework for Low-Light Image Enhancement
Hebei Key Laboratory of Marine Perception Network and Data Processing, School of Computer and Communication Engineering, Northeastern University at Qinhuangdao, Qinhuangdao, 066004, China
* Corresponding Author: Guang Han. Email:
(This article belongs to the Special Issue: Development and Application of Deep Learning and Image Processing)
Computers, Materials & Continua 2026, 87(2), 48 https://doi.org/10.32604/cmc.2026.075756
Received 07 November 2025; Accepted 19 December 2025; Issue published 12 March 2026
Abstract
Low-light image enhancement (LLIE) remains challenging due to underexposure, color distortion, and amplified noise introduced during illumination correction. Existing deep learning–based methods typically apply uniform enhancement across the entire image, which overlooks scene semantics and often leads to texture degradation or unnatural color reproduction. To overcome these limitations, we propose a Semantic-Guided Visual Mamba Network (SGVMNet) that unifies semantic reasoning, state-space modeling, and mixture-of-experts routing for adaptive illumination correction. SGVMNet comprises three key components: (1) a Semantic Modulation Module (SMM) that extracts scene-aware semantic priors from pretrained multimodal models—Large Language and Vision Assistant (LLaVA) and Contrastive Language–Image Pretraining (CLIP)—and injects them hierarchically into the feature stream; (2) a Mixture-of-Experts State-Space Feature Enhancement Module (MoE-SSMFEM) that dynamically selects informative channels and activates specialized state-space experts for efficient global–local illumination modeling; and (3) a Text-Guided Mixture Mamba Block (TGMB) that fuses semantic priors and visual features through bidirectional state propagation. Experimental results demonstrate that on the low-light (LOL) datasets, SGVMNet outperforms other state-of-the-art methods in both quantitative and qualitative evaluations while maintaining low computational complexity and fast inference. On LOLv2-Syn, SGVMNet achieves 26.512 dB PSNR and 0.935 SSIM, outperforming RetinexFormer by 0.61 dB. On LOLv1, SGVMNet attains 26.50 dB PSNR and 0.863 SSIM. Experiments on multiple unpaired real-world datasets further validate the superiority of SGVMNet, showing that the model not only exhibits strong cross-scene generalization but also effectively preserves semantic consistency and visual naturalness.
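The mixture-of-experts routing described in component (2) can be illustrated with a minimal sketch: a gating network scores each feature vector against a set of experts, the top-k experts are activated, and their outputs are combined with renormalized gate weights. All names, shapes, and the use of plain linear experts here are illustrative assumptions for exposition, not details taken from the paper (which uses state-space experts over image features).

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    # numerically stable softmax along the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def moe_route(features, gate_w, experts, top_k=2):
    """Sparse MoE routing sketch (hypothetical, not the paper's code):
    score experts with a linear gate, keep the top-k per sample,
    and sum expert outputs weighted by renormalized gate weights."""
    logits = features @ gate_w                    # (batch, n_experts)
    weights = softmax(logits)
    out = np.zeros_like(features)
    for b in range(features.shape[0]):
        top = np.argsort(weights[b])[-top_k:]     # indices of top-k experts
        w = weights[b, top] / weights[b, top].sum()
        for wi, ei in zip(w, top):
            out[b] += wi * (features[b] @ experts[ei])
    return out

# toy dimensions: 8-dim features, 4 experts (each a linear map), batch of 3
d, n_experts, batch = 8, 4, 3
features = rng.normal(size=(batch, d))
gate_w = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
enhanced = moe_route(features, gate_w, experts)
print(enhanced.shape)  # (3, 8)
```

Because only k of the experts run per sample, this kind of routing keeps per-input compute roughly constant as the expert pool grows, which is consistent with the paper's claim of low computational complexity.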
Copyright © 2026 The Author(s). Published by Tech Science Press. This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.