
Open Access

ARTICLE

Keyword Spotting Based on Dual-Branch Broadcast Residual and Time-Frequency Coordinate Attention

Zeyu Wang1, Jian-Hong Wang1,*, Kuo-Chun Hsu2,*
1 School of Computer Science and Technology, Shandong University of Technology, Zibo, 255000, China
2 Department of Information Management, National Taipei University of Business, Taipei, 10051, Taiwan
* Corresponding Authors: Jian-Hong Wang; Kuo-Chun Hsu

Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.072881

Received 05 September 2025; Accepted 27 November 2025; Published online 22 December 2025

Abstract

Keyword spotting plays an important role in everyday human-computer interaction. However, noise often interferes with the extraction of time-frequency information, and achieving both computational efficiency and recognition accuracy on resource-constrained devices such as mobile terminals remains a major challenge. To address this, we propose a novel time-frequency dual-branch parallel residual network that integrates a Dual-Branch Broadcast Residual module and a Time-Frequency Coordinate Attention module. The time-domain and frequency-domain branches run in parallel and extract temporal and spectral features independently, which avoids the information loss that serial stacking can cause while enhancing information flow and multi-scale feature fusion. For training, a curriculum learning strategy is introduced that progresses from easy to difficult tasks to improve model robustness step by step. Experimental results on the Google Speech Commands V2 dataset show that the proposed method consistently outperforms existing lightweight models under various signal-to-noise ratio (SNR) conditions and achieves superior far-field recognition performance. Notably, the model maintains stable performance even in low-SNR environments such as –10 dB and generalizes well to SNR conditions not seen during training, validating its robustness to novel noise scenarios. Furthermore, the proposed model has significantly fewer parameters, making it well suited to deployment on resource-limited devices. Overall, the model achieves a favorable balance between performance and parameter efficiency, demonstrating strong potential for practical applications.

Keywords

Keyword spotting; convolutional neural network; residual learning; attention; small footprint; noisy far-field
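
The abstract does not specify how the noisy training data or the easy-to-difficult curriculum are constructed. As an illustration only, the following minimal sketch shows one common way such a setup could look: noise is scaled so that the mixture reaches a target SNR, and the pool of training SNRs grows harder as epochs advance. The function names, the stage boundaries, and the SNR values in `STAGES` are hypothetical and are not taken from the paper.

```python
import math
import random

# Hypothetical curriculum: (first epoch of the stage, SNRs in dB available from then on).
# These values are illustrative assumptions, not the authors' schedule.
STAGES = [
    (0, [20]),
    (10, [20, 10]),
    (20, [20, 10, 0]),
    (30, [20, 10, 0, -10]),
]


def mix_at_snr(clean, noise, snr_db):
    """Add noise to a clean signal so that the mixture has the requested SNR.

    Assumes both sequences have the same length and the noise is not all zeros.
    """
    p_clean = sum(x * x for x in clean) / len(clean)
    p_noise = sum(x * x for x in noise) / len(noise)
    # SNR_dB = 10 * log10(p_clean / (gain^2 * p_noise)), solved for gain.
    gain = math.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]


def snr_pool_for_epoch(epoch, stages=STAGES):
    """Return the SNR pool of the latest stage that has started by this epoch."""
    pool = stages[0][1]
    for start_epoch, snrs in stages:
        if epoch >= start_epoch:
            pool = snrs
    return pool


def sample_training_example(clean, noise, epoch, rng=random):
    """Pick an SNR allowed at this epoch and return (noisy signal, chosen SNR)."""
    snr_db = rng.choice(snr_pool_for_epoch(epoch))
    return mix_at_snr(clean, noise, snr_db), snr_db
```

Under these assumptions, early epochs see only high-SNR mixtures, while later epochs also draw from harder conditions down to –10 dB. Evaluating at an SNR that is absent from every stage would correspond to the unseen-SNR setting mentioned in the abstract.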