Open Access

ARTICLE

Hierarchical Attention Transformer for Multivariate Time Series Forecasting

Qi Wang, Kelvin Amos Nicodemas*

School of Computer Science, Nanjing University of Information Science and Technology, Nanjing, China

* Corresponding Author: Kelvin Amos Nicodemas

(This article belongs to the Special Issue: Advances in Time Series Analysis, Modelling and Forecasting)

Computers, Materials & Continua 2026, 87(2), 78. https://doi.org/10.32604/cmc.2026.074305

Abstract

Multivariate time series forecasting plays a crucial role in decision-making for systems like energy grids and transportation networks, where temporal patterns emerge across diverse scales from short-term fluctuations to long-term trends. However, existing Transformer-based methods often process data at a single resolution or handle multiple scales independently, overlooking critical cross-scale interactions that influence prediction accuracy. To address this gap, we introduce the Hierarchical Attention Transformer (HAT), which enables direct information exchange between temporal hierarchies through a novel cross-scale attention mechanism. HAT extracts multi-scale features using hierarchical convolutional-recurrent blocks, fuses them via temperature-controlled mechanisms, and optimizes gradient flow with residual connections for stable training. Evaluations on eight benchmark datasets show HAT outperforming state-of-the-art baselines, with average reductions of 8.2% in MSE and 7.5% in MAE across horizons, while achieving a 6.1× training speedup over patch-based methods. These advancements highlight HAT’s potential for applications requiring multi-resolution temporal modeling.
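The abstract mentions fusing multi-scale features "via temperature-controlled mechanisms." The paper's exact formulation is not reproduced on this page, but the general idea can be sketched as a temperature-scaled softmax that weights per-scale feature vectors before summing them; the function names, shapes, and the temperature value below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def temperature_softmax(logits, tau):
    # Temperature-controlled softmax: tau < 1 sharpens the weight
    # distribution, tau > 1 smooths it toward uniform.
    z = logits / tau
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_scale_fusion(scale_features, gate_logits, tau=0.5):
    """Fuse one feature vector per temporal scale into a single vector.

    scale_features: (n_scales, d) array, one row per scale (e.g. hourly/daily/weekly)
    gate_logits:    (n_scales,) array of learned relevance scores
    """
    w = temperature_softmax(gate_logits, tau)          # (n_scales,) weights summing to 1
    return (w[:, None] * scale_features).sum(axis=0)   # weighted sum -> (d,)

# Toy example: three scales with one-hot 4-dim features, so the fused
# vector directly exposes the fusion weights.
feats = np.array([[1.0, 0.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0]])
fused = cross_scale_fusion(feats, gate_logits=np.array([2.0, 1.0, 0.5]), tau=0.5)
print(fused)
```

With a low temperature (tau = 0.5) the highest-scoring scale dominates the fused representation; raising tau would spread weight more evenly across scales, which is the control knob such mechanisms expose.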

Keywords

Time series forecasting; multi-scale temporal modeling; cross-scale attention; transformer architecture; hierarchical embeddings; gradient flow optimization

Cite This Article

APA Style
Wang, Q., & Nicodemas, K. A. (2026). Hierarchical Attention Transformer for Multivariate Time Series Forecasting. Computers, Materials & Continua, 87(2), 78. https://doi.org/10.32604/cmc.2026.074305
Vancouver Style
Wang Q, Nicodemas KA. Hierarchical Attention Transformer for Multivariate Time Series Forecasting. Comput Mater Contin. 2026;87(2):78. https://doi.org/10.32604/cmc.2026.074305
IEEE Style
Q. Wang and K. A. Nicodemas, “Hierarchical Attention Transformer for Multivariate Time Series Forecasting,” Comput. Mater. Contin., vol. 87, no. 2, Art. no. 78, 2026. https://doi.org/10.32604/cmc.2026.074305



Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.