Open Access
ARTICLE
Spatio-Temporal Graph Neural Networks for Cyberattack Detection in Battery Energy Storage Systems
Department of Management, Economics and Industrial Engineering (DIG), Politecnico di Milano, Milan, Italy
* Corresponding Author: Danilo Greco. Email:
Computers, Materials & Continua 2026, 88(2), 16 https://doi.org/10.32604/cmc.2026.082708
Received 20 March 2026; Accepted 12 May 2026; Issue published 15 June 2026
Abstract
The Enhanced Graph Neural Network Autoencoder (Enhanced GNN-AE), recently proposed for unsupervised cybersecurity monitoring in battery energy storage systems (BESSs), builds a multiscale
Keywords
Battery energy storage systems (BESSs) are critical components of modern smart grids that support renewable energy integration, frequency regulation, and peak shaving [1]. The growing digitalisation of BESS operation—remote supervisory control, cloud-connected battery management systems (BMS), and over-the-air firmware updates—simultaneously enlarges the attack surface, exposing these systems to Bad Data Injection (BDI), False Data Injection (FDI) and firmware modification attacks [2,3].
Anomaly detection provides a principled, unsupervised defence: by learning from unlabelled normal operating data, deviations induced by attacks can be flagged without requiring labelled incident samples [4,5]. Graph Neural Networks (GNNs) are particularly well-suited for this task [6] because they can exploit the relational structure among physical BESS variables that flat detectors discard.
Greco and Gaggero [7] recently proposed the Enhanced GNN Autoencoder (Enhanced GNN-AE), which models each BESS measurement sample as a node in a multiscale
Despite these strong results, the spatial encoder in Enhanced GNN-AE uses the original GAT architecture [9], whose attention mechanism has been formally analysed by Brody et al. [10]. They prove that GAT’s scoring function
GATv2 [10] resolves this by separating the projection matrices for source and target nodes (
This paper addresses the following research question: Does replacing the GAT encoder in Enhanced GNN-AE with GATv2 yield measurable improvements on the BESS-Set cyberattack benchmark, and if so, on which attack types and by how much?
The contributions are:
1. A GATv2-based extension of Enhanced GNN-AE with a three-layer encoder architecture
2. A rigorous comparison against the original Enhanced GNN-AE and three classical baselines (Isolation Forest, One-Class SVM, LOF), with the addition of a flat MLP autoencoder to isolate the contribution of graph structure.
3. An ablation study that directly compares GAT vs. GATv2 attention and two-layer vs. three-layer encoder depth within the same training and evaluation protocol.
4. Analysis of which attack categories benefit most from dynamic attention, with discussion of the theoretical mechanism.
The paper is organised as follows: Section 2 reviews related work, Section 3 describes the baseline Enhanced GNN-AE and the proposed modifications, Section 4 presents the experimental setup, Section 5 reports results and ablation, Section 6 discusses findings and Section 7 concludes.
2.1 Cybersecurity in Distributed Energy Resources
Physics-based anomaly detection in power systems exploits the assumption that successful cyberattacks ultimately manifest as deviations in measured physical variables, enabling detection independently of communication-layer analysis [11,12]. Surveys in [13,14] cover intrusion detection across smart grid components. For BESSs, Gaggero et al. [3] proposed the first autoencoder-based physics-aware detector, and subsequently released the BESS-Set benchmark [8], which is used as the evaluation dataset in both the original Enhanced GNN-AE paper and the present work. Chen et al. [1] provide a comprehensive survey of DER cybersecurity, highlighting the need for joint cyber-physical monitoring.
2.2 GNN-Based Anomaly Detection
The Graph Attention Network (GAT) [9] learns per-edge attention weights during neighbourhood aggregation, enabling a model to focus on the most relevant neighbours. Zhao et al. [15] demonstrated that GNN-based anomaly detection outperforms LSTM baselines when inter-variable dependencies are encoded as graph edges. Boyaci et al. [16] applied GNNs to joint FDIA detection and localisation in power grids.
GATv2 [10] addresses the theoretical limitation of GAT’s static, rank-1 attention. On irregular graphs—such as the data-driven kNN graphs used in anomaly detection—where the relative importance of source and target node features varies unpredictably, dynamic attention has been shown to provide consistent empirical improvements. The Enhanced GNN-AE of Greco and Gaggero [7] is the first GNN-based anomaly detector specifically designed for BESS cybersecurity; this work extends it with GATv2 and a deeper encoder.
2.3 Deep Autoencoder Baselines
Autoencoder-based anomaly detection has been applied broadly to industrial time-series [5,17]. Harrou et al. [18] and Sun et al. [19] apply temporal variants to power and battery systems, respectively. All share the limitation of flat feature processing; the BESS-Set results in the original paper and the present work show that graph-structured models substantially outperform flat autoencoders on BDI scenarios.
We adopt the full Enhanced GNN-AE framework of Greco and Gaggero [7] unchanged for all components except the spatial encoder. This section summarises the inherited components for completeness and then describes the two proposed modifications in detail.
3.1 Inherited Components (Unchanged from [7])
3.1.1 Topological Feature Augmentation
Each normalised sample
where
The
where
The scales
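As a rough illustration of a multiscale kNN graph in feature space, the sketch below computes neighbour sets at several values of k and records, for each pair, the fraction of scales at which the edge appears. The scale values `(5, 10, 20)`, the Euclidean metric, the weighting rule and the function names are assumptions made for this example; they are not taken from [7].

```python
import numpy as np

def knn_indices(X, k):
    """Indices of the k nearest neighbours (Euclidean) of each row, excluding the row itself."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)  # a sample is never its own neighbour
    return np.argsort(d2, axis=1)[:, :k]

def multiscale_knn_weights(X, scales=(5, 10, 20)):
    """Hypothetical multiscale adjacency.

    W[i, j] is the fraction of scales at which j is among the
    k nearest neighbours of i (scales and weighting are assumptions).
    """
    n = X.shape[0]
    W = np.zeros((n, n))
    rows = np.arange(n)[:, None]
    for k in scales:
        W[rows, knn_indices(X, k)] += 1.0 / len(scales)
    return W

# Synthetic stand-in for normalised samples with 20 features
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 20))
W = multiscale_knn_weights(X)
```

Under this weighting, the closest neighbours receive weight 1 because they appear at every scale, while neighbours that only enter at the largest k receive a smaller weight.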
Three loss terms shape the latent manifold during training. Latent compactness pulls normal embeddings toward a common prototype:
Graph smoothness enforces that graph-adjacent nodes have similar embeddings:
Contrastive separation prevents representational collapse [7]:
with
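The first two regularisers can be sketched in a few lines. The version below assumes squared Euclidean distances, uses the batch mean as the prototype when none is given, and averages over samples or edges; these choices and the function names are assumptions for illustration, and the contrastive term is omitted because its exact form is not restated here.

```python
import numpy as np

def compactness_loss(Z, prototype=None):
    """Mean squared distance of embeddings Z (n, d) to a common prototype.

    Assumption: the prototype defaults to the mean embedding.
    """
    if prototype is None:
        prototype = Z.mean(axis=0)
    return float(np.mean(np.sum((Z - prototype) ** 2, axis=1)))

def smoothness_loss(Z, edges):
    """Mean squared distance between embeddings of graph-adjacent nodes.

    edges is an (m, 2) integer array of (source, target) node indices.
    """
    src, dst = edges[:, 0], edges[:, 1]
    return float(np.mean(np.sum((Z[src] - Z[dst]) ** 2, axis=1)))

# Two embeddings at distance 2 connected by one edge
Z = np.array([[0.0, 0.0], [2.0, 0.0]])
edges = np.array([[0, 1]])
l_compact = compactness_loss(Z)        # prototype is [1, 0], each point at squared distance 1
l_smooth = smoothness_loss(Z, edges)   # squared distance between the two nodes is 4
```

Both terms vanish when all embeddings coincide, which is the representational collapse that a separation term is meant to prevent.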
3.1.4 Ensemble Anomaly Scoring
Following [7], six metrics are computed at inference time and combined with fixed weights
where
The weights
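A fixed-weight ensemble of this kind could be sketched as a weighted sum of per-metric scores. The min-max normalisation over the scored batch, the uniform weights and the function name below are assumptions for illustration only; the six metrics and their actual weights are those defined in [7] and are not reproduced here.

```python
import numpy as np

def ensemble_score(metrics, weights):
    """Combine per-sample anomaly metrics (n, m) with fixed weights (m,).

    Assumption: each metric column is min-max normalised over the batch
    before the weighted sum, and the weights are rescaled to sum to one.
    """
    metrics = np.asarray(metrics, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if metrics.shape[1] != weights.shape[0]:
        raise ValueError("one weight per metric column is required")
    lo = metrics.min(axis=0)
    span = metrics.max(axis=0) - lo
    span[span == 0] = 1.0  # constant columns contribute zero after normalisation
    normed = (metrics - lo) / span
    return normed @ (weights / weights.sum())

# Three samples, six hypothetical metrics, uniform placeholder weights
metrics = np.array([[0.0] * 6, [1.0] * 6, [0.5] * 6])
weights = np.full(6, 1.0 / 6.0)
combined = ensemble_score(metrics, weights)
```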
3.2 Proposed Modification 1: GATv2 Encoder
The original Enhanced GNN-AE uses the GAT attention [9]:
where
which is a static function: its value does not change when
GATv2 [10] resolves this with asymmetric projections:
where
In the feature-space kNN graph used by Enhanced GNN-AE, edge semantics are data-driven and heterogeneous: two samples may be close in feature space for entirely different physical reasons (correlated voltage-current behaviour vs. correlated power setpoint patterns). Dynamic attention can learn to weight these relationships asymmetrically, which is impossible with GAT’s shared
The multi-head aggregation remains:
Residual connections, batch normalisation, and ELU activations are applied identically to the original model.
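The difference between static and dynamic attention can be checked numerically. The sketch below follows the scoring functions analysed by Brody et al. [10]: GAT applies LeakyReLU after the attention vector, whereas GATv2 applies the attention vector after the nonlinearity. The names `W_l` and `W_r` for the two projections and the toy values are assumptions for this example; this is unnormalised single-head scoring, not the full encoder.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gat_scores(H, W, a):
    """GAT: e[i, j] = LeakyReLU(a^T [W h_i || W h_j])."""
    P = H @ W.T                    # projected features, shape (n, d_out)
    d = P.shape[1]
    s = P @ a[:d]                  # contribution of node i
    t = P @ a[d:]                  # contribution of neighbour j
    return leaky_relu(s[:, None] + t[None, :])

def gatv2_scores(H, W_l, W_r, a):
    """GATv2: e[i, j] = a^T LeakyReLU(W_l h_i + W_r h_j)."""
    pre = (H @ W_l.T)[:, None, :] + (H @ W_r.T)[None, :, :]  # shape (n, n, d_out)
    return leaky_relu(pre) @ a

# Two nodes with scalar features +1 and -1
H = np.array([[1.0], [-1.0]])
W = np.array([[1.0], [-1.0]])

e_gat = gat_scores(H, W, a=np.array([1.0, 0.0, 1.0, 0.5]))
e_v2 = gatv2_scores(H, W, W, a=np.array([-1.0, -1.0]))

top_gat = np.argmax(e_gat, axis=1)  # same preferred neighbour for every node
top_v2 = np.argmax(e_v2, axis=1)    # preferred neighbour depends on the node
```

In the GAT case the score is a monotone function of `s[i] + t[j]`, so every node ranks its neighbours by `t[j]` alone. In the GATv2 example the score reduces to `-0.8 * |h_i + h_j|`, so each node prefers the neighbour with the opposite sign, a pattern that no single static ranking can express.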
3.3 Proposed Modification 2: Three-Layer Encoder
The original Enhanced GNN-AE uses a hidden dimension
Fig. 1 illustrates the complete pipeline.

Figure 1: Processing pipeline. All components except the highlighted GATv2 encoder are identical to the Enhanced GNN-AE of Greco and Gaggero [7].
Fig. 2 details a single GATv2 encoder layer, highlighting the asymmetric projections that distinguish it from GAT.

Figure 2: Single GATv2 encoder layer (one attention head shown). Blue boxes mark the asymmetric projections
The end-to-end loss is identical to the original:
with
All experiments use the BESS-Set dataset [8] (DOI: 10.21227/13qz-e261), which is the same benchmark used in the original Enhanced GNN-AE paper [7]. Data are extracted from an electromagnetic Simulink model of a grid-connected BESS at 1-s sampling. The 20 physical variables are listed in Table 1; the training set contains


Five models are evaluated:
1. IF: Isolation Forest [20], 300 trees.
2. LOF: Local Outlier Factor [21],
3. OC-SVM: One-Class SVM [22], RBF kernel,
4. MLP-AE: Flat MLP autoencoder
5. Enhanced GNN-AE (GATv2): The proposed model, identical to [7] except for the GATv2 encoder (Eq. (10)) and a three-layer depth
All models are trained exclusively on normal data. Anomaly thresholds are swept to maximise macro-F1 on the test set.
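The threshold sweep can be sketched as follows, assuming binary labels (1 for attack, 0 for normal), candidate thresholds taken from the observed scores, and a sample being flagged when its score is at or above the threshold. The function names and the toy data are hypothetical.

```python
import numpy as np

def macro_f1(y_true, y_pred):
    """Unweighted mean of the F1 scores of the normal (0) and attack (1) classes."""
    f1s = []
    for cls in (0, 1):
        tp = np.sum((y_pred == cls) & (y_true == cls))
        fp = np.sum((y_pred == cls) & (y_true != cls))
        fn = np.sum((y_pred != cls) & (y_true == cls))
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom > 0 else 0.0)
    return float(np.mean(f1s))

def best_threshold(scores, y_true):
    """Return the candidate threshold with the highest macro-F1 and that F1 value."""
    candidates = np.unique(scores)
    f1 = [macro_f1(y_true, (scores >= t).astype(int)) for t in candidates]
    best = int(np.argmax(f1))
    return float(candidates[best]), float(f1[best])

# Toy anomaly scores: three normal samples followed by two attack samples
scores = np.array([0.1, 0.2, 0.3, 0.8, 0.9])
labels = np.array([0, 0, 0, 1, 1])
threshold, f1 = best_threshold(scores, labels)
```

Because the sweep uses the evaluation labels, the resulting F1 reflects this selection convention and is best compared only with results obtained under the same protocol.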
Table 3 lists the hyperparameter configuration. All settings are kept as close as possible to the original paper to ensure a fair comparison; the only differences are the attention mechanism (GATv2 vs. GAT) and the encoder depth (three vs. two layers).

To evaluate the proposed approach, we use standard metrics for anomaly detection in the smart-grid context [23]. ROC-AUC [24,25] is the primary cross-paper comparison metric because it is threshold-independent; F1 depends on the threshold-selection convention and should be compared only within each paper’s own protocol. The same metrics are used in the original paper [7], which allows a like-for-like comparison.
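The threshold independence of ROC-AUC follows from its rank-based definition: it equals the probability that a randomly chosen attack sample receives a higher anomaly score than a randomly chosen normal sample. A minimal sketch, with a hypothetical function name and toy data, is given below; the pairwise comparison is quadratic in memory and is meant for illustration rather than large test sets.

```python
import numpy as np

def roc_auc(scores, y_true):
    """ROC-AUC as the fraction of (attack, normal) pairs ranked correctly; ties count half."""
    scores = np.asarray(scores, dtype=float)
    y_true = np.asarray(y_true)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (len(pos) * len(neg)))

scores = np.array([0.1, 0.4, 0.35, 0.8])
labels = np.array([0, 0, 1, 1])

auc = roc_auc(scores, labels)                   # 3 of 4 pairs are ranked correctly
auc_rescaled = roc_auc(np.exp(scores), labels)  # unchanged under a monotone rescaling
```

Since only the ordering of scores matters, any strictly increasing transformation of the anomaly score leaves the value unchanged, which is why the metric can be compared across papers that use different threshold conventions.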
4.5 Computational Complexity and Model Size
Table 4 reports the trainable parameter count and wall-clock runtimes for the proposed model on the BESS-Set training set (

The model totals approximately 69,000 trainable parameters, representing a modest increase over a single-matrix GAT encoder of the same depth (
Table 5 reports complete results for all five models across seven attack scenarios and Table 6 summarises the averages. The proposed Enhanced GNN-AE (GATv2) achieves the best overall performance, with all five metrics improved relative to classical baselines and the MLP-AE on most scenarios.


The GATv2 encoder outperforms the original Enhanced GNN-AE on mean ROC-AUC (
5.2 Ablation Study: GAT vs. GATv2 and Encoder Depth
Table 7 isolates the contributions of GATv2 and the three-layer depth. All variants use the same training protocol, graph construction, regularisation losses, and ensemble scoring.

Replacing GAT with GATv2 at fixed depth (two layers) increases mean ROC-AUC from 0.918 to 0.951 (
5.3 Analysis by Attack Category
Bad Data Injection. BDI attacks are the category where the GATv2 improvement is most dramatic. On BDI-P-Osc, all three classical baselines (IF, LOF, OC-SVM) and MLP-AE achieve ROC-AUC
The explanation relates to the graph structure: BDI attacks modify power setpoints, inducing correlated deviations across active power, phase currents, and voltages. No individual variable shows a strong univariate anomaly; the signature is a joint structural deviation distributed across correlated nodes in the kNN graph. Dynamic attention (GATv2) can learn to weight these inter-variable correlations asymmetrically, while rank-1 GAT attention degrades to symmetric neighbourhood averaging.
False Data Injection. On FDI-P, GATv2 achieves ROC-AUC = 0.862 compared to 0.756 for LOF and 0.730 for IF—a
Firmware Modification. Both firmware scenarios exhibit near-trivial anomaly detection: LOF, OC-SVM, and GATv2 all reach ROC-AUC = 1.000. The anomalies here are massive (THD values far outside the training distribution), so any detector that correctly models the training manifold succeeds. IF degrades to 0.909 due to its sensitivity to the specific axis of anomaly.
MLP-AE as a graph structure control. The MLP-AE baseline, absent in the original paper, provides a crucial control: it shows that a deep flat autoencoder underperforms all graph-based methods on BDI scenarios (mean ROC-AUC = 0.736 vs. 0.962 for GATv2), confirming that the gains come from the graph structure rather than from deep representation learning alone. On firmware scenarios, MLP-AE performs strongly (0.995–0.978), consistent with these anomalies being large enough for any deep model to detect.
6.1 Why Dynamic Attention Matters for kNN Graphs
The theoretical argument for GATv2 is particularly compelling in the feature-space kNN graph setting. In a kNN graph, the edge between samples
The improvement on Mean-F1 is marginal (
The evaluation relies entirely on simulation-derived data. Real-world BESS deployments introduce sensor noise, missing values, communication delays, and battery ageing effects that may alter performance. The graph is static, computed once from training data; operational changes (seasonal load, ageing) may require periodic retraining.
6.3 Implications for the Enhanced GNN-AE
The ablation results confirm that the original Enhanced GNN-AE can be improved by a targeted encoder swap: replacing the GAT attention with GATv2 costs approximately the same number of parameters and compute (two separate linear layers instead of one shared layer per head) while yielding a consistent 3%–5% ROC-AUC improvement across the board. This suggests that future extensions of GNN-based BESS anomaly detectors should prefer GATv2 (or other dynamic attention variants) over standard GAT as a default choice. The finding aligns with the general conclusion of [10]: on irregular, heterogeneous graphs—which include data-driven feature-space graphs—static attention systematically underperforms dynamic attention.
Training infrastructure and model footprint. All experiments were conducted on Google Colab using a freely available NVIDIA T4 GPU, without dedicated hardware. The full offline pipeline—kNN graph construction (21 s) and GATv2 training with early stopping (24 s, 108 epochs, total
Offline training, online scoring. Once trained, the model operates in a fully online fashion: each new measurement sample
Unsupervised operation. The model trains exclusively on normal-operation data and requires no attack labels, which is a critical advantage in operational BESS settings where labelled incident data are scarce or unavailable. The anomaly threshold is calibrated from the training anomaly-score distribution using a target false-positive rate, avoiding the need for attack simulation.
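Label-free threshold calibration of this kind can be sketched as an empirical quantile of the training scores. The target false-positive rate of 1%, the function name and the synthetic scores are assumptions for illustration.

```python
import numpy as np

def calibrate_threshold(train_scores, target_fpr=0.01):
    """Threshold exceeded by roughly a fraction target_fpr of normal training scores."""
    return float(np.quantile(train_scores, 1.0 - target_fpr))

# Synthetic stand-in for anomaly scores on normal training data
rng = np.random.default_rng(0)
train_scores = rng.normal(size=10_000)

threshold = calibrate_threshold(train_scores, target_fpr=0.01)
empirical_fpr = float(np.mean(train_scores > threshold))
```

The achieved false-positive rate on new normal data matches the target only to the extent that the training scores are representative of normal operation.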
We extended the Enhanced GNN Autoencoder of Greco and Gaggero [7] with two targeted modifications: (i) replacing the original GAT encoder with the strictly more expressive GATv2 formulation, which uses asymmetric learnable projections
Evaluation on the BESS-Set benchmark across seven cyberattack scenarios demonstrates that the GATv2-based encoder achieves a mean ROC-AUC of 0.962 (
An ablation study isolates GATv2’s contribution at
Future work will investigate adaptive graph construction methods that update the kNN graph as the BESS operating point evolves, physics-informed edge features encoding power flow constraints, and online incremental training for long-term deployment in ageing battery systems.
Acknowledgement: The author thanks the maintainers of the publicly available BESS cybersecurity dataset.
Funding Statement: The author received no specific funding for this study.
Availability of Data and Materials: The BESS-Set dataset is openly available at IEEE DataPort, DOI: 10.21227/13qz-e261. Implementation code is available from the Corresponding Author upon reasonable request.
Ethics Approval: Not applicable.
Conflicts of Interest: The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.
Abbreviations:
| AD | Anomaly Detection |
| BDI | Bad Data Injection |
| BESS | Battery Energy Storage System |
| BMS | Battery Management System |
| DER | Distributed Energy Resource |
| FDI | False Data Injection |
| FW | Firmware |
| GAT | Graph Attention Network |
| GATv2 | Graph Attention Network version 2 |
| GCN | Graph Convolutional Network |
| GNN | Graph Neural Network |
| IF | Isolation Forest |
| kNN | k-Nearest Neighbours |
| LOF | Local Outlier Factor |
| MLP-AE | Multilayer Perceptron Autoencoder |
| OC-SVM | One-Class Support Vector Machine |
| PR | Precision-Recall |
| ROC | Receiver Operating Characteristic |
| SCADA | Supervisory Control and Data Acquisition |
| SoC | State of Charge |
| THD | Total Harmonic Distortion |
References
1. Chen J, Yan J, Kemmeugne A, Kassouf M, Debbabi M. Cybersecurity of distributed energy resource systems in the smart grid: a survey. Appl Energy. 2025;383(3):125364. doi:10.1016/j.apenergy.2025.125364. [Google Scholar] [CrossRef]
2. Lin X, Zhang Y, Wang Z, Liu D, Liu Y. False data injection attack in smart grid: a review. Front Energy Res. 2023;10:1104989. doi:10.3389/fenrg.2022.1104989. [Google Scholar] [CrossRef]
3. Gaggero GB, Caviglia R, Armellin A, Rossi M, Girdinio P, Marchese M. Detecting cyberattacks on electrical storage systems through neural network-based anomaly detection algorithm. Sensors. 2022;22(10):3933. doi:10.3390/s22103933. [Google Scholar] [CrossRef]
4. Pimentel MA, Clifton DA, Clifton L, Tarassenko L. A review of novelty detection. Signal Process. 2014;99(4):215–49. doi:10.1016/j.sigpro.2013.12.026. [Google Scholar] [CrossRef]
5. Pang G, Shen C, Cao L, Den Hengel AV. Deep learning for anomaly detection: a review. ACM Comput Surv. 2021;54(2):1–38. doi:10.1145/3439950. [Google Scholar] [CrossRef]
6. Greco D, Gaggero GB. Topology-aware graph-attentive one-class anomaly detection for physics-based cybersecurity monitoring in photovoltaic systems. Energy Inform. 2026;13(4):23597. doi:10.1186/s42162-026-00661-6. [Google Scholar] [CrossRef]
7. Greco D, Gaggero GB. Enhancing cybersecurity monitoring in battery energy storage systems with graph neural networks. Energies. 2026;19(2):479. doi:10.3390/en19020479. [Google Scholar] [CrossRef]
8. Gaggero GB, Armellin A, Ferro G, Robba M, Girdinio P, Marchese M. BESS-Set: a dataset for cybersecurity monitoring in a battery energy storage system. IEEE Open Access J Power Energy. 2024;11:362–72. doi:10.1109/OAJPE.2024.3439856. [Google Scholar] [CrossRef]
9. Veličković P, Cucurull G, Casanova A, Romero A, Liò P, Bengio Y. Graph attention networks. In: Proceedings of the 6th International Conference on Learning Representations (ICLR); 2018 Apr 30–May 3; Vancouver, BC, Canada. [Google Scholar]
10. Brody S, Alon U, Yahav E. How attentive are graph attention networks? In: Proceedings of the 10th International Conference on Learning Representations (ICLR); 2022 Apr 25–29; Virtual. [Google Scholar]
11. Giraldo J, Urbina D, Cardenas A, Valente J, Faisal M, Ruths J, et al. A survey of physics-based attack detection in cyber-physical systems. ACM Comput Surv. 2018;51(4):1–36. doi:10.1145/3203245. [Google Scholar] [PubMed] [CrossRef]
12. Zideh MJ, Chatterjee P, Srivastava AK. Physics-informed machine learning for anomaly detection: a review. IEEE Access. 2023;12:4597–617. doi:10.1109/ACCESS.2023.3340627. [Google Scholar] [CrossRef]
13. Radoglou-Grammatikis PI, Sarigiannidis PG. Securing the smart grid: a comprehensive compilation of intrusion detection and prevention systems. IEEE Access. 2019;7:46595–620. doi:10.1109/ACCESS.2019.2909807. [Google Scholar] [CrossRef]
14. Lin C-Y, Nadjm-Tehrani S, Asplund M. Timing-based anomaly detection in SCADA networks. In: D’Agostino G, Scala A, editors. Critical information infrastructures security. Cham, Switzerland: Springer; 2018. p. 48–59. [Google Scholar]
15. Zhao H, Wang Y, Duan J, Huang C, Cao D, Tong Y, et al. Multivariate time-series anomaly detection via graph attention network. In: Proceedings of the 2020 IEEE International Conference on Data Mining (ICDM); 2020 Nov 17–20; Sorrento, Italy. Piscataway, NJ, USA: IEEE; 2020. p. 841–50. [Google Scholar]
16. Boyaci O, Narimani MR, Davis K, Ismail M, Overbye TJ, Serpedin E. Joint detection and localization of stealth false data injection attacks in smart grids using graph neural networks. IEEE Trans Smart Grid. 2021;13(1):76–87. doi:10.1109/TSG.2021.3117977. [Google Scholar] [CrossRef]
17. Zamanzadeh Darban Z, Webb GI, Pan S, Aggarwal C, Salehi M. Deep learning for time-series anomaly detection: a survey. ACM Comput Surv. 2024;57(1):1–42. doi:10.1145/3691338. [Google Scholar] [CrossRef]
18. Harrou F, Bouyeddou B, Dairi A, Sun Y. Exploiting autoencoder-based anomaly detection to enhance cybersecurity in power grids. Future Internet. 2024;16(6):184. doi:10.3390/fi16060184. [Google Scholar] [CrossRef]
19. Sun C, He Z, Lin H, Cai L, Cai H, Gao M. Anomaly detection of power battery packs using GRU-based variational autoencoders. Appl Soft Comput. 2023;132(3):109903. doi:10.1016/j.asoc.2022.109903. [Google Scholar] [CrossRef]
20. Liu FT, Ting KM, Zhou Z-H. Isolation forest. In: Proceedings of the IEEE International Conference on Data Mining (ICDM); 2008 Dec 15–19; Pisa, Italy. Piscataway, NJ, USA: IEEE; 2008. p. 413–22. [Google Scholar]
21. Breunig MM, Kriegel H-P, Ng RT, Sander J. LOF: identifying density-based local outliers. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2000 May 15–18; Dallas, TX, USA. New York, NY, USA: ACM; 2000. p. 93–104. doi:10.1145/342009.335388. [Google Scholar] [CrossRef]
22. Schölkopf B, Platt JC, Shawe-Taylor J, Smola AJ, Williamson RC. Estimating the support of a high-dimensional distribution. Neural Comput. 2001;13(7):1443–71. doi:10.1162/089976601750264965. [Google Scholar] [PubMed] [CrossRef]
23. Gaggero GB, Girdinio P, Marchese M. Artificial intelligence and physics-based anomaly detection in the smart grid: a survey. IEEE Access. 2025;13:23597–606. doi:10.1109/ACCESS.2025.3537410. [Google Scholar] [CrossRef]
24. Fawcett T. An introduction to ROC analysis. Pattern Recognit Lett. 2006;27(8):861–74. doi:10.1016/j.patrec.2005.10.010. [Google Scholar] [CrossRef]
25. Davis J, Goadrich M. The relationship between precision-recall and ROC curves. In: Proceedings of the 23rd International Conference on Machine Learning (ICML); 2006 Jun 25–29; Pittsburgh, PA, USA. New York, NY, USA: ACM; 2006. p. 233–40. doi:10.1145/1143844.1143874. [Google Scholar] [CrossRef]
Copyright © 2026 The Author(s). Published by Tech Science Press. This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.