Open Access
ARTICLE
Curriculum-Learning-Guided Multi-Agent Deep Reinforcement Learning for N-1 Static Security Prevention and Control
Ximing Zhang1,*, Zhuohuan Li2, Xuexia Quan1, Kai Cheng2, Yang Yu2
1 China Southern Power Grid Co., Ltd., Guangzhou, 510700, China
2 Digital Grid Research Institute Co., Ltd., China Southern Power Grid, Guangzhou, 510663, China
* Corresponding Author: Ximing Zhang. Email:
(This article belongs to the Special Issue: Digital and Intelligent Planning and Operation Technologies for Flexible Distribution Network)
Energy Engineering https://doi.org/10.32604/ee.2025.073912
Received 28 September 2025; Accepted 21 November 2025; Published online 29 December 2025
Abstract
The “N-1” criterion is a fundamental principle of static security analysis for assessing power system reliability. Existing studies rely mainly on centralized single-agent reinforcement learning frameworks, in which centralized control struggles to accommodate regional autonomy and is vulnerable to communication delays. In high-dimensional state–action spaces, these approaches often suffer from low training efficiency and unstable policies, limiting their applicability to large-scale grids. To address these issues, this paper proposes a Multi-Agent Deep Reinforcement Learning (MADRL) method enhanced with Curriculum Learning (CL) and Prioritized Experience Replay (PER). The proposed framework adopts the Centralized Training with Decentralized Execution (CTDE) paradigm, assigning independent agents to different system regions to enable autonomous decision-making and inter-regional coordination. In addition, the Actor–Critic (AC) architecture is refined with optimized value-update rules to mitigate Q-value overestimation. A curriculum learning mechanism based on source–load fluctuation intensity further guides agents from simple to complex operating conditions, improving convergence and policy robustness. Simulation results on the IEEE 39-bus system demonstrate that the proposed method efficiently generates coordinated multi-region control strategies, eliminates voltage and current limit violations under N-1 contingencies, and consistently outperforms the baseline MADRL approach in decision performance and robustness under fluctuating source–load scenarios.
Keywords
Multi-agent deep reinforcement learning; static security analysis; preventive control; curriculum learning; N-1 criterion
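The abstract names two training mechanisms, an optimized value update that curbs Q-value overestimation and a curriculum schedule over source–load fluctuation intensity, without giving their formulas. The minimal Python sketch below illustrates one plausible reading of each under stated assumptions: the linear intensity ramp, the uniform load perturbation, and the TD3-style min-of-two-critics target are illustrative choices, not the paper's actual implementation.

```python
import numpy as np

def curriculum_intensity(episode, total_episodes, lo=0.05, hi=0.30):
    """Linearly ramp source-load fluctuation intensity from easy to hard.
    (Assumed schedule; the paper only states a simple-to-complex ordering.)"""
    frac = min(1.0, episode / max(1, total_episodes - 1))
    return lo + frac * (hi - lo)

def sample_scenario(base_load, intensity, rng):
    """Perturb nominal bus loads by +/- `intensity` (uniform), mimicking
    source-load fluctuation at the current curriculum stage."""
    noise = rng.uniform(-intensity, intensity, size=base_load.shape)
    return base_load * (1.0 + noise)

def double_critic_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: bootstrapping from the minimum of two
    critics reduces the overestimation bias of a single max-based target.
    (Assumed form of the paper's 'optimized value-update rules'.)"""
    return reward + gamma * (1.0 - done) * np.minimum(q1_next, q2_next)

# Demo: curriculum stages on a 39-bus-sized load vector, plus one TD target.
rng = np.random.default_rng(0)
base_load = np.ones(39)  # nominal per-unit loads (illustrative)
for ep in range(0, 1000, 250):
    sigma = curriculum_intensity(ep, 1000)
    loads = sample_scenario(base_load, sigma, rng)
    print(f"episode {ep}: intensity={sigma:.2f}, "
          f"max load deviation={abs(loads - 1).max():.3f}")

td = double_critic_target(reward=1.0, done=0.0,
                          q1_next=np.array([2.4]), q2_next=np.array([2.1]))
print(f"clipped double-Q target: {td[0]:.3f}")  # uses the smaller critic value
```

Under this sketch, early episodes see mild fluctuations and later episodes see the full intensity range, which matches the simple-to-complex progression the abstract describes.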