Open Access
ARTICLE
Multi-Timescale Optimization Scheduling of Distribution Networks Based on the Uncertainty Intervals in Source-Load Forecasting
Key Laboratory of Modern Power System Simulation and Control & Renewable Energy Technology, Ministry of Education (Northeast Electric Power University), Jilin, 132012, China
* Corresponding Author: Shiqiang Li. Email:
(This article belongs to the Special Issue: Advances in Renewable Energy Systems: Integrating Machine Learning for Enhanced Efficiency and Optimization)
Energy Engineering 2025, 122(6), 2417-2448. https://doi.org/10.32604/ee.2025.061214
Received 19 November 2024; Accepted 18 April 2025; Issue published 29 May 2025
Abstract
With the increasing integration of large-scale distributed energy resources into the grid, traditional distribution network optimization and dispatch methods struggle to address the challenges posed by both generation and load. Accounting for these issues, this paper proposes a multi-timescale coordinated optimization dispatch method for distribution networks. First, the probability box theory was employed to determine the uncertainty intervals of generation and load forecasts, based on which, the requirements for flexibility dispatch and capacity constraints of the grid were calculated and analyzed. Subsequently, a multi-timescale optimization framework was constructed, incorporating the generation and load forecast uncertainties. This framework included optimization models for day-ahead scheduling, intra-day optimization, and real-time adjustments, aiming to meet flexibility needs across different timescales and improve the economic efficiency of the grid. Furthermore, an improved soft actor-critic algorithm was introduced to enhance the uncertainty exploration capability. Utilizing a centralized training and decentralized execution framework, a multi-agent SAC network model was developed to improve the decision-making efficiency of the agents. Finally, the effectiveness and superiority of the proposed method were validated using a modified IEEE-33 bus test system.
1 Introduction
Distribution networks (DNs), as a key component of power systems, play a strategic role in energy transition and sustainable development [1]. Renewable energy-dominated power systems have evolved rapidly under the “dual carbon” goals of China, leading to increased penetration of renewable energy resources, flexible regulations, and increased diversity in power demand characteristics [2]. However, large-scale integration of renewable energy and flexible resources introduces significant uncertainties from generation and load perspectives, posing substantial challenges to the economic operation of power grids [3]. With the ongoing market-oriented reforms, dispatch strategy optimization, renewable energy utilization, and cost optimization have become critical issues for grid operators [4].
In renewable energy-based DN optimization studies, nonlinear programming can introduce complications, owing to large-scale decision variables, non-convex objective functions, and nonlinear constraints. These issues have been addressed by employing convex relaxation techniques, such as mixed-integer linear programming [5] and second-order cone programming [6], which reduce problem complexity through transformation but may compromise on the accuracy and global optimality of the solution. In recent years, distributed optimization algorithms have garnered increasing attention due to their flexibility and scalability. The alternating direction method of multipliers [7] enhances computational efficiency and ensures result accuracy through problem decomposition and node-level collaboration mechanisms, while consensus optimization [8] and Lagrangian dual relaxation [9] algorithms further improve overall performance and convergence speed by optimizing the iterative process. However, these approaches remain sub-optimal for upscaling due to their inability to lower computational costs and increase convergence rates, which are critical for real-time optimization.
Data-driven reinforcement learning (RL) techniques can identify complex dynamics by employing neural networks to capture multivariable and nonlinear relationships within a power system [10], thereby reducing the reliance on traditional power system models. Moreover, RL techniques can significantly enhance the computational efficiency and accuracy of power systems, providing a comprehensive solution for optimizing distributed renewable energy in modern DNs [11]. Researchers have employed various single-agent RL frameworks for optimizing DNs. For instance, the deep Q-network has been used for controlling grid voltage [12], the proximal policy optimization (PPO) algorithm has been used for scheduling time-coupled equipment (e.g., microturbines (MTs)) [13], and the deep deterministic policy gradient (DDPG) algorithm has been used for mitigating voltage limit violations caused by system uncertainties [14]. However, the limited scalability of these single-agent RL frameworks has prompted researchers to explore multi-agent (MA)-deep RL (DRL) frameworks, particularly those based on centralized training with decentralized execution (CTDE), which grants independent decision-making abilities to regulation equipment [15]. Li et al. [16] proposed a CTDE-based multi-agent DDPG (MADDPG) algorithm, which efficiently coordinates multiple distributed devices and addresses partial observability constraints, enhancing the flexibility and reliability of distribution systems; however, this algorithm shows limited adaptability under highly uncertain and dynamic environments. To address this limitation, Reference [17] introduced a multi-agent soft actor-critic (MASAC) algorithm, which enhances the ability of agents to cope with source-load uncertainty and complex power grids, thus significantly improving exploration capabilities, adaptability, and decision-making efficiency of distribution systems under complex environments. 
However, existing studies have not adequately addressed the impact of high uncertainty on the agent’s exploration capabilities and efficiency. These situations may expose agents to more unknown factors during the exploration process, affecting their learning speed and decision-making ability, and further restricting their ability to adapt and optimize performance in dynamic environments.
It is crucial to establish effective modeling and processing methods, such as scenario-based, chance-constrained programming, robust optimization, and interval optimization methods, to address the uncertainties in renewable energy-based DNs, particularly those arising from environmental and climate variations. Stochastic optimization methods generate multiple scenarios to simulate the impact of uncertain factors and provide their probability distribution in energy generation [18]. However, these methods struggle to balance computational efficiency with accuracy [19]. Robust optimization methods generate worst-case scenarios, based on the possible variations in uncertain factors, to ensure the feasibility of optimization strategies within the range of parameter perturbations [20]. However, these traditional methods primarily rely on boundary information (minimum and maximum values) of uncertain quantities, often leading to overly conservative optimization results [21]. Although studies have reduced the conservativeness of robust optimization through spatiotemporal correlation [22] and probability distribution [23], these improvements increase the complexity of the optimization model, limiting its application in large-scale systems. Interval optimization methods describe uncertain values as interval numbers and use interval arithmetic to determine the optimal interval [24]. Compared to stochastic optimization, interval optimization makes DNs efficient and less conservative. However, the effectiveness of interval optimization is closely associated with selecting the appropriate range of interval variables, which is crucial for achieving optimal results. In particular, obtaining reasonable interval variables is critical in scenarios with high source-load uncertainty.
Despite improving the economic efficiency and safety of power grid operations under source-load bilateral uncertainty, the existing DN optimization methods exhibit two significant limitations. First, the high uncertainty environment severely restricts the exploration efficiency and learning speed of optimization algorithms, leading to increased computational complexity and prolonged training cycles. Second, the existing methods fail to adequately consider the dynamic characteristics of source-load prediction errors across different timescales, resulting in overly conservative uncertainty intervals that reduce scheduling accuracy and efficiency.
In this study, we propose a multi-timescale optimal scheduling method that aims to enhance scheduling flexibility, accuracy, and computational efficiency by addressing the uncertainties in source-load prediction. The main contributions of this study are summarized as follows:
(1) The novel multi-timescale optimization scheduling method incorporates source-load forecast uncertainty intervals by analyzing the flexibility requirements at multiple timescales (day-ahead, intra-day, and real-time) to optimize grid operations, thereby ensuring economic efficiency.
(2) The improved SAC algorithm dynamically adjusts the weight of entropy in policy optimization through a self-regulating temperature coefficient and improves the learning efficiency of the agents by reducing unnecessary exploration.
(3) The CTDE-based MASAC framework effectively reduces the decision-making time in the optimization process, ensures real-time scheduling, and significantly improves the overall system efficiency.
2 Optimal Dispatch Strategy Based on the Uncertainty Intervals of Source-Load Forecast
2.1 Analysis of Source-Load Uncertainty and Grid Flexibility
The forecasting timescale plays a crucial role in determining the accuracy of source-load predictions while addressing uncertainty in DNs. The magnitude of forecast errors varies across forecasting timescales, significantly affecting source-load predictions [25] (Fig. 1). Generally, day-ahead forecasting uses a 1-h timescale and is influenced by various uncertainty factors, such as long-term weather variations, equipment degradation or potential malfunctions, and variations in consumer electricity demand, resulting in larger forecasting errors. Conversely, intra-day forecasting adopts timescales shorter than 1 h and is influenced by short-term factors, such as transient weather changes, temporary equipment failures, and short-term demand fluctuations, resulting in higher accuracy and narrower uncertainty intervals. Real-time forecasting employs even shorter timescales (~5 min), achieving the highest precision and the narrowest prediction intervals.
Figure 1: Comparison of predicted data across different timescales
2.1.1 Characterization of Uncertainty Intervals in Source-Load Forecasting
The variability range of uncertain variables significantly influences the optimization performance of power grid systems. In this study, we utilized the probability box (p-box) theory to accurately delineate the appropriate uncertainty intervals for source-load forecasting [26].
(1) Gaussian distribution assumption
The source-load prediction error is often represented as a normal distribution, owing to the central limit theorem, which states that when errors arise from numerous independent random factors, their probability distribution tends to be normal. This assumption has been widely validated in load forecasting studies. The relative prediction error data for Liaoyang, China (2022) conformed to a normal distribution (Fig. 2a). The statistical histogram aligned well with the probability density and cumulative probability functions of the fitted normal distribution, indicating that load prediction errors can be represented as a normal distribution, laying the foundation for subsequent statistical analysis and model optimization. Additionally, when the sample size is sufficiently large, the prediction errors of wind or photovoltaic (PV) power outputs tend to conform to a normal distribution, despite the influence of various factors [27]. Although the prediction errors of wind power can exhibit various distribution types, like beta, Cauchy, or Laplace distributions, their overall concentration trend can be successfully captured by a normal distribution (Fig. 2b). Similarly, the prediction errors of PV power, influenced by the stochastic and independent variations in solar radiation intensity, can be conventionally represented by a normal distribution [28] (Fig. 2c). Based on the aforementioned theoretical and empirical analyses, this study utilizes a normal distribution model to characterize source-load prediction errors.
Figure 2: Distribution of source-load prediction errors
(2) P-box-based interval extraction
The p-box theory represents uncertainty as interval-based probability distributions, yielding precise and conservative estimates. Particularly in cases with limited data availability or expert input, the p-box theory provides a validated uncertainty range for extracting source-load intervals. It accurately characterizes uncertainty, manages nonlinear and complex system dynamics, and offers a reliable foundation for informed decision-making.
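To make the p-box idea concrete, the following minimal sketch builds a parametric p-box for a normally distributed forecast error whose mean and standard deviation are only known to lie within intervals, and reads off a conservative error interval at a chosen confidence level. The function names, the parameter ranges, and the restriction to a normal family are illustrative assumptions, not the exact procedure of [26].

```python
from itertools import product
from statistics import NormalDist


def pbox_cdf_bounds(x, mu_range, sigma_range):
    """Return (lower, upper) CDF bounds at x over a family of normal distributions."""
    # For a normal family, the extremes at any x occur at corner parameter values.
    cdfs = [NormalDist(mu, sigma).cdf(x) for mu, sigma in product(mu_range, sigma_range)]
    return min(cdfs), max(cdfs)


def pbox_interval(confidence, mu_range, sigma_range):
    """Return an error interval that covers `confidence` for every family member."""
    tail = (1.0 - confidence) / 2.0
    corners = list(product(mu_range, sigma_range))
    lower = min(NormalDist(mu, sigma).inv_cdf(tail) for mu, sigma in corners)
    upper = max(NormalDist(mu, sigma).inv_cdf(1.0 - tail) for mu, sigma in corners)
    return lower, upper


# Hypothetical relative-error statistics: mean in [-1%, 1%], std in [3%, 5%].
low, high = pbox_interval(0.95, mu_range=(-0.01, 0.01), sigma_range=(0.03, 0.05))
print(f"95% error interval: [{low:.3f}, {high:.3f}]")
```

Because the interval is taken over the whole family, it is at least as wide as the interval of any single fitted distribution, which reflects the conservative character of the p-box description.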
The interval characterization was performed as follows: For a given scheduling day, wind turbine (WT) and PV outputs, as well as flexible load (FL) demand, were predicted using predictive algorithms. The resulting values, denoted as
with a confidence level of
After acquiring the forecasted values for PV, WT, and Load, the uncertainty intervals were assessed using the p-box theory. As indicated in [26], the net load (NL) was computed as follows:
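A minimal sketch of this step is given below, assuming that NL equals load demand minus PV and WT output and that each forecast carries a relative error interval. The function names and the numerical values are hypothetical and serve only to illustrate the interval arithmetic.

```python
def forecast_interval(forecast, rel_error_interval):
    """Turn a point forecast and a relative error interval into a (lower, upper) pair."""
    err_low, err_high = rel_error_interval
    a = forecast * (1.0 + err_low)
    b = forecast * (1.0 + err_high)
    return min(a, b), max(a, b)


def net_load_interval(load, pv, wt):
    """Interval arithmetic for NL = load - PV - WT; each argument is (lower, upper)."""
    # The lowest net load pairs the lowest demand with the highest renewable output.
    nl_lower = load[0] - pv[1] - wt[1]
    # The highest net load pairs the highest demand with the lowest renewable output.
    nl_upper = load[1] - pv[0] - wt[0]
    return nl_lower, nl_upper


# Hypothetical forecasts in kW with symmetric 10% error intervals.
load = forecast_interval(1000.0, (-0.10, 0.10))
pv = forecast_interval(200.0, (-0.10, 0.10))
wt = forecast_interval(300.0, (-0.10, 0.10))
print(net_load_interval(load, pv, wt))
```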
2.1.2 Incorporating Flexibility Requirements for Scheduling Intervals
Flexibility demand highlights the need for an adaptable grid dispatch to address the uncertainties in the source and load forecasts of power systems. Conversely, flexibility supply represents the capacity and availability of resources within a power system to ensure adaptability. Adequate flexibility supply enables the maintenance of flexibility balance across various temporal, spatial, and scale dimensions, ensuring that flexibility supply always meets or exceeds flexibility demand. In multi-timescale optimal dispatch, especially under forecast uncertainties, it is essential to anticipate and meet future flexibility demands during the preceding scheduling stage. By securing sufficient dispatch supply capacity or flexible reserve capacity at an earlier stage, the power system can maintain a flexibility balance and sustain supply-demand equilibrium, even amid significant fluctuations in source-load forecasts in subsequent phases.
The range of uncertain demand capacity in a power system can be evaluated by calculating the uncertainty intervals in source-load forecasting along with the predicted NL values. This aids in defining the extent of flexibility supply capacity. Meanwhile, the flexibility demand capacity can be evaluated by examining the changes in NL forecasts between preceding and current time points, along with their associated prediction intervals. The variation in the predicted NL was calculated using the following equation:
where
Since NL forecasts are inherently directional, the flexibility of the power system also follows a directional trend (Fig. 3), involving two key scenarios:
Figure 3: Flexibility demand diagram
The constraints on flexibility supply capacity can then be determined using Eq. (5) as
where a positive value indicates the potential to increase discharge capacity, while a negative value indicates the ability to reduce discharge and charge capacity.
When
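The directional flexibility demand discussed in this subsection can be sketched as follows. The sketch assumes that the upward demand is the rise from the current NL to the upper bound of the next NL interval, and that the downward demand is the drop to its lower bound. This is one plausible reading of the description, and the function name is hypothetical.

```python
def flexibility_demand(nl_current, nl_next_interval):
    """Return (upward, downward) flexibility demand for the next period."""
    lower, upper = nl_next_interval
    # Upward demand: net load may rise to the upper bound of the next interval.
    upward = max(upper - nl_current, 0.0)
    # Downward demand: net load may fall to the lower bound of the next interval.
    downward = max(nl_current - lower, 0.0)
    return upward, downward


# Hypothetical values in kW: current NL of 500 and a next-period interval of [450, 620].
print(flexibility_demand(500.0, (450.0, 620.0)))
```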
2.2 Multi-Agent RL Algorithm Based on Uncertainty Intervals in Supply and Load
RL comprises three key components: the agent, environment, and reward. The agent interacts with its environment through trial and error, aiming to maximize the cumulative reward. RL tasks can be modeled using a Markov decision process, which is characterized by four elements: state space (S), action space (A), state transition probability (P), and reward space (R) [29].
In this study, we employed a CTDE framework to optimize the dispatch of DNs handling large datasets. This approach extends single-agent DRL algorithms to multi-agent environments, allowing multiple agents to optimize the active DNs, while improving computational efficiency [30].
The SAC algorithm was used to develop a control model for active DNs and establish a system operation control strategy. During training, the SAC algorithm learns policies by balancing expected returns against information entropy, which reduces the risk of premature convergence to suboptimal solutions [29]. This framework enables agents to comprehensively understand source-load uncertainties, enhancing the system’s ability to optimize globally and adapt to these uncertainties. Furthermore, this algorithm incorporates agents’ action entropy into the value function, thereby encouraging exploration and improving stability. The action entropy of the agents can be defined as
where
where
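For illustration, the entropy term and the entropy-augmented return can be sketched for a one-dimensional Gaussian policy. This is the standard maximum-entropy RL formulation, and the Gaussian policy, function names, and numerical values are assumptions made for this example.

```python
import math
import random


def gaussian_log_prob(action, mu, sigma):
    """Log-density of a one-dimensional Gaussian policy at `action`."""
    z = (action - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def gaussian_entropy(sigma):
    """Closed-form entropy of a one-dimensional Gaussian policy."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)


def sampled_entropy(mu, sigma, n=20000, seed=0):
    """Monte Carlo estimate of the entropy as the mean of -log pi(a|s)."""
    rng = random.Random(seed)
    total = sum(gaussian_log_prob(rng.gauss(mu, sigma), mu, sigma) for _ in range(n))
    return -total / n


def soft_return(rewards, entropies, alpha, gamma):
    """Discounted sum of r_t + alpha * H_t, the entropy-augmented return."""
    return sum(gamma ** t * (r + alpha * h) for t, (r, h) in enumerate(zip(rewards, entropies)))


print(gaussian_entropy(0.5), sampled_entropy(0.0, 0.5))
```

A larger temperature coefficient `alpha` gives the entropy term more weight relative to the reward, which favors exploration.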
(1) Enhanced temperature coefficient update
The temperature coefficient adjusts with changes in environmental uncertainty, and its size significantly impacts the model’s capability to explore uncertainties, influencing exploration behavior and optimization efficiency during training. In this study, we proposed an automatic temperature coefficient adjustment method to refine uncertainty learning and minimize redundant exploration. This approach dynamically adjusts the agent’s temperature coefficient based on the level of environmental uncertainty. The update calculation for each step of the optimization process can be expressed as
The
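As a point of reference, the automatic temperature tuning of the standard SAC algorithm can be sketched as a gradient step on the logarithm of the temperature coefficient. The sketch below shows this baseline rule only and is not the improved update proposed in this study. The target entropy, learning rate, and function name are assumptions.

```python
import math


def update_temperature(log_alpha, log_probs, target_entropy, lr=0.01):
    """One gradient step on log(alpha) for the loss -log_alpha * (log_pi + target)."""
    mean_log_prob = sum(log_probs) / len(log_probs)
    # Gradient of the loss with respect to log_alpha.
    grad = -(mean_log_prob + target_entropy)
    new_log_alpha = log_alpha - lr * grad
    return new_log_alpha, math.exp(new_log_alpha)


# Policy entropy (about 0.3) is below the target (1.0), so alpha increases.
print(update_temperature(0.0, [-0.2, -0.4], target_entropy=1.0))
# Policy entropy (about 2.0) is above the target (1.0), so alpha decreases.
print(update_temperature(0.0, [-2.0, -2.0], target_entropy=1.0))
```

When the sampled policy entropy falls below the target, the temperature coefficient rises and exploration is encouraged. When the entropy exceeds the target, the coefficient falls and redundant exploration is suppressed.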
(2) SAC network
SAC employs an actor network to model the action policy (
where
where
The policy parameters,
where
where
To ensure stable operation of the SAC algorithm after every critic network parameter adjustment, the target critic network was subjected to a soft update, as follows:
where
Figure 4: An optimized scheduling framework based on an improved SAC algorithm
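The soft update of the target critic network described above is commonly implemented as Polyak averaging of the network parameters. A minimal sketch follows, in which parameters are plain lists of floats and the smoothing factor `tau` is an assumed value.

```python
def soft_update(target_params, online_params, tau=0.005):
    """Blend online parameters into the target: tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * w_target
            for w_target, w in zip(target_params, online_params)]


# Hypothetical two-parameter critic; tau is enlarged so the movement is visible.
print(soft_update([0.0, 1.0], [1.0, 1.0], tau=0.1))
```

A small `tau` makes the target network track the online critic slowly, which stabilizes the bootstrapped value targets.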
2.2.2 Design of CTDE-Based MASAC Network Model
The multi-agent training framework utilizes the CTDE framework to enhance efficacy. To ensure that each agent remains stable during the training process, the critic network incorporates additional data from the observations and actions of other agents. However, during execution, the actor network makes decisions based solely on the agent’s private observations. The CTDE framework is illustrated in Fig. 5.
Figure 5: Training diagram for multi-agent RL
During the training process, the actions, states, and rewards of each agent are stored in a replay buffer (
Within an environment, each agent, m (
Each agent has an independent actor-critic network, generating distinct actions (
In the experience replay buffer (
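The information flow of the CTDE scheme can be sketched as follows: a shared replay buffer stores joint transitions, the centralized critic receives the observations and actions of all agents, and each actor receives only its own observation. The class and function names are hypothetical, and the sketch omits the neural networks themselves.

```python
import random
from collections import deque


class SharedReplayBuffer:
    """Stores joint transitions (all agents' observations, actions, and rewards)."""

    def __init__(self, capacity, seed=0):
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def store(self, observations, actions, rewards, next_observations):
        self.buffer.append((observations, actions, rewards, next_observations))

    def sample(self, batch_size):
        return self.rng.sample(list(self.buffer), batch_size)


def critic_input(observations, actions):
    """Centralized training: the critic sees every agent's observation and action."""
    flat = []
    for obs in observations:
        flat.extend(obs)
    for act in actions:
        flat.extend(act)
    return flat


def actor_input(observations, agent_index):
    """Decentralized execution: each actor sees only its own observation."""
    return list(observations[agent_index])


obs = [[1.0, 2.0], [3.0], [4.0, 5.0]]  # three regional agents
acts = [[0.1], [0.2], [0.3]]
print(critic_input(obs, acts), actor_input(obs, 1))
```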
2.3 Multi-Timescale Scheduling Framework Based on Source-Load Uncertainty Intervals
In this study, we employed a multi-timescale optimization strategy for DNs to manage the fluctuations in source-load forecasting uncertainties across multiple timescales: day-ahead, intra-day, and real-time stages. An enhanced MASAC algorithm was employed to centrally train agents across different timescales, facilitating the development of an efficient DN optimization and control strategy. The training dataset was constructed based on the historical data, incorporating renewable energy and load data into the environmental model. The DN operational data served as the state input for the agents. The actions generated by the agents were transformed into control commands, and the rewards were calculated according to a predefined reward function. A comprehensive overview of the scheduling framework is depicted in Fig. 6.
Figure 6: Multi-timescale optimal scheduling framework
In the day-ahead stage, uncertainties primarily stem from deviations in medium- to long-term weather forecasts and random load variations. Based on historical prediction error data, the p-box theory can determine the potential ranges and probability distributions of prediction errors for PV power, wind power, and load. Since day-ahead forecasts are generally updated once per day, their error ranges remain relatively stable, encompassing uncertainties over an extended period. Consequently, day-ahead optimization can provide 24-h generation schedules and reserve capacities; however, its inability to capture short-term variations may lead to conservative and economically inefficient outcomes. Day-ahead optimization provides the basis for intra-day and real-time optimizations, ensuring the reliability and economic efficiency of the system over an extended period. The agent operates at a 1-h timescale, employing centralized optimization to maximize economic efficiency in daily power system operations. The uncertainty intervals of source-load forecasts are incorporated at this stage to generate 24-h dispatch strategies, including scheduling of distributed MTs, optimizing energy storage systems (ESSs), regulating flexible loads, and maintaining a degree of robustness.
In the intra-day stage, uncertainties primarily stem from short-term meteorological changes, such as cloud movement and wind speed fluctuations, as well as random load variations. As forecasts are updated every 15 min to 1 h at this stage, the p-box effectively captures short-term fluctuations to generate more accurate error intervals and probability distributions, enhancing uncertainty descriptions for rolling optimization. This ensures that generation schedules and reserve capacities are dynamically modified to optimize the output of controllable resources, like ESSs and MTs, based on uncertainties. Compared to the day-ahead stage, the intra-day stage has higher data update frequency and more flexible optimization strategies, which enhance the trade-off between economic efficiency and system reliability. Operating on a 15-min timescale, this approach employs offline centralized training and online decentralized execution for decision-making. It primarily revises day-ahead decisions based on intra-day power generation and load variations, generating 1-h dispatch strategies for MTs and ESSs and ensuring regional robustness and economic efficiency.
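The nesting of the three timescales can be sketched as index bookkeeping, using the 1-h day-ahead step, the 15-min intra-day step, and the 5-min real-time step mentioned in the text. The constant and function names are hypothetical.

```python
DAY_AHEAD_STEP_MIN = 60   # day-ahead resolution
INTRA_DAY_STEP_MIN = 15   # intra-day resolution
REAL_TIME_STEP_MIN = 5    # real-time resolution


def steps_per_day(step_min):
    """Number of control periods in 24 h for a given step length in minutes."""
    return 24 * 60 // step_min


def nested_schedule():
    """Yield (hour, quarter, five_min) indices in chronological order."""
    for hour in range(steps_per_day(DAY_AHEAD_STEP_MIN)):
        for quarter in range(DAY_AHEAD_STEP_MIN // INTRA_DAY_STEP_MIN):
            for five_min in range(INTRA_DAY_STEP_MIN // REAL_TIME_STEP_MIN):
                yield hour, quarter, five_min


print(steps_per_day(60), steps_per_day(15), steps_per_day(5))
```

Each day-ahead decision therefore frames four intra-day revisions, and each intra-day revision frames three real-time adjustments.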
3 Multi-Timescale Optimal Control Model Based on the Uncertainty Intervals of Source-Load Forecasting
The multi-timescale optimization and control model, incorporating the uncertainty intervals of source-load forecasting, aims to reduce grid generation costs, enhance local renewable energy integration, and mitigate source-load power fluctuations. Additionally, the model integrates flexibility requirements arising from source-load forecasting uncertainties at each timescale to ensure an adequate capacity for flexibility.
3.1 Day-Ahead Economic Optimal Control Model
In the day-ahead stage, a centralized optimization strategy is utilized to achieve optimal DN scheduling and maximize economic efficiency for the grid’s 24-h operation. This method ensures the grid’s safety and economic efficiency by regulating MTs, distributed ESSs, and FLs on an hourly basis.
3.1.1 Objective Function of Day-Ahead Optimization
The day-ahead economic optimization scheduling aims to reduce the daily DN operation costs, assuming the complete integration of WT and PV power generation. The optimization objective is defined as
where T represents the number of control periods within a regulation cycle and the terms
The constraints for the optimization of power system scheduling intervals are DN branch power flow constraints, power grid voltage magnitude constraints, MT output constraints, ESS charging/discharging power and capacity constraints, the FL constraint, the line transmission power constraint, and the flexibility capacity constraint.
(1) Branch power flow constraints in DNs
where
(2) Voltage magnitude constraints in a power grid
where
(3) MT output constraints
where
(4) ESS charging/discharging power and capacity constraints
At any time, the ESS can be in a charging or discharging state at a specified maximum power capacity [31].
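A minimal sketch of such ESS limits is given below. It assumes a sign convention in which positive power denotes discharging, a state of charge (SoC) expressed as a fraction of capacity, and constant charging and discharging efficiencies. All parameter names and default values are hypothetical, and the formulation in [31] may differ.

```python
def ess_step(soc, power, dt_h, capacity, p_max, soc_min, soc_max, eta_c=0.95, eta_d=0.95):
    """Apply one ESS step; returns (feasible power in kW, new SoC as a fraction)."""
    # Enforce the charging/discharging power limit.
    power = max(-p_max, min(p_max, power))
    if power >= 0.0:
        # Discharging: delivered energy is limited by the stored energy above soc_min.
        deliverable = (soc - soc_min) * capacity * eta_d
        power = min(power, deliverable / dt_h)
        new_soc = soc - power * dt_h / (eta_d * capacity)
    else:
        # Charging: absorbed energy is limited by the headroom below soc_max.
        absorbable = (soc_max - soc) * capacity / eta_c
        power = max(power, -absorbable / dt_h)
        new_soc = soc - power * dt_h * eta_c / capacity
    return power, new_soc


# Hypothetical 1000 kWh unit with a 200 kW limit, asked to discharge 100 kW for 1 h.
print(ess_step(0.5, 100.0, 1.0, 1000.0, 200.0, 0.2, 0.9))
```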
(5) FL constraint
The user side consists of various types of FLs, including interruptible loads and transferable loads. Interruptible loads can function as virtual power sources and can adopt control strategies similar to those used for distributed energy resources. Contrastingly, shiftable loads represent a specialized form of transferable loads, characterized by their ability to alter the timing of energy usage through price adjustments or incentive measures without reducing the total demand load. In this study, we examined shiftable loads to analyze FL regulation strategies in DNs. For this, we considered the invariance of the total load over the control cycle as the principal constraint, expressed as
Additionally, during each control period, the adjustable capacity of transferable loads was subject to certain constraints, defined by upper and lower power limits.
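The two shiftable-load conditions, an unchanged total over the control cycle and a bounded adjustment in each period, can be checked with a short sketch. The symmetric per-period bound `max_shift` and the function name are assumptions made for illustration.

```python
def shift_is_feasible(base_load, shifted_load, max_shift, tol=1e-6):
    """Check total-load invariance and per-period adjustment limits."""
    if len(base_load) != len(shifted_load):
        return False
    # Shifting only moves demand in time, so the cycle total must not change.
    total_ok = abs(sum(base_load) - sum(shifted_load)) <= tol
    # The adjustment in each control period must stay within its limit.
    bounds_ok = all(abs(s - b) <= max_shift + tol for b, s in zip(base_load, shifted_load))
    return total_ok and bounds_ok


# Hypothetical three-period profile in kW: 50 kW moved from period 3 to period 1.
print(shift_is_feasible([100.0, 200.0, 300.0], [150.0, 200.0, 250.0], max_shift=60.0))
```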
(6) Line transmission power constraint
(7) Flexibility capacity constraint
In the day-ahead stage, the scheduling strategy is adjusted using distributed MTs and ESSs. Therefore, the reserve capacity during this phase is derived from the reserve capacities of MTs and ESSs, expressed as
where
where
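As an illustration of the flexibility capacity constraint, the sketch below aggregates upward and downward reserve from MTs and ESSs and compares it with a directional flexibility demand. It assumes that MT reserve is the headroom to the output limits and that the ESS reserve is supplied as precomputed headroom values. The function names and numbers are hypothetical.

```python
def reserve_capacity(mt_output, mt_min, mt_max, ess_up, ess_down):
    """Return (upward, downward) reserve in kW from MTs and ESSs."""
    # Upward reserve: MT headroom to maximum output plus ESS upward headroom.
    up = sum(p_max - p for p, p_max in zip(mt_output, mt_max)) + sum(ess_up)
    # Downward reserve: MT margin above minimum output plus ESS downward headroom.
    down = sum(p - p_min for p, p_min in zip(mt_output, mt_min)) + sum(ess_down)
    return up, down


def reserve_is_sufficient(reserve, demand):
    """Both directions of reserve must cover the corresponding demand."""
    return reserve[0] >= demand[0] and reserve[1] >= demand[1]


reserve = reserve_capacity([100.0, 150.0], [50.0, 50.0], [200.0, 200.0], [40.0], [30.0])
print(reserve, reserve_is_sufficient(reserve, (120.0, 50.0)))
```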
3.1.3 Day-Ahead Scheduling Model
(1) State space
The state space (S) for the k-th agent within region k was constructed by selecting the active power output of k MT units (
(2) Action space
Since day-ahead regulation targets the ESSs, distributed generators, FLs, and PV and WT power system inverters, its action space (A) is defined as follows:
(3) Reward function
The SAC policy network aims to derive strategies that maximize the reward. Therefore, to minimize the objective function, the reward values (R) were set as the negative of the original objective function, as follows:
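A minimal sketch of this reward construction is shown below. The negation of the objective follows the text, whereas the cost-term names and the penalty on constraint violations are assumptions added for illustration.

```python
def day_ahead_reward(cost_terms, constraint_violations, penalty_weight=100.0):
    """Reward is the negative of the objective (costs plus an assumed violation penalty)."""
    operating_cost = sum(cost_terms.values())
    # Only positive entries count as violations; the penalty term is an assumption.
    penalty = penalty_weight * sum(max(v, 0.0) for v in constraint_violations)
    return -(operating_cost + penalty)


# Hypothetical cost components for one control period.
print(day_ahead_reward({"grid_purchase": 120.0, "mt_fuel": 80.0}, [0.0, -0.5]))
```

Since the agent maximizes the cumulative reward, maximizing the negated objective is equivalent to minimizing the operation cost.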
3.2 Intra-Day Optimization and Control Model
Active DNs are divided into several autonomous regions based on geographical conditions and regional consumption patterns. Each autonomous region is equipped with a regional agent that issues dispatch instructions to the controllable devices within its area. These agents utilize ultra-short-term forecasts of renewable energy generation, FL, and device status within the region to optimize their operations [32].
The regional agents employ CTDE-based neural networks, wherein the network parameters for each agent are obtained through centralized training, following which the trained agents are deployed for distributed control. In the t-th scheduling interval, regional agent (k) employs a policy network to determine a dispatch decision (
3.2.1 Objective Function of Intra-Day Optimization
The intra-day optimization strategy refines the day-ahead grid plan to mitigate challenges associated with energy supply and renewable energy deficits, which arise from forecasting errors. By adjusting MTs and ESSs, this strategy maintains the grid’s supply-demand equilibrium, enhances local renewable energy integration, upholds the grid’s economic efficiency, and ensures sufficient real-time reserve capacity. Intra-day optimization focuses on the economic adjustment target and flexibility capacity constraints during intra-day operations.
The intra-day economic adjustment target primarily comprises the adjustment costs associated with MTs and ESSs to ensure the long-term economic efficiency of grid operations.
The intra-day flexibility capacity constraints primarily pertain to the reserve capacity of ESSs and ensure that ESSs can flexibly mitigate fluctuations in WT and PV power during real-time optimization.
3.2.2 Intra-Day Reserve Capacity Constraint
It is essential to regulate ESS devices during real-time optimization. The reserve capacity constraint of ESS is solely dependent on its upper and lower capacity limits as
3.2.3 Intra-Day Scheduling Model
(1) State space
Intra-day economic optimization primarily focuses on internal regional regulation. The state space (S) for the k-th agent within region k was constructed by selecting the active power output of MTs (
(2) Action space
Since intra-day regulation targets the ESS and distributed generators, its action space (A) was defined as follows:
(3) Reward function
The regional reward value serves as the multi-agent reward, with the reward for the k-th agent (R) defined as
3.3 Real-Time Optimization Scheduling Model
Real-time optimization aims to mitigate power fluctuations caused by wind and solar-distributed energy resources and loads. This is achieved by integrating high-precision, ultra-short-term (5-min) forecast data with source-load forecast errors, enabling effective real-time DN control.
3.3.1 Objective Function of Real-Time Optimization
The effectiveness of power fluctuation mitigation is assessed by evaluating the difference in power deviation within the region before and after regulation.
where
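One way to quantify this difference is to compare the mean absolute deviation of regional power from a reference profile before and after regulation. The sketch below follows this reading. The choice of the mean absolute deviation, the function names, and the numerical values are assumptions.

```python
def mean_abs_deviation(actual, reference):
    """Mean absolute deviation between actual and reference power profiles."""
    return sum(abs(a - r) for a, r in zip(actual, reference)) / len(actual)


def fluctuation_reduction(before, after, reference):
    """Positive values mean that regulation reduced the power deviation."""
    return mean_abs_deviation(before, reference) - mean_abs_deviation(after, reference)


# Hypothetical 5-min regional power samples in kW.
reference = [100.0, 100.0, 100.0]
before = [110.0, 90.0, 105.0]
after = [102.0, 99.0, 100.0]
print(fluctuation_reduction(before, after, reference))
```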
3.3.2 Real-Time Scheduling Model
(1) State space
Real-time economic optimization primarily focuses on regional internal regulation. The state space (S) for the k-th agent within region k was constructed by selecting PV output (
(2) Action space
Since real-time regulation targets the ESS, its action space (A) is defined as follows:
(3) Reward function
The multi-agent reward is based on the regional reward value, with the reward (R) for the k-th agent defined as
In this study, we utilized MATLAB/SIMULINK to simulate a multi-timescale coordinated optimization dispatch model for power systems, accounting for the uncertainty intervals in source-load forecasts. The simulations were performed on a system equipped with an Intel(R) Core(TM) i7-10510U CPU. An improved IEEE 33-bus expanded system (Fig. 7) was used to validate the model for active DNs set at 12.66 kV reference voltage.
Figure 7: Network topology of the improved IEEE 33-node simulation system
The test system was divided into three autonomous regions, as depicted in Fig. 7. The PV units, WT units, MTs, ESSs, and FLs were positioned at nodes {17, 32}, {19, 30}, {10, 24, 28}, {10, 24, 28}, and {33}, respectively. Fig. 8 illustrates the 24-h source-load forecast data and Table 1 provides the operational parameters of the equipment.
Figure 8: 24-h source-load forecasting data
The performance of DRL algorithms is dependent on network architecture and hyperparameters. Network architecture influences the model’s representational capacity and policy approximation accuracy, while hyperparameters significantly influence the model’s learning efficiency and convergence behavior. Furthermore, key parameters like learning rate and discount factor play a crucial role in determining the algorithm’s outcome. Table 2 provides detailed configurations of network architecture and hyperparameters.
4.2 Comparison of Algorithm Convergence and Decision-Making Time
The SAC algorithm is a highly effective DRL method that efficiently balances the exploration-exploitation trade-off by maximizing both the expected reward and entropy of a policy, thus enhancing learning efficiency and performance. In this study, we introduced an improved adaptive temperature coefficient mechanism within the SAC framework, allowing dynamic temperature coefficient adjustments. This modification allows the algorithm to better accommodate environmental uncertainties and task-specific demands, leading to faster convergence and increased stability. We subsequently conducted a comparative analysis of the standard and modified SAC algorithms under identical environmental conditions and determined the rewards accumulated by the agent during the training process (Fig. 9).
Figure 9: Convergence analysis of the SAC algorithm
The agents were trained for a total of 1000 episodes. As illustrated in Fig. 9, the modified SAC algorithm reached high rewards within 100 episodes, whereas the standard SAC algorithm reached a similar performance level only after approximately 400 episodes. These results demonstrate the faster convergence and learning speed of the modified SAC algorithm compared to the standard SAC algorithm. To further evaluate the decision-making speed of the MASAC algorithm, we integrated the SAC, MASAC, and particle swarm optimization (PSO) algorithms into the intra-day optimization model. The results revealed that the MASAC algorithm had a shorter decision-making time than the SAC algorithm and significantly outperformed the PSO algorithm (Table 3).
4.3 Analysis of Scheduling Outcomes
(1) Analysis of day-ahead scheduling outcomes
In the day-ahead stage, we optimized source-load-storage coordination based on the forecast data (Fig. 10), yielding optimal output schedules for various flexibility resources. In this study, we adopted a centralized scheduling approach for day-ahead dispatch.
Figure 10: Graph of optimization results in the last 24 h
In the day-ahead scheduling model, full utilization of wind and solar resources was achieved through the coordinated optimization of source, load, and storage. The proactive interaction of flexible resources reduced the system’s overall operational costs during the intra-day stage. As seen in Fig. 11, the FLs originally scheduled between 13:00 and 18:00 were redistributed to the 8:00 to 13:00 period after optimization, achieving peak shaving and load leveling. As observed in Fig. 12, the day-ahead output of MTs and ESSs across the regions showed minimal power fluctuations between 10:00 and 20:00. In Region A, the close proximity to the main bus and the heightened sensitivity to electricity prices resulted in a reduction in power output between 2:00 and 4:00, followed by an increase from 4:00 to 6:00 (Fig. 12a). Conversely, in Region B, the significant share of distributed renewable energy led to a comparatively lower total output from MTs and ESSs (Fig. 12b). This discrepancy can likely be explained by the inherent uncertainties of renewable energy generation, which may have affected the overall system performance. However, both regions showed a rapid load increase from 4:00 to 6:00, and the resulting high load demand required increased output from the MTs and ESSs (Fig. 12). Subsequently, the total output decreased in response to increasing load uncertainty in order to preserve flexible capacity. ESS charging was predominantly driven by electricity prices, especially before 5:00 and after 20:00.
Figure 11: Comparative analysis of load data before and after optimization
Figure 12: Output profiles of MTs and ESSs across different regions
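The load redistribution described above comes from the optimization model. The sketch below only illustrates the bookkeeping behind such a shift: energy moved out of the 13:00 to 18:00 window is added to the 8:00 to 13:00 window, so the daily energy is conserved while the peak drops. The hourly profile, the shifted fraction, and the function name are hypothetical.

```python
def shift_flexible_load(load, src_hours, dst_hours, fraction):
    """Move `fraction` of the load in `src_hours` evenly onto `dst_hours`.

    `load` is an hourly profile (list of 24 values); the result has the
    same daily energy as the input.
    """
    shifted = list(load)
    moved = 0.0
    for h in src_hours:
        delta = fraction * load[h]
        shifted[h] -= delta
        moved += delta
    for h in dst_hours:
        shifted[h] += moved / len(dst_hours)
    return shifted

# Hypothetical profile: flat 1.0 MW with a 3.0 MW block from 13:00 to 18:00.
profile = [1.0] * 24
for hour in range(13, 18):
    profile[hour] = 3.0
levelled = shift_flexible_load(profile, range(13, 18), range(8, 13), 0.5)
```

In the actual model the shifted amount per period is a decision variable bounded by the FL constraints, rather than a fixed fraction.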
Fig. 13 illustrates the daily flexibility supply and demand of the system. The flexibility demand consistently exceeded the flexibility supply, with the demand increasing toward the middle of the day and the supply following closely. This may be attributed to the Gaussian distribution of the PV, WT, and FL forecast data, in which the central interval exhibited larger forecast fluctuations than the outer intervals. Additionally, since PV power generation was concentrated between 6:00 and 18:00, power fluctuations were more pronounced during this period.
Figure 13: Comparative analysis of flexibility capacity and demand profiles
(2) Analysis of intra-day scheduling outcomes
The intra-day optimization process builds upon the existing day-ahead scheduling strategy by adjusting the operational plans for MTs and ESSs. This ensures the balance between supply and demand within the grid while maximizing renewable energy utilization. Fig. 14 illustrates the variations in distributed MT output between day-ahead and intra-day operations across different regions, highlighting the significant impact of load fluctuations. The turbine output decreased between 9:00 and 13:00, corresponding to a notable decrease in load demand relative to the day-ahead predictions. Conversely, during other periods, the load demand exceeded the day-ahead forecasts, resulting in higher intra-day turbine output. Notably, Region 2, which integrates both PV and WT, exhibited greater output volatility than the other regions.
Figure 14: Convergence analysis of the algorithm
(3) Analysis of real-time scheduling outcomes
The real-time correction phase involves rapid resource re-optimization, focusing primarily on distributed ESSs. Real-time scheduling optimization relies on intra-day rolling schedules to minimize adjustments and accommodate stochastic supply-demand fluctuations. Fig. 15 illustrates the regulation outcomes of the ESS, showing significant discrepancies between the real-time and day-ahead outcomes, which may be attributed to substantial variations between day-ahead and intra-day forecast data. The real-time phase focuses on mitigating the underutilization of renewable energy caused by supply-demand fluctuations by introducing minor but frequent adjustments to the ESS strategy.
Figure 15: ESS scheduling strategy across multiple timescales
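The small, frequent ESS corrections in the real-time stage must still respect the storage energy balance. The sketch below uses a common textbook form of that balance with charging efficiency, discharging efficiency, and self-discharge rate, followed by clipping to the capacity limits. It is an assumption about the general form, not the exact constraint set of this paper, and all parameter values are hypothetical.

```python
def ess_step(energy, p_charge, p_discharge, capacity,
             eta_c=0.95, eta_d=0.95, self_discharge=0.0,
             dt=0.25, soc_min=0.1, soc_max=0.9):
    """Advance the stored energy (MWh) by one interval of length dt (h).

    E_next = (1 - sigma) * E + eta_c * P_ch * dt - P_dis * dt / eta_d,
    then clipped to [soc_min, soc_max] * capacity.
    """
    new_energy = ((1.0 - self_discharge) * energy
                  + eta_c * p_charge * dt
                  - p_discharge * dt / eta_d)
    lower, upper = soc_min * capacity, soc_max * capacity
    return min(max(new_energy, lower), upper)
```

A 15-minute step (dt = 0.25) is used here only as an example of a real-time interval; in an optimization model the limits would normally enter as constraints rather than as clipping after the fact.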
To validate the scalability and adaptability of the proposed MASAC network model in larger DN systems, we constructed a 97-node DN system based on the original 33-node system through topology extension and parameter adjustment [33]. This expanded system retained the fundamental characteristics of the original system while tripling its scale, providing a more challenging test environment for evaluating algorithm performance. During the expansion process, we strictly adhered to the topological connection rules and electrical parameter constraints of DNs to ensure that the expanded system remained physically meaningful and feasible from an engineering standpoint. Like the 33-node system, the expanded system used a partition management strategy and consisted of three interconnected regions, with similar operational parameters for the distributed energy resources (PV power, WT power, and MTs) and ESSs. This design ensured the comparability of the experiments and provided a suitable platform for studying the collaborative control capabilities of multi-agent systems in larger networks. The topology of the 97-node DN system is illustrated in Fig. 16.
Figure 16: Schematic diagram of the 97-node DN system
(1) Analysis of scheduling outcomes
A centralized optimization approach was employed to achieve optimal output schemes for various flexible resources through coordinated optimization of sources, loads, and storage, maximizing the utilization of wind and solar resources. Fig. 17a presents the data for each scheduling resource, and Fig. 17b illustrates the load operation status before and after regulation. The algorithm redistributes some of the FLs, originally concentrated between 13:00 and 18:00, to the 8:00 to 13:00 timeframe to achieve peak shaving and valley filling. As seen in Fig. 17c, the system’s flexibility supply and demand fluctuate throughout the day, with the supply consistently exceeding demand across all time periods. However, flexibility demand increased as it approached midday, aligning more closely with flexibility supply.
Figure 17: Day-ahead optimization outcomes
(2) MT output performance
Fig. 18 illustrates the variations in the output of distributed MTs during day-ahead and intra-day dispatching. Fig. 18a and b, 18c and d, and 18e and f depict the day-ahead scheduled output and intra-day actual output curves for MTs in Regions 1, 2, and 3, respectively. Based on the day-ahead scheduling outcomes, agents dynamically adjust MT output strategies in response to real-time load demands and fluctuations in WT and PV power generation, ensuring efficient utilization of distributed energy resources. Additionally, the algorithm demonstrates robust adaptability across regions of varying sizes. The results indicate that the proposed multi-agent optimization algorithm applies to both small- and large-scale DN systems, laying a foundation for its future application in more complex DN scenarios.
Figure 18: MT scheduling outcomes
(3) Energy storage scheduling outcomes
Although real-time scheduling involves comparatively smaller adjustments, its higher regulation frequency allows for rapid responses to system fluctuations and efficient integration of renewable energy resources (Fig. 19). This regulatory characteristic was validated using the 97-node DN system, which demonstrated that the proposed multi-agent optimization algorithm possesses excellent scalability and adaptability. Our experimental results further demonstrate that the framework effectively regulates power DNs with high renewable energy integration. Altogether, our study shows that the MASAC model displays remarkable adaptability, robustness, and scalability in multi-agent environments, particularly in complex power distribution systems with an increasing number of nodes and distributed energy resources. These findings validate the model’s reliability and stability during scale expansion and provide theoretical support and practical feasibility for its application in large-scale ESSs.
Figure 19: Comparison of multi-timescale ESS scheduling strategies
(1) Robustness analysis
To comprehensively evaluate the robustness of the proposed multi-timescale optimization scheduling framework, we assessed its performance under varying day-ahead and intra-day source-load forecasting uncertainties. As shown in Tables 4 [34] and 5, this experiment employed identical base data to generate low, medium, and high uncertainty scenarios, with adjusted forecast fluctuation thresholds for WT and PV power generation and FL, corresponding to specific probabilistic distribution models. Our findings validate the effectiveness and reliability of the model for real-world applications.
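To make the scenario construction concrete, the sketch below derives a net-load interval from point forecasts and per-source fluctuation thresholds using plain interval arithmetic. The symmetric relative band, the threshold values, and the forecast numbers are hypothetical placeholders and do not reproduce the settings in the tables referenced above.

```python
def interval(forecast, threshold):
    """Symmetric relative band around a point forecast."""
    return forecast * (1.0 - threshold), forecast * (1.0 + threshold)

def net_load_interval(load, pv, wt, thresholds):
    """Bounds on net load = load - PV - WT under interval uncertainty."""
    load_lo, load_hi = interval(load, thresholds["load"])
    pv_lo, pv_hi = interval(pv, thresholds["pv"])
    wt_lo, wt_hi = interval(wt, thresholds["wt"])
    # Lowest net load: low demand with high renewable output, and vice versa.
    return load_lo - pv_hi - wt_hi, load_hi - pv_lo - wt_lo

# Hypothetical thresholds for the three uncertainty levels.
SCENARIOS = {
    "low": {"load": 0.05, "pv": 0.05, "wt": 0.05},
    "medium": {"load": 0.10, "pv": 0.10, "wt": 0.10},
    "high": {"load": 0.20, "pv": 0.20, "wt": 0.20},
}

for name, th in SCENARIOS.items():
    print(name, net_load_interval(100.0, 20.0, 10.0, th))
```

A wider net-load interval translates into a larger flexibility demand range for the same base data, which is why cost and flexibility metrics are compared across the three levels.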
Table 6 provides an overview of dispatch costs, utilization rates, and capacity/demand ratios under various scenarios. These results reveal that the total system cost was significantly affected by day-ahead and intra-day purchase costs. The costs decreased significantly with a decrease in uncertainty. Specifically, the high uncertainty scenario incurred the highest costs, with total and day-ahead costs exceeding those of the moderate scenario by 3240.2 USD and 3664.8 USD, respectively. Meanwhile, the low uncertainty scenario offered further reductions of 1869.1 USD and 2427.1 USD in total and day-ahead costs, compared to the moderate scenario.
The high uncertainty scenario faced various challenges, such as increased external electricity purchases and scheduling difficulties, which were attributed to insufficient flexibility of ESSs and MTs. This was reflected in a lower capacity/demand ratio of 0.9, indicating underutilization of resources. In contrast, the moderate uncertainty scenario achieved a 100% resource utilization rate and a 1.3 capacity/demand ratio, suggesting a more balanced allocation of resources, resulting in significant cost savings and stability. Lastly, the low uncertainty scenario showed further optimization of resource utilization and efficiency. However, as low uncertainty conditions are less common in real-world settings, the moderate uncertainty strategy was deemed to be more practical and robust for real-world applications. In conclusion, the proposed multi-timescale optimization dispatch framework exhibited robust performance across varying levels of uncertainty. While high uncertainty scenarios pose cost and flexibility challenges, moderate uncertainty scenarios effectively maintain cost, resource utilization, and system stability, demonstrating high adaptability and efficiency of the framework in managing source-load forecast uncertainties.
(2) Comparison of results
By considering the source-load uncertainty range, the proposed method can better calculate the flexibility demand capacity, ensuring a stable power supply. To evaluate the significance of the proposed method, three multi-timescale scheduling methods are used as control groups. Method 1 disregards the source-load uncertainty range and uses deterministic optimization. Method 2 employs robust optimization to account for the worst-case scenario in the day-ahead optimization, while intra-day and real-time adjustments are made using a rolling optimization approach [34]. Method 3 uses the Copula function in the day-ahead stage to generate scenarios that describe the uncertainty in the source-load prediction data, with intra-day and real-time rolling adjustments [35]. Method 4 is the method proposed in this paper. For each method, we calculated the grid cost, renewable energy absorption, and flexibility capacity, as shown in Table 7.
The results show that Method 4 (the method proposed in this paper) achieves the lowest combined day-ahead and intra-day cost. Compared with Method 1, which has the highest cost, the cost is reduced by $364.89, or about 11.95%. In terms of the combined daily grid electricity cost, Method 4 is the lowest at $2406.97, a decrease of $114.19 (4.53%) compared to Method 2 and $98 (3.91%) compared to Method 3. In terms of renewable energy absorption, Methods 2, 3, and 4 all achieve a 100% absorption rate, whereas Method 1 reaches only 87%, which shows that the proposed method uses renewable energy more effectively than Method 1. In terms of capacity requirements, Method 4 requires a smaller flexible demand capacity (1.3), 13.3% lower than Method 2, while maintaining power system stability. Overall, the experimental results show that, under medium uncertainty conditions, the proposed method reduces the comprehensive cost by about 12%, raises the renewable energy absorption rate to 100%, and ensures a stable power supply with a low flexible capacity of 1.3, indicating significant application potential and practical value in multi-timescale scheduling.
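The percentage reductions quoted above can be cross-checked from the stated cost of Method 4 and the stated absolute reductions. In the sketch below, the baseline costs of Methods 2 and 3 and the capacity value of Method 2 are inferred from those figures, not read from Table 7, and the helper names are hypothetical.

```python
def baseline_from_reduction(cost, reduction):
    """Cost of the comparison method, given our cost and the saving."""
    return cost + reduction

def percent_reduction(cost, reduction):
    """Saving expressed as a percentage of the comparison method's cost."""
    return 100.0 * reduction / baseline_from_reduction(cost, reduction)

method4_cost = 2406.97  # USD, stated in the text

# Inferred baselines (assumption: reduction = baseline - Method 4 cost).
method2_cost = baseline_from_reduction(method4_cost, 114.19)
method3_cost = baseline_from_reduction(method4_cost, 98.0)

print(round(percent_reduction(method4_cost, 114.19), 2))
print(round(percent_reduction(method4_cost, 98.0), 2))

# A capacity of 1.3 that is 13.3% lower implies this reference value.
method2_capacity = 1.3 / (1.0 - 0.133)
print(round(method2_capacity, 2))
```

The rounded results agree with the quoted 4.53%, 3.91%, and a Method 2 flexible capacity of about 1.5.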
5 Conclusions and Future Perspectives
This study proposes a multi-timescale DN optimization scheduling method that considers uncertainty intervals, which is solved using an improved Soft Actor-Critic algorithm and a Multi-Agent SAC algorithm. This approach effectively addresses the challenges posed by source-load prediction uncertainties. The main contributions are as follows:
(1) Uncertainty Interval Optimization Method: By integrating uncertainty intervals of source-load predictions, this method resolves the flexibility requirements and capacity constraints in power grid scheduling, significantly enhancing the system’s economic efficiency and renewable energy accommodation capability.
(2) Improved SAC Algorithm: Through adaptive adjustment of the temperature coefficient, the exploration capability of the agent is enhanced, making the training process more stable while reducing policy fluctuations.
(3) Multi-Agent Real-Time Optimization: The multi-agent model based on the CTDE framework significantly reduces the computation time for intra-day and real-time strategy generation, ensuring the real-time performance of scheduling strategies.
When applying the methodology presented in this paper to the optimization scheduling of real-world DNs, we encounter a range of complex challenges. Firstly, the training of agents is highly dependent on a substantial amount of authentic DN simulation data, which is often difficult to obtain due to confidentiality and security requirements. Although our improved SAC algorithm partially mitigates the dependency on a fixed DN framework, the need for high-quality training data remains a critical issue. To address these challenges, future advancements must focus on the development of innovative data generation and simulation techniques, as well as enhancing the algorithm’s adaptability under incomplete data conditions. Furthermore, interdisciplinary collaboration and the development of new technological tools will play a crucial role in facilitating the practical application of these techniques within DN systems. These efforts will ultimately contribute to improving the operational efficiency of power distribution systems and accelerating the integration and application of renewable energy sources.
Acknowledgement: Not applicable.
Funding Statement: This research was funded by Jilin Province Science and Technology Development Plan Project, grant number 20220203163SF.
Author Contributions: The authors confirm their contribution to the paper as follows: study conception and design: Huanan Yu, Shiqiang Li; data collection: Jinling Li, Jing Bian; analysis and interpretation of results: Chunhe Ye, Shiqiang Li, He Wang, Huanan Yu; draft manuscript preparation: Chunhe Ye, Shiqiang Li, Huanan Yu. All authors reviewed the results and approved the final version of the manuscript.
Availability of Data and Materials: The authors confirm that the data supporting the findings of this study are available within the article. Additional data that support the findings of this study are available from the corresponding author upon reasonable request.
Ethics Approval: This study did not involve any human or animal subjects.
Conflicts of Interest: The authors declare no conflicts of interest to report regarding the present study.
Abbreviations
DNs | Distribution networks |
PV | Photovoltaic |
WT | Wind turbine |
MT | Micro gas turbine |
ESS | Energy storage system |
DRL | Deep reinforcement learning |
FL | Flexible load |
NL | Net load |
SAC | Soft actor-critic |
SOC | State of charge |
Indices | |
Index of distributed generation units | |
Index of dispatch periods | |
X | Index of constraints |
T | The number of control periods within a regulation cycle |
Parameters | |
The reward discount factor | |
The hyperparameter for the soft update of the target network in the control system | |
The unit electricity price | |
Distributed generator electricity price | |
Loss cost coefficient | |
ESS regulation cost coefficient | |
Flexible load regulation cost coefficient | |
The charging efficiency of the ESS | |
The discharging efficiency of the ESS | |
The self-discharge rate of the ESS | |
The total capacity of the ESS | |
The total number of FLs | |
The lower bounds of the constraints for Project X | |
The upper bounds of the constraints for Project X | |
The minimum voltage allowed by the system | |
The maximum voltage allowed by the system | |
The maximum and minimum thresholds for active power output | |
The maximum and minimum thresholds for active power output | |
The maximum upward ramp rates for MT | |
The maximum downward ramp rates for MT | |
The maximum reactive power output limits for MT | |
The minimum reactive power output limits for MT | |
The upper capacity limits of ESS | |
The lower capacity limits of ESS | |
The maximum permissible charging power for BESS | |
The maximum permissible discharging power for BESS | |
The penalty coefficient for project X | |
The intraday adjusted output price for MT | |
The intraday adjusted output price for ESS | |
Variables | |
The discrepancy between the predicted values and the actual values | |
The random variable of probability box theory | |
The unit electricity price of time t | |
The forecast or actual output of a PV power station | |
The forecast or actual output of a WT power station | |
The forecast or actual demand power of load | |
The NF forecast or actual net demand power | |
The upper bound of the cumulative probability distribution function | |
The lower bound of the cumulative probability distribution function | |
The upper boundaries of the flexibility demand range | |
The lower boundaries of the flexibility demand range | |
Upper bound of the flexibility demand range under NF | |
Lower bound of the flexibility demand range under NF | |
Upper bound of the grid flexibility capacity range | |
Lower bound of the grid flexibility capacity range | |
The temperature coefficient at time t | |
The temperature coefficient after updating at time t | |
The parameters of the Actor network | |
The parameters of critic network i | |
The replay buffer for updating the sample set | |
The exchange power between the distribution network and the upstream grid | |
The discharge power of MT | |
Active power losses in the distribution network | |
Charging power of the energy storage | |
Discharging power of the energy storage | |
The active power injected into node i at time t | |
The reactive power injected into node i at time t | |
Original power of the i-th type of FL | |
The power after regulation of FLs at time t | |
The reactive power output of MT | |
The variation in MT output | |
The variation in the ESS | |
ESS capacity at time t | |
The micro gas turbine adjusts its output based on the current control actions at time t | |
The ESS adjusts its output according to the current control actions at time t | |
The post-regulation regional power imbalance |
References
1. Singh S, Singh S. Advancements and challenges in integrating renewable energy sources into distribution grid systems: a comprehensive review. J Energy Resour Technol. 2024;146(9):090801. doi:10.1115/1.4065503.
2. Tao Y, Qiu J, Lai S, Zhao J, Xue Y. Carbon-oriented electricity network planning and transformation. IEEE Trans Power Syst. 2021;36(2):1034–48. doi:10.1109/TPWRS.2020.3016668.
3. Fang Y, Han J, Du E, Jiang H, Fang Y, Zhang N, et al. Electric energy system planning considering chronological renewable generation variability and uncertainty. Appl Energy. 2024;373(2):123961. doi:10.1016/j.apenergy.2024.123961.
4. Gbadega PA, Sun YX. JAYA algorithm-based energy management for a grid-connected micro-grid with PV-wind-microturbine-storage energy system. Int J Eng Res Afr. 2023;63(1):159–84. doi:10.4028/p-du1983.
5. Su J, Anokhin D, Dehghanian P, Lejeune MA. On the use of mobile power sources in distribution networks under endogenous uncertainty. IEEE Trans Control Netw Syst. 2023;10(4):1937–49. doi:10.1109/TCNS.2023.3256278.
6. Chowdhury MMUT, Biswas BD, Kamalasadan S. Second-order cone programming (SOCP) model for three phase optimal power flow (OPF) in active distribution networks. IEEE Trans Smart Grid. 2023;14(5):3732–43. doi:10.1109/TSG.2023.3241216.
7. Kiani S, Sheshyekani K, Dagdougui H. ADMM-based hierarchical single-loop framework for EV charging scheduling considering power flow constraints. IEEE Trans Transp Electrif. 2023;10(1):1089–1100. doi:10.1109/TTE.2023.3269050.
8. Li S, Bao G, Zhang A. Consensus-based distributed coordinated operation of active distribution networks with electric heating loads. Int J Electr Power Energy Syst. 2023;153(4):109393. doi:10.1016/j.ijepes.2023.109393.
9. Ruan H, Gao H, Liu Y, Wang L, Liu J. Distributed voltage control in active distribution network considering renewable energy: a novel network partitioning method. IEEE Trans Power Syst. 2020;35(6):4220–31. doi:10.1109/TPWRS.2020.3000984.
10. Wu Z, Li Y, Gu W, Dong Z, Zhao J, Liu W, et al. Multi-timescale voltage control for distribution system based on multi-agent deep reinforcement learning. Int J Electr Power Energy Syst. 2023;147(7):108830. doi:10.1016/j.ijepes.2022.108830.
11. Dong G, Chen Z. Data-driven energy management in a home microgrid based on Bayesian optimal algorithm. IEEE Trans Ind Inform. 2018;15(2):869–77. doi:10.1109/TII.2018.2820421.
12. Park K, Moon I. Multi-agent deep reinforcement learning approach for EV charging scheduling in a smart grid. Appl Energy. 2022;328(1):120111. doi:10.1016/j.apenergy.2022.120111.
13. Wang G, Sun Y, Li J, Jiang Y, Li C, Yu H, et al. Dynamic economic scheduling with self-adaptive uncertainty in distribution network based on deep reinforcement learning. Energy Eng. 2024;121(6):1671–95. doi:10.32604/ee.2024.047794.
14. Li P, Wei M, Ji H, Xi W, Yu H, Wu J, et al. Deep reinforcement learning-based adaptive voltage control of active distribution networks with multi-terminal soft open point. Int J Electr Power Energy Syst. 2022;141(2):108138. doi:10.1016/j.ijepes.2022.108138.
15. Lyu X, Baisero A, Hao Y, Daley B, Amato C. On centralized critics in multi-agent reinforcement learning. J Artif Intell Res. 2023;77:295–354. doi:10.1613/jair.1.14386.
16. Li L, Wang J, Li W, Peng Q, Chen X, Li S. Decentralized decision for multi-band sensing: a deep reinforcement learning approach. IEEE Wirel Commun Lett. 2021;10(12):2674–7. doi:10.1109/LWC.2021.3111750.
17. Cao D, Zhao J, Hu W, Yu N, Ding F, Huang Q, et al. Deep reinforcement learning enabled physical-model-free two-timescale voltage control method for active distribution systems. IEEE Trans Smart Grid. 2022;13(1):149–65. doi:10.1109/TSG.2021.3113085.
18. Rayati M, Bozorg M, Carpita M, Cherkaoui R. Stochastic optimization and Markov chain-based scenario generation for exploiting the underlying flexibilities of an active distribution network. Sustainable Energy Grids Networks. 2023;34(2):100999. doi:10.1016/j.segan.2023.100999.
19. Al-Lawati RA, Faiz TI, Noor-E-Alam M. A nationwide multi-location multi-resource stochastic programming based energy planning framework. Energy. 2024;295:130898. doi:10.1016/j.energy.2024.130898.
20. Mohseni S, Pishvaee MS. Energy trading and scheduling in networked microgrids using fuzzy bargaining game theory and distributionally robust optimization. Appl Energy. 2023;350:121748. doi:10.1016/j.apenergy.2023.121748.
21. Nammouchi A, Aupke P, D’Andreagiovanni F, Ghazzai H, Theocharis A, Kassler A. Robust opportunistic optimal energy management of a mixed microgrid under asymmetrical uncertainties. Sustainable Energy Grids Networks. 2023;36(4):101184. doi:10.1016/j.segan.2023.101184.
22. Li C, Li Y, Peng S, Wang P, Ge Q, Song L, et al. A two-stage adaptive-robust optimization model for active distribution network with high penetration wind power generation. IET Renew Power Gener. 2024;18(7):1204–17. doi:10.1049/rpg2.12836.
23. Esteban-Pérez A, Morales JM. Distributionally robust optimal power flow with contextual information. Eur J Oper Res. 2023;306(3):1047–58. doi:10.1016/j.ejor.2022.10.024.
24. Jiang C, Zhang QF, Han X, Li D, Liu J. An interval optimization method considering the dependence between uncertain parameters. Comp Model Eng Sci. 2011;74(1):65–82. doi:10.3970/cmes.2011.074.065.
25. Xu H, Chang Y, Zhao Y, Wang F. A new multi-timescale optimal scheduling model considering wind power uncertainty and demand response. Int J Electr Power Energy Syst. 2023;147(2):108832. doi:10.1016/j.ijepes.2022.108832.
26. Li Q, Zhao N. A probability box representation method for power flow analysis considering both interval and probabilistic uncertainties. Int J Electr Power Energy Syst. 2022;142(1):108371. doi:10.1016/j.ijepes.2022.108371.
27. Tan Q, Mei S, Dai M, Zhou L, Wei Y, Ju L. A multi-objective optimization dispatching and adaptability analysis model for wind-PV-thermal-coordinated operations considering comprehensive forecasting error distribution. J Clean Prod. 2020;256(3):120407. doi:10.1016/j.jclepro.2020.120407.
28. Wang MQ, Gooi HB. Spinning reserve estimation in microgrids. IEEE Trans Power Syst. 2011;26(3):1164–74. doi:10.1109/TPWRS.2010.2100414.
29. Jin J, Xu Y. Optimal policy characterization enhanced actor-critic approach for electric vehicle charging scheduling in a power distribution network. IEEE Trans Smart Grid. 2020;12(2):1416–28. doi:10.1109/TSG.2020.3028470.
30. Zhao Z, Zhang Y, Wang S, Zhang F, Zhang M, Chen W. QDAP: downsizing adaptive policy for cooperative multi-agent reinforcement learning. Knowl Based Syst. 2024;294(1):111719. doi:10.1016/j.knosys.2024.111719.
31. Deng L, Huan J, Wang W, Zhang W, Xie L, Dong L, et al. Market operation of energy storage system in smart grid: a review. Energy Eng. 2024;121(6):1403–37. doi:10.32604/ee.2024.046393.
32. Qiu S, Deng Y, Ding M, Han W. An optimal scheduling method for distribution network clusters considering source-load–storage synergy. Sustainability. 2024;16(15):6399. doi:10.3390/su16156399.
33. Lu Y, Xiang Y, Huang Y, Yu B, Weng L, Liu J. Deep reinforcement learning based optimal scheduling of active distribution system considering distributed generation, energy storage and flexible load. Energy. 2023;271(4):127087. doi:10.1016/j.energy.2023.127087.
34. Hou J, Yu W, Xu Z, Ge Q, Li Z, Meng Y. Multi-time scale optimization scheduling of microgrid considering source and load uncertainty. Elect Power Syst Res. 2023;216(32):109037. doi:10.1016/j.epsr.2022.109037.
35. Shan X, Xue F. A day-ahead economic dispatch scheme for transmission system with high penetration of renewable energy. IEEE Access. 2022;10(12):11159–72. doi:10.1109/ACCESS.2022.3145973.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.