
Open Access

ARTICLE

Intelligent Operation Strategies for PVT-ASHP Heating and Hot Water Systems in Industrial Parks Based on Reinforcement Learning

Yingjie Su1, Yubin Qiu2, Zhuojun Dong1, Jiying Liu2,*, Bo Gao1,3,*
1 Building Energy and Environment Research Institute, Sichuan Institute of Building Research, Chengdu, 610084, China
2 School of Thermal Engineering, Shandong Jianzhu University, Jinan, 250101, China
3 School of Mechanical Engineering, Southwest Jiaotong University, Chengdu, 610030, China
* Corresponding Author: Jiying Liu. Email: email; Bo Gao. Email: email

Energy Engineering https://doi.org/10.32604/ee.2025.074454

Received 11 October 2025; Accepted 17 December 2025; Published online 04 January 2026

Abstract

In response to the high energy consumption, large load fluctuations, and poor adaptability of conventional control strategies in industrial-park heating and hot water systems, this paper takes a 15,000 m² factory office building in Jinan as its study object and develops a photovoltaic-thermal integrated air-source heat pump (PVT-ASHP) system. The system achieves cost reduction and efficiency gains through the joint optimization of hardware parameters and intelligent operational control, with particular emphasis on the effectiveness of the operational strategies. The study first employs the Hooke-Jeeves algorithm to optimize key hardware parameters for minimum annual cost: the PVT collector area is reduced from 931 to 799 m², the PVT tilt angle is adjusted from 36° to 43°, and the storage tank volume is modified. This establishes a low-energy baseline and reduces the initial PVT equipment investment by approximately 14.2%; the PVT photovoltaic efficiency stabilizes at 14%, while the solar thermal efficiency fluctuates around 33%. The core operational strategy uses a reinforcement learning algorithm based on a Deep Q-Network (DQN). Its design incorporates two variables, PVT electricity generation (PVTd) and PVT heat supply (PVTh), into the state space, overcoming the limitation of conventional control that relies solely on load and outdoor temperature and enabling dynamic matching between energy production and load demand. The reward function applies dynamic weighting to energy consumption and comfort, with the energy-consumption weight set to 0.9 and the comfort weight to 0.1. Based on the factory's office hours (8:00–18:00 as high load, non-office hours as low load), an hourly load input mechanism is designed to remove the control deviations caused by an average-load assumption. Simulation results show that, compared with a conventional fixed-temperature control strategy at 60°C, the proposed DQN operation strategy achieves energy savings of about 2.99%. During office hours, the system maintains a stable supply water temperature of 57°C, meeting comfort requirements while avoiding energy waste. Combining parameter optimization with the operational control strategy reduces the annual operating costs of the system by 9.43% while significantly improving overall energy efficiency. This work demonstrates that the proposed DQN operation strategy, tailored to the load characteristics of factory campuses, plays an important role in improving system performance. Based on the principles of dynamic perception, precise matching, and demand-driven regulation, it provides a reference framework for designing similar systems and for the efficient operation of distributed energy systems in factory campus-type buildings.
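For readers unfamiliar with the pattern search used in the hardware-sizing step, the sketch below is a minimal, generic Hooke-Jeeves implementation, not code from the paper. In the study the objective is the annual cost evaluated through TRNSYS/GenOpt simulations over PVT area, tilt angle, and tank volume; here a toy quadratic stands in for that cost model, and all identifiers are illustrative assumptions.

```python
def hooke_jeeves(f, x0, step, tol=1e-4, shrink=0.5):
    """Minimal Hooke-Jeeves pattern search: exploratory moves around a base
    point, a pattern (extrapolation) move when they succeed, and step
    shrinking when they fail."""
    def explore(point):
        x = list(point)
        for i in range(len(x)):
            for d in (step[i], -step[i]):
                trial = list(x)
                trial[i] += d
                if f(trial) < f(x):
                    x = trial
                    break
        return x

    base = list(x0)
    step = list(step)
    while max(step) > tol:
        x = explore(base)
        if f(x) < f(base):
            # Pattern move: extrapolate along the improving direction,
            # then explore around the extrapolated point.
            candidate = explore([2 * xi - bi for xi, bi in zip(x, base)])
            base = candidate if f(candidate) < f(x) else x
        else:
            # No exploratory move improved: refine the mesh.
            step = [s * shrink for s in step]
    return base

# Toy stand-in for the annual-cost model; converges to (3, -1).
best = hooke_jeeves(lambda x: (x[0] - 3) ** 2 + (x[1] + 1) ** 2,
                    x0=[0.0, 0.0], step=[1.0, 1.0])
```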
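Similarly, the following is a hedged sketch of the DQN state vector and reward described in the abstract. Only the 0.9/0.1 weights, the PVTd/PVTh state variables, the 57°C office-hours supply temperature, and the 8:00–18:00 high-load window come from the abstract; the function names, units, and exact penalty forms are assumptions for illustration.

```python
import numpy as np

W_ENERGY, W_COMFORT = 0.9, 0.1   # reward weights stated in the abstract
T_COMFORT = 57.0                 # supply-water temperature during office hours (°C)

def build_state(load_kw, t_outdoor, pvt_d_kw, pvt_h_kw, hour):
    """DQN state: hourly load and outdoor temperature (as in conventional
    control), extended with PVT electricity generation (PVTd) and PVT heat
    supply (PVTh), plus an office-hours flag (8:00-18:00 = high load)."""
    office = 1.0 if 8 <= hour < 18 else 0.0
    return np.array([load_kw, t_outdoor, pvt_d_kw, pvt_h_kw, office],
                    dtype=np.float32)

def reward(energy_kwh, t_supply):
    """Weighted penalty on energy use and on deviation from the comfort
    temperature; the agent maximizes this, i.e., minimizes the weighted sum."""
    return -(W_ENERGY * energy_kwh + W_COMFORT * abs(t_supply - T_COMFORT))

# Example: a high-load office hour with nonzero PVT output.
s = build_state(load_kw=120.0, t_outdoor=-2.0, pvt_d_kw=35.0,
                pvt_h_kw=80.0, hour=10)
r = reward(energy_kwh=45.0, t_supply=57.5)
```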

Keywords

Transient system simulation program; Deep Q-Network; Hooke-Jeeves algorithm; generic optimization program; heat supply