Open Access

ARTICLE

Heterogeneous Computing Power Scheduling Method Based on Distributed Deep Reinforcement Learning in Cloud-Edge-End Environments

Jinwei Mao1,2, Wang Luo1,2,*, Jiangtao Xu3, Daohua Zhu3, Wei Liang3, Zhechen Huang3, Bao Feng1,2, Shuang Yang1,2
1 State Grid Electric Power Research Institute Co., Ltd., Nanjing, China
2 Nanjing Nari Information & Communication Technology Co., Ltd., Nanjing, China
3 State Grid Jiangsu Electric Power Co., Ltd. Research Institute, Nanjing, China
* Corresponding Author: Wang Luo

Computers, Materials & Continua https://doi.org/10.32604/cmc.2026.072505

Received 28 August 2025; Accepted 15 December 2025; Published online 18 February 2026

Abstract

With the rapid development of power Internet of Things (IoT) scenarios such as smart factories and smart homes, numerous intelligent terminal devices and real-time interactive applications impose higher demands on computing latency and resource supply efficiency. Multi-access edge computing deploys cloud computing capabilities at the network edge, constructs distributed computing nodes and multi-access systems, and provides infrastructure support for low-latency, high-reliability services. Existing research, however, relies on the strong assumption that the environmental state is fully observable and does not adequately account for the continuous, time-varying fluctuations of edge server load, so the resulting models adapt poorly to heterogeneous dynamic environments. This paper therefore establishes an end-edge collaborative task offloading framework based on a partially observable Markov decision process (POMDP) and proposes an end-edge collaborative task offloading method for heterogeneous scenarios. The method models the historical load characteristics of edge servers as a time series, giving the agent load awareness under dynamic environmental states. Moreover, by dynamically assessing the exploration value of historical trajectories in the central trajectory pool and adjusting the sample weight distribution accordingly, it achieves directed exploration and policy optimization over high-value trajectories. Experimental results indicate that the proposed method shows clear advantages over existing methods in average delay and task failure rate and remains robust in a dynamic environment.

Keywords

Edge computing; end-edge collaboration; heterogeneous computing power scheduling; resource allocation