Open Access
ARTICLE
Enhanced Practical Byzantine Fault Tolerance for Service Function Chain Deployment: Advancing Big Data Intelligence in Control Systems
1 Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, 266580, China
2 Shandong Key Laboratory of Intelligent Oil & Gas Industrial Software, Qingdao, 266580, China
3 Library of Shanghai Lixin University of Accounting and Finance, Shanghai, 201209, China
4 Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan, 250014, China
5 Shandong Provincial Key Laboratory of Computing Power Internet and Service Computing, Shandong Fundamental Research Center for Computer Science, Jinan, 250014, China
6 Key Laboratory of Ethnic Language Intelligent Analysis and Security Governance of MOE, Minzu University of China, Beijing, 100081, China
7 Key Laboratory of Intelligent Game, Yangtze River Delta Research Institute of NPU, Taicang, 215400, China
8 Key Laboratory of Education Informatization for Nationalities (Yunnan Normal University), Ministry of Education, Kunming, 650092, China
* Corresponding Author: Peiying Zhang. Email:
(This article belongs to the Special Issue: Big Data and Artificial Intelligence in Control and Information System)
Computers, Materials & Continua 2025, 83(3), 4393-4409. https://doi.org/10.32604/cmc.2025.064654
Received 20 February 2025; Accepted 03 April 2025; Issue published 19 May 2025
Abstract
As Internet of Things (IoT) technologies continue to evolve at an unprecedented pace, intelligent big data control and information systems have become critical enablers for organizational digital transformation, facilitating data-driven decision making, fostering innovation ecosystems, and maintaining operational stability. In this study, we propose an advanced deployment algorithm for Service Function Chaining (SFC) that leverages an enhanced Practical Byzantine Fault Tolerance (PBFT) mechanism. The main goal is to tackle the issues of security and resource efficiency in SFC implementation across diverse network settings. By integrating blockchain technology and Deep Reinforcement Learning (DRL), our algorithm not only optimizes resource utilization and quality of service but also ensures robust security during SFC deployment. Specifically, the enhanced PBFT consensus mechanism (VRPBFT) significantly reduces consensus latency and improves Byzantine node detection through the introduction of a Verifiable Random Function (VRF) and a node reputation grading model. Experimental results demonstrate that compared to traditional PBFT, the proposed VRPBFT algorithm reduces consensus latency by approximately 30% and decreases the proportion of Byzantine nodes by 40% after 100 rounds of consensus. Furthermore, the DRL-based SFC deployment algorithm (SDRL) exhibits rapid convergence during training, with improvements in long-term average revenue, request acceptance rate, and revenue/cost ratio of 17%, 14.49%, and 20.35%, respectively, over existing algorithms. Additionally, the CPU resource utilization of the SDRL algorithm reaches up to 42%, which is 27.96% higher than other algorithms. These findings indicate that the proposed algorithm substantially enhances resource utilization efficiency, service quality, and security in SFC deployment.
With the swift evolution of Internet of Things (IoT) technology, device interconnectivity within heterogeneous network environments has markedly expanded. This expansion has facilitated diverse application scenarios such as intelligent homes and autonomous driving systems. Nevertheless, this highly interconnected environment also presents numerous security challenges [1], including breaches in data privacy, issues related to delegated trust, and risks during the deployment of Service Function Chaining (SFC) [2]. SFC is a technique that combines multiple network functions [3] in a specific sequence to cater to the unique requirements of various users. It offers tailored network services like traffic monitoring [4], firewall protection, and intrusion detection by organizing a series of virtual network functions (VNFs) [5] in a defined order. In 5G and IoT settings, SFCs allow for flexible configuration of network functions, enhancing efficient data processing and transmission [6]. However, dynamic challenges such as node failures, link congestion, and network attacks threaten the reliability of SFCs. Node failures can disrupt SFC operations, while malicious actors may tamper with requests, simulate nodes, or initiate denial-of-service (DoS) attacks [7], thereby jeopardizing network security, communication quality, and user experience.
Choosing optimal paths and nodes for deploying SFCs in intricate network topologies to meet quality of service (QoS) criteria, such as minimal latency, substantial bandwidth [8], and high reliability [9], constitutes a complex optimization challenge.
In response to these challenges, this study introduces a trusted SFC deployment algorithm based on an enhanced Practical Byzantine Fault Tolerance mechanism (VRPBFT) [10], which optimizes the traditional PBFT consensus mechanism by combining a verifiable random function (VRF) with a node reputation rating model. A VRF is a cryptographic function that lets a node generate a pseudorandom output with its private key, together with a proof that other nodes can check using the node’s public key [11]. In VRPBFT, the VRF is used for random number generation and verification. The algorithm integrates blockchain and deep reinforcement learning to construct a trustworthy SFC orchestration system. First, VRPBFT is designed, incorporating the VRF and a node reputation model to reduce consensus latency and improve Byzantine node detection [12]. Second, a DRL-based SFC deployment algorithm (SDRL) optimizes deployment by dynamically adjusting node trustworthiness. The improved PBFT reaches consensus despite node failures or malicious behavior, enhancing system reliability. Experimental results show significant improvements in resource utilization, service quality, and security: the algorithm outperforms existing algorithms in long-term revenue, request acceptance rate, revenue/cost ratio, and CPU utilization, offering an efficient and secure SFC deployment solution for heterogeneous networks [13].
This paper establishes a trusted network system centered on users, featuring heterogeneous physical nodes, blockchain, and deep reinforcement learning (DRL). DRL is a robust machine learning methodology that acquires decision-making capabilities through interaction with its environment, rendering it particularly suitable for dynamic and uncertain security contexts. DRL can enhance the performance of intrusion detection systems by facilitating adaptive learning, enabling dynamic decision-making, and minimizing false positives and false negatives. Blockchain and DRL form the system’s core, with blockchain managing resource and transaction registration and updates during deployment [14]. The trusted network system is shown in Fig. 1.
Figure 1: Trusted network system
Before service provisioning, information about all physical nodes is registered on the blockchain [15], ensuring secure VNF node deployment through secure data storage.
During SFC deployment, the blockchain module authenticates users and checks permissions while monitoring the resource and location data of physical nodes [16]. This information is passed to the DRL optimization module to determine the best SFC deployment strategy [17]. The optimization steps are shown in Fig. 2.
Figure 2: Working principle of trusted network layer
The physical network model of the heterogeneous network is depicted as a weighted undirected graph
The operational probability P of physical nodes and links adheres to a uniform distribution within the interval
Service function chain deployment needs to follow certain constraints, including resource constraints for nodes and links and security level constraints. The specific constraint formulas are as follows:
where
2.3 Algorithm Evaluation Metrics
The evaluation metrics comprise consensus delay, long-term average revenue, long-term average revenue-to-cost ratio, service function chain (SFC) request acceptance rate, and CPU resource utilization [18]. The consensus delay is calculated by the formula:
Benefits and costs are calculated as follows:
The long-term average benefit-cost ratio is calculated by the formula:
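A minimal sketch of how such metrics can be computed is given here. It assumes definitions that are common in the virtual network embedding literature rather than the exact formulas of this paper: the revenue of an accepted request is the sum of its CPU and bandwidth demands, and its cost additionally weights each bandwidth demand by the number of physical hops used. The names `Request`, `revenue`, `cost`, and `summarize` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Request:
    cpu: List[float]   # CPU demand of each VNF
    bw: List[float]    # bandwidth demand of each virtual link
    hops: List[int]    # physical hops used by each virtual link
    accepted: bool     # whether the request was deployed


def revenue(r: Request) -> float:
    # Assumed definition: resources requested by the SFC.
    return sum(r.cpu) + sum(r.bw)


def cost(r: Request) -> float:
    # Assumed definition: resources consumed in the physical network.
    return sum(r.cpu) + sum(b * h for b, h in zip(r.bw, r.hops))


def summarize(requests: List[Request], horizon: float) -> Dict[str, float]:
    accepted = [r for r in requests if r.accepted]
    total_rev = sum(revenue(r) for r in accepted)
    total_cost = sum(cost(r) for r in accepted)
    return {
        "avg_revenue": total_rev / horizon,
        "rev_cost_ratio": total_rev / total_cost if total_cost else 0.0,
        "acceptance_rate": len(accepted) / len(requests) if requests else 0.0,
    }
```

Under these assumed definitions, the revenue/cost ratio equals 1 only when every virtual link is mapped onto a single physical hop, which is why shorter link mappings raise it.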
3 Service Function Chain Trusted Deployment Algorithm
To improve the security of heterogeneous networks, this paper adopts the Practical Byzantine Fault Tolerance (PBFT) consensus mechanism used in blockchains and combines it with a verifiable random function (VRF) and a reputation hierarchy model to design a more efficient and reliable consensus mechanism, VRPBFT. Node reputation values are integrated so that the dynamic trustworthiness of the nodes is taken into account when extracting the attributes of the physical network.
3.1 Definition of Node Trustworthiness
The reputation of a node, denoted as
where
A node’s credibility is a comprehensive evaluation of its state, performance, and dynamic behavior, determining its reliability and aiding network cooperation and decision-making. Nodes with high investment costs and communication success rates are highly credible, while those with low input costs or honesty rates are less credible. The node credibility evaluation algorithm is detailed in Algorithm 1.
3.2 Node Hierarchy and Transfer
The nodes are categorized into A, B, C and D by the node partitioning mechanism as shown in Fig. 3. Level A nodes, with the highest reputation value (
Figure 3: Schematic diagram of node classification
The classification of each node changes dynamically. Both class A and class B nodes are eligible to serve as master nodes, whereas class C nodes can only participate in the consensus process. Class D nodes cannot participate in consensus because of their low reputation value. This dynamically changing hierarchy helps to ensure secure, stable, and efficient operation of the network. The consensus node set consists of class A, B, and C nodes. The master node set, i.e., the honest node group, contains class A and B nodes. The authority of nodes of different classes is shown in Table 2.
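The class permissions described above can be sketched as follows. The reputation thresholds in `THRESHOLDS` are purely hypothetical placeholders (the paper’s actual boundaries are not reproduced here); only the mapping from class to permission follows the text.

```python
from enum import Enum


class Level(Enum):
    A = "A"
    B = "B"
    C = "C"
    D = "D"


# Hypothetical lower bounds on reputation for each class, highest first.
THRESHOLDS = ((0.8, Level.A), (0.6, Level.B), (0.4, Level.C))


def classify(reputation: float) -> Level:
    """Map a reputation value to a class; below every bound means class D."""
    for bound, level in THRESHOLDS:
        if reputation >= bound:
            return level
    return Level.D


def can_be_master(level: Level) -> bool:
    # Only class A and B nodes belong to the master (honest) node set.
    return level in (Level.A, Level.B)


def can_join_consensus(level: Level) -> bool:
    # Class A, B, and C nodes form the consensus node set; class D is excluded.
    return level in (Level.A, Level.B, Level.C)
```

Because the class is recomputed from the current reputation value, a node moves between classes as its reputation changes, which is the dynamic behavior the hierarchy relies on.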
The nodes in class A and B are entitled to participate in the selection of master nodes to form an honest node cluster. The master node selection process comprises the following steps:
• VRF Key Pair Generation: Each honest node generates a VRF key pair.
• Public Key Distribution: Nodes share their VRF public key for random number verification, keeping the private key secure.
• Random Number Generation: Nodes generate a random number using the VRF private key and current timestamp.
• Random Number Broadcast: Nodes broadcast the random number, VRF public key, and timestamp to the network.
• Random Number Verification: Other nodes validate the random number using the broadcasted VRF public key and timestamp.
• Random Number Sorting: The network collects and sorts all random numbers by size.
• Master Node Selection: The node with the smallest random number is designated as the master node. In case of a tie, the node with the highest reputation value is chosen.
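The sorting and tie-breaking logic of these steps can be sketched as below. Python’s standard library has no VRF, so the sketch substitutes HMAC-SHA256 as a keyed pseudorandom function; this stand-in is an assumption and is not publicly verifiable the way a real VRF output and proof are. The names `Candidate`, `pseudo_vrf`, and `select_master` are hypothetical.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    node_id: str
    secret_key: bytes   # stand-in for the node's VRF private key
    reputation: float


def pseudo_vrf(secret_key: bytes, timestamp: int) -> int:
    """Keyed hash of the timestamp, read as an integer (NOT a real VRF)."""
    message = str(timestamp).encode()
    digest = hmac.new(secret_key, message, hashlib.sha256).digest()
    return int.from_bytes(digest, "big")


def select_master(candidates: List[Candidate], timestamp: int) -> Candidate:
    # Smallest random number wins; ties go to the higher reputation.
    return min(
        candidates,
        key=lambda c: (pseudo_vrf(c.secret_key, timestamp), -c.reputation),
    )
```

Only class A and B nodes would be passed in as `candidates`. In a real deployment each node would also publish a proof so that peers can verify its number with the public key alone.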
The training procedure for the VRPBFT-based SFC deployment algorithm (SDRL) proposed in this paper is shown in Algorithm 2.
In this study, the learning agent is responsible for determining the deployment strategy for each virtual network function (VNF). After completing the deployment of all VNFs, link deployment is realized through the shortest path algorithm [19]. This algorithm increases the redundancy and fault tolerance of the system and improves the stability of the virtual network by mining multiple shortest paths from the underlying physical network to meet the link connectivity requirements. The available resources in the underlying physical network are updated after the links are successfully deployed [20]. Algorithm 3 describes the specific steps of link deployment in detail.
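As a simplified illustration of the link deployment step, the sketch below finds one hop-count shortest path that uses only physical links with enough residual bandwidth, then deducts the demand along that path. It is a single-path breadth-first search, not the multiple-shortest-path procedure of Algorithm 3, and the adjacency format and function names are assumptions.

```python
from collections import deque
from typing import Dict, List, Optional

# adj[u][v] is the available bandwidth of the undirected link (u, v).
Graph = Dict[str, Dict[str, float]]


def feasible_shortest_path(adj: Graph, src: str, dst: str,
                           demand: float) -> Optional[List[str]]:
    """BFS over links whose available bandwidth covers the demand."""
    parent: Dict[str, Optional[str]] = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            node: Optional[str] = u
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for v, bw in adj[u].items():
            if v not in parent and bw >= demand:
                parent[v] = u
                queue.append(v)
    return None


def deploy_link(adj: Graph, src: str, dst: str,
                demand: float) -> Optional[List[str]]:
    """Map one virtual link and update the residual bandwidth."""
    path = feasible_shortest_path(adj, src, dst, demand)
    if path is None:
        return None
    for u, v in zip(path, path[1:]):
        adj[u][v] -= demand
        adj[v][u] -= demand
    return path
```

If no feasible path exists the function returns `None` and leaves the graph untouched, so the caller can reject the request without rolling anything back.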
4 Experiments and Analysis of Results
The proposed algorithm’s effectiveness is demonstrated through comprehensive simulation analysis. Experimental findings reveal that the enhanced PBFT consensus mechanism achieves notable improvements in both consensus delay and security, effectively addressing Byzantine nodes and reducing the proportion of faulty nodes within the network. Moreover, the SDRL algorithm [21] outperforms baseline methods, with enhancements in long-term average revenue, request acceptance rate, revenue-to-cost ratio, and CPU resource utilization by 17%, 14.49%, 13.8%, and 27.96%, respectively. These results confirm that the proposed algorithm not only strengthens security but also optimizes network resource utilization efficiency, ensuring reliable and efficient SFC deployment under dynamic load conditions.
4.1 Experimental Environment Setup
A heterogeneous network topology comprising 100 physical nodes and 530 physical links was generated using the NetworkX tool [22] to simulate medium-sized ISP resources. Physical nodes are classified into X, Y, and Z categories, differing in CPU resources, memory resources, energy resources, bandwidth resources, and security levels. Concurrently, 2000 service function chain requests were generated, with 1000 allocated for training and 1000 for testing. The blockchain network was implemented via Hyperledger Fabric. Specific settings are outlined in Table 3.
Additionally, two sets of 1000 Service Function Chain Requests (SFCRs) were generated using the same setup and used as the training and test datasets. The number of nodes per SFCR is uniformly distributed between 2 and 10, and the probability of each node being directly connected to another node is 0.4. SFCR arrivals follow a Poisson distribution, with an average of 4 SFCRs arriving every 100 time units. The CPU demand of each VNF is uniformly distributed between 3 and 50 cores, and the required security level is distributed across [A, B, C, D]. Security level requirements are distributed between [A, B, C] and [D]. Attribute settings for service function chain requests are presented in Table 4.
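A request generator matching these settings might look like the sketch below. It is an assumption-laden illustration: the security level is drawn uniformly from the four levels (the text does not state the distribution), virtual links are omitted, and the names `SFCR` and `generate_sfcrs` are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class SFCR:
    arrival_time: float
    cpu_demands: List[int]   # one entry per VNF, in cores
    security_level: str


def generate_sfcrs(count: int, seed: int = 0,
                   rate: float = 4 / 100) -> List[SFCR]:
    """Generate requests with Poisson arrivals (4 per 100 time units)."""
    rng = random.Random(seed)
    now = 0.0
    requests = []
    for _ in range(count):
        # Exponential inter-arrival gaps yield a Poisson arrival process.
        now += rng.expovariate(rate)
        n_vnfs = rng.randint(2, 10)                        # inclusive bounds
        cpu = [rng.randint(3, 50) for _ in range(n_vnfs)]
        level = rng.choice(["A", "B", "C", "D"])           # assumed uniform
        requests.append(SFCR(now, cpu, level))
    return requests
```

Calling `generate_sfcrs(1000, seed=1)` and `generate_sfcrs(1000, seed=2)` would give two independent sets, mirroring the separate training and test datasets.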
The VRPBFT algorithm exhibits reduced consensus latency when the number of consensus nodes remains constant, with this benefit becoming more significant as the number of nodes grows. Moreover, the algorithm excels in identifying and eliminating Byzantine nodes. After approximately 100 rounds of consensus, the proportion of faulty nodes within the network is substantially diminished, thereby enhancing the system’s security.
As shown in Fig. 4, the VRPBFT algorithm exhibits lower consensus latency than traditional methods, with its efficiency advantage growing as the number of consensus nodes increases. This improvement stems from VRPBFT reducing the nodes involved in consensus, thereby decreasing overall delay. Additionally, VRPBFT significantly reduces Byzantine nodes after about 100 consensus rounds, outperforming PBFT in node selection and fault detection. By integrating VRF and node classification, VRPBFT enhances Byzantine node detection, isolation, and overall system security.
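To see why shrinking the consensus set lowers delay, the sketch below counts messages in the normal-case three-phase exchange of classical PBFT (pre-prepare, prepare, commit) among n replicas. This is a simplified textbook model, not a measurement from the paper: client request and reply messages are ignored, and the function names are hypothetical.

```python
def max_faulty(n: int) -> int:
    """Largest number of Byzantine replicas PBFT tolerates with n replicas."""
    return (n - 1) // 3


def quorum(n: int) -> int:
    """Matching messages a replica needs before it proceeds (2f + 1)."""
    return 2 * max_faulty(n) + 1


def pbft_messages(n: int) -> int:
    """Messages in one normal-case round, excluding client traffic."""
    pre_prepare = n - 1        # primary sends to every backup
    prepare = (n - 1) ** 2     # each backup multicasts to the other replicas
    commit = n * (n - 1)       # every replica multicasts to the others
    return pre_prepare + prepare + commit


# Example: excluding 30 low-reputation nodes from a 100-node network.
saving = 1 - pbft_messages(70) / pbft_messages(100)
```

The total simplifies to 2n(n - 1), so message volume grows quadratically with the number of consensus nodes, and removing low-reputation nodes from the consensus set cuts traffic more than proportionally.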
Figure 4: Consensus latency
Similar to long-term average gains, the SDRL algorithm [21] achieves optimal performance in request acceptance rate, as shown in Figs. 4 and 5, outperforming other algorithms by up to 14.49%. However, physical network resources limit the number of service function chain requests that can be supported.
Figure 5: Comparison of Byzantine node counts
During training, the policy network’s parameters are optimized starting from random initialization, as depicted in Fig. 6. The figure tracks four critical metrics for the SDRL algorithm: long-term average gain, gain-to-cost ratio, request acceptance rate, and CPU resource utilization [23]. These metrics increase over training iterations before eventually stabilizing [24]. In the early stages of training, agent performance is generally suboptimal because the initial parameters are random [25]. Performance then improves progressively as training advances, although the achievable improvement is bounded by the algorithm’s design. The SDRL algorithm converges rapidly, improving the quality of SFC deployment [26].
To further validate the algorithm’s efficacy, this section introduces the RL algorithm from [27] and the SA-RL algorithm from [28] as comparison benchmarks. The RL algorithm provides a decision-making framework that is particularly suited to resource allocation and network optimization problems. The SA-RL algorithm addresses both resource optimization and network security, aligning closely with the objectives of the SDRL algorithm proposed in this paper.
Fluctuations in performance metrics are common during DRL training. First, the instability may stem from the algorithm’s sensitivity to initial parameters, which leads to an imbalance between exploration and exploitation. Second, insufficient training iterations can prevent the algorithm from converging. Third, suboptimal hyperparameter selection can cause fluctuations in training. Fourth, dynamic changes in the environment require continuous adaptation by the algorithm, which may also contribute to metric variability. Lastly, an inadequately designed reward function can produce unstable rewards in certain states, thereby affecting the learning process.
Figure 6: Changes of the four metrics throughout the training process of the SDRL algorithm
To address these challenges, we propose increasing the number of training iterations to ensure sufficient convergence. Hyperparameters will be optimized using methods such as grid search or random search to identify more suitable combinations. Algorithmic stability will be enhanced by employing more robust variants or incorporating regularization terms to mitigate overfitting. As can be seen in Fig. 7, the SDRL algorithm obtains the highest long-term average returns, followed by SA-RL. The SDRL algorithm surpasses the other two algorithms by 17% and 19.86% in long-term average returns.
Figure 7: Comparison of changes in long-term average revenue for different algorithms
Therefore, both the long-term average gain and the service function chain request acceptance rate decay over time and eventually stabilize, as shown in Fig. 8.
Figure 8: Comparison of changes in SFC request acceptance rates for different algorithms
As illustrated in Fig. 9, the long-term average benefit-to-cost ratio follows a stable trend, reflecting the profitability of the algorithm. This metric depends on the deployment scheme’s efficiency in using fewer physical links, independent of factors like network resources, security, and latency. The SDRL algorithm selects cost-effective deployment schemes, improving the benefit/cost ratio by 13.8% and 20.35% over the other algorithms.
Figure 9: Comparison of changes in long-term average revenue/cost ratios for different algorithms
In Fig. 10, the CPU resource utilization of SDRL ends up above 42%. The RL algorithm, based on reinforcement learning, neglects security constraints, while SA-RL considers node security but ignores dynamic trustworthiness changes, leading to resource wastage. These algorithms underperform due to insufficient consideration of constraints and dynamic changes. SDRL, by extracting dynamic trustworthiness and learning network attribute relationships, achieves more efficient deployment and higher resource utilization. In summary, the algorithm proposed in this paper demonstrates greater stability. Compared to other algorithms, it achieves up to a 27.96% improvement in CPU resource utilization.
Figure 10: Comparison of changes in CPU resource utilization for different algorithms
Service function chain deployment scheme performance: The SDRL algorithm performs effectively across several metrics, including long-term average gain, request acceptance rate, benefit-to-cost ratio, and CPU resource utilization. Relative to comparison algorithms, it leads by 17%–19.86% in long-term average revenue, increases the request acceptance rate by up to 14.49%, enhances the benefit-to-cost ratio by 13.8%–20.35%, and achieves a final CPU resource utilization rate exceeding 42%, with an improvement of up to 27.96%.
This paper introduces VRPBFT, an enhanced PBFT consensus mechanism that integrates a Verifiable Random Function (VRF) and a node reputation level model. This mechanism effectively reduces consensus delay and improves the efficiency of Byzantine node detection, decreasing the proportion of faulty nodes by 40% after 100 rounds of consensus and reducing consensus delay by approximately 30% compared to traditional PBFT. Additionally, SDRL, a dynamic SFC deployment algorithm based on deep reinforcement learning, is designed to optimize resource allocation and enhance service quality through dynamic adjustment of node trust. It significantly outperforms existing algorithms on key performance indicators, including a 17%–19.86% increase in long-term average revenue and an improvement of up to 14.49% in request acceptance rate. The algorithm combines the reliability of blockchain with the intelligent decision-making capability of DRL. It performs well in security, resource utilization, and service quality, and it can deploy SFCs efficiently and reliably under dynamic load, providing an effective solution for SFC deployment in heterogeneous networks. Future work will focus on further optimizing the performance of the algorithm, expanding its application scenarios, and strengthening security to cope with new threats.
Acknowledgement: Thanks to the anonymous reviewers and editors for their hard work.
Funding Statement: This work is partially supported by the National Natural Science Foundation of China under Grant 62471493 and 62402257, partially supported by the Natural Science Foundation of Shandong Province under Grant ZR2023LZH017, ZR2024MF066 and 2023QF025, partially supported by the Open Research Subject of State Key Laboratory of Intelligent Game (No. ZBKF-24-12), partially supported by the Foundation of Key Laboratory of Education Informatization for Nationalities (Yunnan Normal University), the Ministry of Education (No. EIN2024C006), and partially supported by the Key Laboratory of Ethnic Language Intelligent Analysis and Security Governance of MOE (No. 202306).
Author Contributions: The authors confirm their contribution to the paper as follows: Conceptualization and Design: Peiying Zhang, Jing Liu, Chong Lv; Methodology: Peiying Zhang; Software: Peiying Zhang, Lizhuang Tan; Investigation: Yihong Yu; Data Curation: Yihong Yu, Lizhuang Tan; Funding Acquisition: Peiying Zhang, Jing Liu, Yulin Zhang; Project Administration: Peiying Zhang, Jing Liu; Writing—Original Draft: Yihong Yu, Peiying Zhang, Chong Lv; Writing—Review & Editing: Yihong Yu, Jing Liu; Supervision: Chong Lv, Yulin Zhang. All authors reviewed the results and approved the final version of the manuscript.
Availability of Data and Materials: The generated dataset is available upon request.
Ethics Approval: This study did not involve any human or animal subjects, and therefore, ethical approval was not required.
Conflicts of Interest: The authors declare no conflicts of interest to report regarding the present study.
References
1. Guo S, Dai Y, Xu S, Qiu X, Qi F. Trusted cloud-edge network resource management: DRL-driven service function chain orchestration for IoT. IEEE Internet Things J. 2019;7(7):6010–22. doi:10.1109/JIOT.2019.2951593.
2. Hantouti H, Benamar N, Taleb T. Service function chaining in 5G & beyond networks: challenges and open research issues. IEEE Netw. 2020;34(4):320–7. doi:10.1109/MNET.001.1900554.
3. Mao Y, Shang X, Yang Y. Joint resource management and flow scheduling for SFC deployment in hybrid edge-and-cloud network. In: IEEE INFOCOM 2022-IEEE Conference on Computer Communications; 2022; London, UK: IEEE. p. 170–9.
4. Qu H, Wang K, Zhao J. Reliable service function chain deployment method based on deep reinforcement learning. Sensors. 2021;21(8):2733. doi:10.3390/s21082733.
5. Torkzaban N, Baras JS. Trust-aware service function chain embedding: a path-based approach. In: 2020 IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN); 2020; Leganes, Spain: IEEE. p. 31–6.
6. Wu TY, Wu H, Kumari S, Chen CM. An enhanced three-factor based authentication and key agreement protocol using PUF in IoMT. Peer Peer Netw Appl. 2025;18(2):83. doi:10.1007/s12083-024-01839-z.
7. Tsai TT, Lin HY, Huang WN, Kumar S, Agarwal K, Chen CM. Anomaly detection through outsourced revocable identity-based signcryption with equality test for sensitive data in consumer IoT environments. IEEE Trans Consum Electron. 2024;1–1:300.
8. Guo S, Qi Y, Jin Y, Li W, Qiu X, Meng L. Endogenous trusted DRL-based service function chain orchestration for IoT. IEEE Trans Comput. 2021;71(2):397–406. doi:10.1109/TC.2021.3051681.
9. Liu W, Zhang X, Feng W, Huang M, Xu Y. Optimization of PBFT algorithm based on QoS-aware trust service evaluation. Sensors. 2022;22(12):4590. doi:10.3390/s22124590.
10. Zhao H, Deng S, Liu Z, Xiang Z, Yin J, Dustdar S, et al. DPoS: decentralized, privacy-preserving, and low-complexity online slicing for multi-tenant networks. IEEE Trans Mobile Comput. 2021;21(12):4296–309. doi:10.1109/TMC.2021.3074934.
11. Luo J, Li J, Jiao L, Cai J. On the effective parallelization and near-optimal deployment of service function chains. IEEE Trans Parallel Distrib Syst. 2020;32(5):1238–55. doi:10.1109/TPDS.2020.3043768.
12. Li W, Feng C, Zhang L, Xu H, Cao B, Imran MA. A scalable multi-layer PBFT consensus for blockchain. IEEE Trans Parallel Distrib Syst. 2020;32(5):1146–60. doi:10.1109/TPDS.2020.3042392.
13. Manias DM, Shaer I, Naoum-Sawaya J, Shami A. Robust and reliable SFC placement in resource-constrained multi-tenant MEC-enabled networks. IEEE Trans Netw Serv Manag. 2024;21(4):187–99. doi:10.1109/TNSM.2023.3293027.
14. Santos GL, Endo PT, Lynn T, Sadok D, Kelner J. A reinforcement learning-based approach for availability-aware service function chain placement in large-scale networks. Fut Gener Comput Syst. 2022;136(1):93–109. doi:10.1016/j.future.2022.05.021.
15. Cao Y, Jia Z, Dong C, Wang Y, You J, Wu Q. SFC deployment in space-air-ground integrated networks based on matching game. In: IEEE INFOCOM 2023-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS); 2023; New York, NY, USA: IEEE. p. 1–6.
16. Bhamare D, Jain R, Samaka M, Erbad A. A survey on service function chaining. J Netw Comput Appl. 2016;75(4):138–55. doi:10.1016/j.jnca.2016.09.001.
17. Zhu Y, Yao H, Mai T, He W, Zhang N, Guizani M. Multiagent reinforcement-learning-aided service function chain deployment for internet of things. IEEE Internet Things J. 2022;9(17):15674–84. doi:10.1109/JIOT.2022.3151134.
18. Chen J, Chen J, Zhang H. DRL-QOR: deep reinforcement learning-based QoS/QoE-aware adaptive online orchestration in NFV-enabled networks. IEEE Trans Netw Serv Manag. 2021;18(2):1758–74. doi:10.1109/TNSM.2021.3055494.
19. Sun G, Chen Z, Yu H, Du X, Guizani M. Online parallelized service function chain orchestration in data center networks. IEEE Access. 2019;7:100147–61. doi:10.1109/ACCESS.2019.2930295.
20. Khoshkholghi MA, Mahmoodi T. Edge intelligence for service function chain deployment in NFV-enabled networks. Comput Netw. 2022;219(2):109451. doi:10.1016/j.comnet.2022.109451.
21. Juan Y, Chaoqing X, Yize Z, Feifan S. Synchronous deep reinforcement learning (SDRL) algorithm for small batch image recognition. In: 2022 8th International Conference on Big Data and Information Analytics (BigDIA); 2022; Guiyang, China: IEEE. p. 317–23.
22. Meng W, Li W, Zhu L. Enhancing medical smartphone networks via blockchain-based trust management against insider attacks. IEEE Trans Eng Manag. 2019;67(4):1377–86. doi:10.1109/TEM.2019.2921736.
23. Liu Z, Xu X, Qiao P, Li D. Acceleration for deep reinforcement learning using parallel and distributed computing: a survey. ACM Comput Surv. 2024;57(4):1–35.
24. Hu Y, Li T. Enabling efficient network service function chain deployment on heterogeneous server platform. In: 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA); 2018; Vienna, Austria: IEEE. p. 27–39.
25. Sheng G, Min M, Xiao L, Liu S. Reinforcement learning-based control for unmanned aerial vehicles. J Commun Infor Netw. 2018;3(3):39–48.
26. Stooke A, Abbeel P. Accelerated methods for deep reinforcement learning. arXiv:180302811. 2018.
27. Yao H, Chen X, Li M, Zhang P, Wang L. A novel reinforcement learning algorithm for virtual network embedding. Neurocomputing. 2018;284(2):1–9. doi:10.1016/j.neucom.2018.01.025.
28. Zhang P, Wang C, Jiang C, Benslimane A. Security-aware virtual network embedding algorithm based on reinforcement learning. IEEE Trans Netw Sci Eng. 2020;8(2):1095–105. doi:10.1109/TNSE.2020.2995863.
This work is licensed under a Creative Commons Attribution 4.0 International License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.