Open Access ARTICLE
Effective Controller Placement in Software-Defined Internet-of-Things Leveraging Deep Q-Learning (DQL)
1 Department of AI Convergence Network, Ajou University, Suwon, 16499, Republic of Korea
2 Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, Riyadh, 11362, Saudi Arabia
* Corresponding Author: Jehad Ali. Email:
Computers, Materials & Continua 2024, 81(3), 4015-4032. https://doi.org/10.32604/cmc.2024.058480
Received 13 September 2024; Accepted 04 November 2024; Issue published 19 December 2024
Abstract
The controller is the central component of the Software-Defined Networking (SDN) framework and plays a significant role in enabling programmability and orchestration for 5G and next-generation networks. In SDN, frequent communication occurs between network switches and the controller, which manages and directs traffic flows. If the controller is not strategically placed within the network, this communication can experience increased delays, negatively affecting network performance. Specifically, an improperly placed controller can lead to higher end-to-end (E2E) delay, as switches must traverse more hops or encounter greater propagation delays when communicating with the controller. This paper introduces a novel approach using Deep Q-Learning (DQL) to dynamically place controllers in Software-Defined Internet of Things (SD-IoT) environments, with the goal of minimizing E2E delay between switches and controllers. E2E delay, a crucial metric for network performance, is influenced by two key factors: hop count, which measures the number of network nodes data must traverse, and propagation delay, which accounts for the physical distance between nodes. Our approach models the controller placement problem as a Markov Decision Process (MDP). In this model, the network configuration at any given time is represented as a “state,” while “actions” correspond to potential decisions regarding the placement of controllers or the reassignment of switches to controllers. Using a Deep Q-Network (DQN) to approximate the Q-function, the system learns the optimal controller placement by maximizing the cumulative reward, which is defined as the negative of the E2E delay. Essentially, the lower the delay, the higher the reward the system receives, enabling it to continuously improve its controller placement strategy. The experimental results show that our DQL-based method significantly reduces E2E delay when compared to traditional benchmark placement strategies. By dynamically learning from the network’s real-time conditions, the proposed method ensures that controller placement remains efficient and responsive, reducing communication delays and enhancing overall network performance.
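To illustrate the MDP formulation summarized above, the following minimal Python sketch runs a DQN-style placement loop on a small synthetic topology: the state encodes the current controller location together with its delay profile, each action places (or moves) the controller at a candidate node, and the reward is the negative mean switch-to-controller E2E delay. The toy topology, the per-hop cost, the network architecture, and all hyperparameters are illustrative assumptions and do not reflect the paper's experimental setup.

import random
from collections import deque

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim

# Toy topology (assumed for illustration): symmetric propagation delays (ms)
# and hop counts between N_NODES candidate controller locations.
N_NODES = 8
N_ACTIONS = N_NODES          # action i = place the controller at node i
GAMMA = 0.9                  # discount factor for the cumulative reward

rng = np.random.default_rng(0)
prop_delay = rng.uniform(1.0, 10.0, size=(N_NODES, N_NODES))
prop_delay = (prop_delay + prop_delay.T) / 2.0
np.fill_diagonal(prop_delay, 0.0)
hop_count = rng.integers(1, 5, size=(N_NODES, N_NODES)).astype(float)
np.fill_diagonal(hop_count, 0.0)

def e2e_delay(controller: int) -> float:
    """Mean switch-to-controller delay: propagation delay plus an assumed per-hop cost."""
    per_hop_cost = 0.5  # illustrative processing/queuing cost per hop (ms)
    return float(np.mean(prop_delay[:, controller] + per_hop_cost * hop_count[:, controller]))

def state_vector(controller: int) -> torch.Tensor:
    """State = one-hot of the current controller node plus its delay profile."""
    one_hot = np.zeros(N_NODES)
    one_hot[controller] = 1.0
    return torch.tensor(np.concatenate([one_hot, prop_delay[:, controller]]), dtype=torch.float32)

# Small MLP approximating Q(s, a); no target network, for brevity.
q_net = nn.Sequential(nn.Linear(2 * N_NODES, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
optimizer = optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=1000)

epsilon, controller = 1.0, 0
for step in range(2000):
    s = state_vector(controller)
    if random.random() < epsilon:                 # epsilon-greedy exploration
        action = random.randrange(N_ACTIONS)
    else:
        with torch.no_grad():
            action = int(q_net(s).argmax())
    reward = -e2e_delay(action)                   # reward = negative E2E delay
    replay.append((s, action, reward, state_vector(action)))
    controller = action
    epsilon = max(0.05, epsilon * 0.995)

    if len(replay) >= 64:                         # one DQN update per step
        batch = random.sample(replay, 64)
        sb = torch.stack([b[0] for b in batch])
        ab = torch.tensor([b[1] for b in batch])
        rb = torch.tensor([b[2] for b in batch], dtype=torch.float32)
        nb = torch.stack([b[3] for b in batch])
        q_sa = q_net(sb).gather(1, ab.unsqueeze(1)).squeeze(1)
        with torch.no_grad():
            target = rb + GAMMA * q_net(nb).max(dim=1).values
        loss = nn.functional.mse_loss(q_sa, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

with torch.no_grad():
    best = int(q_net(state_vector(controller)).argmax())
print(f"learned placement: node {best}, mean E2E delay {e2e_delay(best):.2f} ms")

A production controller-placement agent would additionally use a target network, richer state features (e.g., traffic load and link utilization), and multiple controllers, but the reward structure (the negative of the E2E delay) is the same as described in the abstract.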

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.