
Open Access

ARTICLE

ARQ–UCB: A Reinforcement-Learning Framework for Reliability-Aware and Efficient Spectrum Access in Vehicular IoT

Adeel Iqbal1,#, Tahir Khurshaid2,#, Syed Abdul Mannan Kirmani3, Mohammad Arif4,*, Muhammad Faisal Siddiqui5,*
1 School of Computer Science and Engineering, Yeungnam University, Gyeongsan-si, Republic of Korea
2 Department of Electrical Engineering, Yeungnam University, Gyeongsan-si, Republic of Korea
3 Department of Computer Engineering, COMSATS University Islamabad, Islamabad, Pakistan
4 Department of Computer Engineering, Gachon University, Seongnam-si, Republic of Korea
5 Department of Computer Engineering, College of Computer Sciences and Information Technology, King Faisal University, Al Ahsa, Saudi Arabia
* Corresponding Authors: Mohammad Arif; Muhammad Faisal Siddiqui
# These authors contributed equally to this work
(This article belongs to the Special Issue: Advances in Vehicular Ad-Hoc Networks (VANETs) for Intelligent Transportation Systems)

Computers, Materials & Continua https://doi.org/10.32604/cmc.2026.075819

Received 09 November 2025; Accepted 05 January 2026; Published online 29 January 2026

Abstract

Vehicular Internet of Things (V-IoT) networks need intelligent, adaptive spectrum access methods to ensure ultra-reliable and low-latency communication (URLLC) in highly dynamic environments. Traditional reinforcement learning (RL) algorithms, such as Q-Learning and Double Q-Learning, often suffer from unstable convergence and inefficient exploration under stochastic vehicular traffic and interference. This paper proposes Adaptive Reinforcement Q-learning with Upper Confidence Bound (ARQ–UCB), a lightweight, reliability-aware RL framework that explicitly reduces interruption and blocking probabilities while improving throughput and delay across diverse vehicular traffic conditions. ARQ–UCB extends the basic Q-update with an exploration confidence term that dynamically balances exploration and exploitation based on uncertainty estimates, allowing faster convergence under bursty vehicular traffic. A comprehensive simulation framework evaluates throughput, delay, fairness, energy efficiency, and computational complexity in several V-IoT scenarios. The results indicate that ARQ–UCB attains substantial gains in throughput, fairness, and blocking/delay probabilities while retaining sub-20 μs decision latency and

Keywords

V-IoT; RL; Q-Learning; upper confidence bound; spectrum access; URLLC; 5G/6G
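
The abstract describes a Q-update extended with an exploration confidence term. As a rough illustration of that general idea, the sketch below combines tabular Q-learning with a textbook UCB1-style bonus, c * sqrt(ln N(s) / N(s, a)). This is an assumption, not the paper's ARQ–UCB formula, which the abstract does not give. The class name `UCBQAgent`, the helper `run_toy_channels`, and all parameter values are hypothetical.

```python
import math
from collections import defaultdict


class UCBQAgent:
    """Tabular Q-learning with a UCB1-style exploration bonus.

    Illustrative only: the bonus form and all defaults are assumptions,
    not the ARQ-UCB rule from the paper.
    """

    def __init__(self, n_actions, alpha=0.1, gamma=0.9, c=1.0):
        self.n_actions = n_actions
        self.alpha = alpha  # learning rate
        self.gamma = gamma  # discount factor
        self.c = c          # weight of the confidence bonus
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.counts = defaultdict(lambda: [0] * n_actions)

    def select_action(self, state):
        counts = self.counts[state]
        # Try every action once before trusting the confidence bound.
        for action, n in enumerate(counts):
            if n == 0:
                return action
        total = sum(counts)
        q_row = self.q[state]
        # Rarely tried actions get a larger bonus, so they are revisited.
        scores = [
            q_row[a] + self.c * math.sqrt(math.log(total) / counts[a])
            for a in range(self.n_actions)
        ]
        return max(range(self.n_actions), key=scores.__getitem__)

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update, plus a visit count for the bonus.
        self.counts[state][action] += 1
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


def run_toy_channels(agent, rewards, steps):
    """Single-state toy loop; rewards[a] is the fixed payoff of channel a."""
    state = 0
    for _ in range(steps):
        action = agent.select_action(state)
        agent.update(state, action, rewards[action], state)
    return agent.q[state], agent.counts[state]
```

In a toy run such as `run_toy_channels(UCBQAgent(3, gamma=0.0), [0.2, 0.5, 0.9], 500)`, the agent samples every channel at first and then concentrates on the channel with the highest payoff, because the bonus shrinks as a channel's visit count grows. A real V-IoT setting would need a state definition, a reward tied to interruption and blocking events, and stochastic channel availability, none of which this sketch models.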