
Open Access



Millimeter-Wave Concurrent Beamforming: A Multi-Player Multi-Armed Bandit Approach

Ehab Mahmoud Mohamed1, 2, *, Sherief Hashima3, 4, Kohei Hatano3, 5, Hani Kasban4, Mohamed Rihan6

1 Electrical Engineering Department, College of Engineering, Prince Sattam Bin Abdulaziz University, Wadi Addwasir, 11991, Saudi Arabia.
2 Electrical Engineering Department, Faculty of Engineering, Aswan University, Aswan, 81542, Egypt.
3 Computational Learning Theory Team, RIKEN-Advanced Intelligent Project, Fukuoka, 819-0395, Japan.
4 Engineering Department, Nuclear Research Center, Egyptian Atomic Energy Authority, Cairo, 13759, Egypt.
5 Faculty of Arts and Science, Kyushu University, Fukuoka, 819-0395, Japan.
6 Electronics and Electrical Communication Engineering, Faculty of Electronic Engineering, Menoufia University, Menouf, 32952, Egypt.

* Corresponding Author: Ehab Mahmoud Mohamed. Email: email.

Computers, Materials & Continua 2020, 65(3), 1987-2007.


Communication in the millimeter-wave (mmWave) band, i.e., 30-300 GHz, is characterized by short-range transmissions and the use of antenna beamforming (BF). Thus, multiple mmWave access points (APs) should be installed to fully cover a target environment with gigabit-per-second (Gbps) connectivity. However, inter-beam interference prevents maximizing the sum rate of the established concurrent links. In this paper, a reinforcement learning (RL) approach is proposed for enabling mmWave concurrent transmissions by finding the beam directions that maximize the long-term average sum rate of the concurrent links. Specifically, the problem is formulated as a multi-player multi-armed bandit (MAB), where the mmWave APs act as players aiming to maximize their achievable rewards, i.e., data rates, and the arms to play are the available beam directions. In this setup, a selfish concurrent multi-player MAB strategy is advocated. Four MAB algorithms, namely ϵ-greedy, upper confidence bound (UCB), Thompson sampling (TS), and the exponential-weight algorithm for exploration and exploitation (EXP3), are examined by running them in each AP so that it selfishly improves its beam selection based only on its own previous observations. After a few rounds of interaction, the mmWave APs learn to select concurrent beams that enhance the overall system performance. The proposed MAB-based mmWave concurrent BF shows performance comparable to the optimal solution.
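
The selfish multi-player setup described above can be illustrated with a minimal sketch in which every AP runs its own UCB learner and observes only its own rate. This is not the authors' implementation: the reward model in `toy_rates`, the random `gain` and `leak` tables, and all function and parameter names are hypothetical choices made for illustration. Standard UCB1 also assumes rewards in [0, 1], whereas the toy rates here are left unnormalized for brevity.

```python
import math
import random


def ucb1_select(counts, means, t):
    """Pick the arm with the highest UCB1 index; untried arms are played first."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(counts)),
        key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
    )


def toy_rates(beams, gain, leak, noise=0.1):
    """Hypothetical reward model (an assumption, not from the paper).

    gain[i][b]    : signal power AP i delivers to its own user with beam b
    leak[j][b][i] : interference AP j's beam b causes at AP i's user
    Returns each AP's Shannon-style rate log2(1 + SINR).
    """
    rates = []
    for i, b in enumerate(beams):
        interference = sum(
            leak[j][beams[j]][i] for j in range(len(beams)) if j != i
        )
        rates.append(math.log2(1.0 + gain[i][b] / (noise + interference)))
    return rates


def run_selfish_ucb(num_aps=2, num_beams=4, rounds=2000, seed=0):
    """Each AP learns independently from its own observed rate only."""
    rng = random.Random(seed)
    gain = [[rng.uniform(0.5, 2.0) for _ in range(num_beams)]
            for _ in range(num_aps)]
    leak = [[[rng.uniform(0.0, 1.0) for _ in range(num_aps)]
             for _ in range(num_beams)]
            for _ in range(num_aps)]
    counts = [[0] * num_beams for _ in range(num_aps)]
    means = [[0.0] * num_beams for _ in range(num_aps)]
    total = 0.0
    for t in range(1, rounds + 1):
        # All APs choose their beams concurrently, with no coordination.
        beams = [ucb1_select(counts[i], means[i], t) for i in range(num_aps)]
        rates = toy_rates(beams, gain, leak)
        for i, (b, r) in enumerate(zip(beams, rates)):
            counts[i][b] += 1
            means[i][b] += (r - means[i][b]) / counts[i][b]  # running mean
        total += sum(rates)
    return counts, means, total / rounds


if __name__ == "__main__":
    counts, _, avg_sum_rate = run_selfish_ucb()
    print("beam play counts per AP:", counts)
    print("average sum rate:", round(avg_sum_rate, 3))
```

Because each AP's reward depends on the beams the other APs pick, the environment seen by any single learner is non-stationary. That is one reason an adversarial-bandit algorithm such as EXP3 is a natural candidate alongside the stochastic ones (ϵ-greedy, UCB, TS).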


Cite This Article

APA Style
Mohamed, E.M., Hashima, S., Hatano, K., Kasban, H., Rihan, M. (2020). Millimeter-wave concurrent beamforming: A multi-player multi-armed bandit approach. Computers, Materials & Continua, 65(3), 1987-2007.
Vancouver Style
Mohamed EM, Hashima S, Hatano K, Kasban H, Rihan M. Millimeter-wave concurrent beamforming: A multi-player multi-armed bandit approach. Comput Mater Contin. 2020;65(3):1987-2007.
IEEE Style
E.M. Mohamed, S. Hashima, K. Hatano, H. Kasban, and M. Rihan, "Millimeter-Wave Concurrent Beamforming: A Multi-Player Multi-Armed Bandit Approach," Comput. Mater. Contin., vol. 65, no. 3, pp. 1987-2007, 2020.


This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.