Open Access
ARTICLE
Optimized Metaheuristic Strategies for Addressing the Multi-Picker Robot Routing Problem in 3D Warehouse Operations
School of Information Communications and Technology, Hanoi University of Industry, Hanoi, 100000, Vietnam
* Corresponding Authors: Thi Ngoc Huyen Do. Email: or
or
# These authors contributed equally to this work as the first author
(This article belongs to the Special Issue: Particle Swarm Optimization: Advances and Applications)
Computers, Materials & Continua 2025, 84(3), 5063-5076. https://doi.org/10.32604/cmc.2025.064610
Received 19 February 2025; Accepted 18 June 2025; Issue published 30 July 2025
Abstract
Efficient warehouse management is critical for modern supply chain systems, particularly in the era of e-commerce and automation. The Multi-Picker Robot Routing Problem (MPRRP) presents a complex challenge involving the optimization of routes for multiple robots assigned to retrieve items from distinct locations within a warehouse. This study introduces optimized metaheuristic strategies to address MPRRP, with the aim of minimizing travel distances, energy consumption, and order fulfillment time while ensuring operational efficiency. Advanced algorithms, including an enhanced Particle Swarm Optimization (PSO-MPRRP) and a tailored Genetic Algorithm (GA-MPRRP), are specifically designed with customized evolutionary operators to effectively solve the MPRRP. Comparative experiments are conducted to evaluate the proposed strategies against benchmark approaches, demonstrating significant improvements in solution quality and computational efficiency. The findings contribute to the development of intelligent, scalable, and environmentally friendly warehouse systems, paving the way for future advances in robotics and automated logistics management.Keywords
In today’s rapidly evolving logistics landscape, efficient warehouse management is pivotal to maintaining a responsive and cost-effective supply chain. Warehouses are integral components of supply chains, performing critical tasks such as receiving, storage, picking, packing, and shipping of goods [1]. Among these, order picking is notably the most labor-intensive and costly process, contributing to approximately 60% of total operational costs [2]. As such, optimizing the order picking process, particularly in terms of minimizing picking time and travel distance, is essential for enhancing warehouse performance and customer satisfaction.
To meet the increasing demands of e-commerce and same-day delivery services, the integration of autonomous mobile robots has become a transformative trend in warehouse operations [3,4]. These robots enable faster, more accurate, and scalable order fulfillment. However, coordinating multiple robots in a shared environment, while ensuring minimal travel distance and balanced workloads, poses a complex optimization challenge-formally known as the Multi-Picker Robot Routing Problem. The MPRRP, which requires determining optimal routes for multiple robots under constraints such as energy, task allocation, and time, is an nondeterministic polynomial time-hard (NP-Hard) problem [5].
Despite extensive research into picker routing, most existing studies focus on 2D warehouse layouts and either single-picker scenarios or basic multi-robot coordination [6–8]. These models often oversimplify real warehouse environments, where racks, shelves, and pathways span multiple levels. In practice, many modern warehouses adopt vertical storage systems, effectively forming a three-dimensional (3D) layout to maximize space utilization. This introduces new challenges in robot coordination, vertical movement (e.g., elevators, lifts), and dynamic routing planning.
Moreover, while several works have applied metaheuristic algorithms such as Genetic Algorithms (GA) and Particle Swarm Optimization (PSO) to warehouse routing, these approaches often lack tailored strategies for 3D navigation or realistic energy modeling. The coupling of routing scheduling with energy-aware decision-making in multi-robot systems within a 3D warehouse environment remains an underexplored research direction.
This paper aims to bridge these gaps by making the following key contributions:
• We present a novel 3D mathematical model for the MPRRP that captures the spatial complexity of multi-level warehouse environments, including aisle structure, shelf height, and robot movement across vertical and horizontal planes.
• We propose two optimized metaheuristic algorithms, GA-MPRRP and PSO-MPRRP, which integrate problem-specific evolution operators and encoding schemes to effectively solve the MPRRP in 3D.
• We demonstrate, through extensive simulation, that our methods outperform conventional techniques in terms of travel distance, energy efficiency, and computational scalability, especially under large-scale warehouse settings.
The remainder of this paper is organized as follows: Section 2 reviews related work in robotic routing and metaheuristic optimization. Section 3 details the problem formulation and constraints. Section 4 describes the proposed metaheuristic strategies, while Section 5 presents experimental results and analysis. Finally, Section 6 concludes the paper and highlights future research directions.
In this section, we present the literature on warehousing problems related to our work as multi-picker robot routing problem (MPRRP).
The picker-routing problem in warehouses with 2D layouts is a well-known NP-Hard problem [1], making it particularly challenging to solve. Various methods have been proposed to tackle this problem, including exact algorithms [9], heuristics and metaheuristics [7,8,10], genetic algorithms, and stochastic models. These approaches are typically applied individually or in combination to improve computational efficiency. A comprehensive review by [6] highlights that most existing studies focus on conventional 2D warehouse layouts and often consider simplified settings, such as single-picker routing. Notably, exact methods remain rare due to their computational cost, especially for large-scale problems. For instance, study [9] introduces two exact algorithms that significantly improve a mixed-integer programming formulation and generalize existing dynamic programming approaches; however, these methods suffer from scalability issues. Other studies such as [7] and [8] emphasize heuristic-based strategies to address order picking with predefined routing policies or height-level constraints in high-bay systems. While these works demonstrate effective results in specific 2D settings, they are generally limited to single-picker scenarios and do not extend to full three-dimensional (3D) warehouse modeling.
In contrast, research on picker-routing in 3D environments remains limited. One relevant study [11] investigates a three-dimensional vehicle routing problem for fuel delivery and proposes a 3D Ant Colony Optimization algorithm. However, this work is outside the warehouse domain and focuses on vehicle routing rather than intra-warehouse picking by robots.
Recent developments in robot routing planning have addressed 2D multi-objective routing problems using evolutionary algorithms. For example, study [12] employs an improved NSGA-II algorithm, while study [13] introduces a self-learning mechanism based on an enhanced artificial bee colony algorithm. These studies contribute valuable algorithmic insights but do not consider warehouse-specific constraints or 3D environments with multiple pickers.
Despite the growing importance of automation in warehouse operations, research explicitly targeting the multi-picker robot routing problem in three-dimensional warehouse layouts remains scarce. To bridge this gap, our study proposes a novel problem formulation that jointly considers multi-robot coordination and spatial complexity in 3D warehouse environments. Furthermore, we introduce optimized metaheuristic strategies to enhance solution quality and scalability. This contribution is twofold: (i) a new mathematical model for multi-robot routing in 3D warehouses, and (ii) an effective algorithmic framework tailored to this complex logistics problem.
3 Preliminaries and Problem Formulation
The MPRRP addresses the challenge of minimizing the total energy consumption of multiple picker robots tasked with retrieving required products from a structured parallel 3D warehouse layout, as shown in Fig. 1a. The warehouse consists of parallel shelves arranged systematically, with each shelf divided into multiple vertical layers. Each layer is further segmented into smaller storage cells, each containing a different quantity of a single product. A product type can be stored in multiple cells. A counter is set up in the warehouse to collect all picked products. Picker robots are tasked with retrieving items from specific cells based on predefined order lists, navigating in 3D space along the

Figure 1: Warehouse layout with 4 shelves, 3 layers in a shelf and 5 cells in a layer
Given S parallel shelves, with
Furthermore,
Let the counter be an additional cell,
where:
Consider R picker robots, each beginning at the counter to collect the required products. Every robot has a maximum capacity
The energy consumption of each robot includes the energy used for moving on the floor and the energy for going up and down. The moving energy consumption
where
where
The energy consumption
where
The total energy E consumed by all picker robots in the routing, while ensuring that all required products are picked up and brought to the counter, with the weight of all products in picker robot
where
The problem can be stated as follows: A 3D warehouse layout consists of parallel shelves with multiple layers and cells, where each cell contains a specific type of product with varying quantities and weights, as illustrated in Fig. 1a. Products may be distributed across multiple cells. A list of required products must be collected by a team of picker robots. Each robot starts at the counter, retrieves the assigned products, and returns either after collecting all required items or when its maximum capacity is reached. The robot’s route is determined by the sequence of cells it visits. During operation, robots move along the floor, lift products between layers, and wait at the counter–each action consuming different amounts of energy. The goal is to determine the optimal routing for all robots that ensures the collection of all required products, satisfies the product demand constraints in Eq. (14), the robot operation conditions in Eqs. (15) and (18), and the capacity constraints in Eqs. (16), (17), and (19), while minimizing the total energy consumption as defined in Equation (specify equation number here if applicable).
The mathematical model of the problem is as follows.
Input:
•
• P: Number of product types in warehouse.
•
•
•
•
•
Output:
Objective function:
Constraints:
where
where
Metaheuristic algorithms are powerful and flexible tools for solving diverse optimization problems [14–17]. They iteratively guide the search process by combining exploration and exploitation strategies [18]. A key challenge lies in designing problem-specific operators. In this section, we detail two tailored algorithms developed to effectively solve the MPRRP.
An individual (particle) is represented by a one-dimensional array, called a cell array, of length A, which corresponds to the number of cells storing the required product type. The array is expressed as

Figure 2: Demonstration of solution encoding and decoding
The greedy algorithm is used to decode an individual into multiple routes for each robot, considering the robot’s maximum capacity and the product weights in the tour. After completing a tour, the robot returns to the counter and begins a new tour. The sequence of elements in the cell array determines the order of cell visits in the robot’s tour. As shown in Fig. 2b, the individual is decoded into three tours:
PSO-MPRRP, described in Algorithm 1, utilizes the XOR method to enhance exploration and exploitation. First,

Figure 3: Demonstration of PSO-MPRRP
The XOR method
The XOR method between two permutation positions applies an element-wise comparison operation. For each corresponding pair of elements at the same position index in two permutations, the operation sets the resulting value to 1 if the elements differ and to 0 if they are identical. This creates a binary mask indicating positions where the two permutations differ, as illustrated in Fig. 3a,b. For example in Fig. 3a, comparing positions [3, 4, 1, 2, 5, 6] and [2, 1, 4, 5, 3, 6] would yield [1, 1, 1, 1, 1, 0], indicating that elements at positions 1 to 5 are different. This binary representation enables the algorithm to identify which positions should be influenced by personal best (
The update of velocity
To update
The update of new position
The order crossover operator is applied to three parents: the current position,

The GA-MPRRP employs the crossover operator, mutation operator, and selection policy on the population in each generation, as described in Algorithm 2. The Multi-Point Order Crossover (MOX) operator ensures order preservation by selecting multiple crossover points and maintaining unique elements in the offspring. Additionally, swap mutation enhances exploration and prevents premature convergence. Selection is performed using the roulette wheel method, balancing the exploitation of high-quality solutions with the preservation of genetic diversity. This combination of MOX crossover, swap mutation, and roulette wheel selection enables an efficient search within complex solution spaces. These techniques are particularly effective for solving MPRRP problems, where maintaining solution integrity is crucial for convergence to optimal solutions.

The crossover operator
The MOX is used in GA-MPRRP to generate two offspring from two parents. After selecting two random points,

Figure 4: Illustration of the evolution operators GA-MPRRP
The mutation operator
The mutation operator swaps two random cells in the individual, as shown in Fig. 4b, where the second and fifth elements are swapped. This preserves the permutation structure, ensures a valid individual, and increases population diversity. While excessive use can disrupt high-quality solutions, when balanced, it helps prevent premature convergence and improves the algorithm’s ability to find optimal or near-optimal solutions.
The selection operator
In GA-MPRRP, roulette selection updates the population by selecting individuals based on fitness-proportional probabilities. A cumulative probability distribution is used, mapping random values to select individuals. This method favors fitter solutions while maintaining population diversity for effective optimization.
The complexity of the proposed PSO-MPRRP and GA-MPRRP algorithms is analyzed with
• Initialization: Evaluating
• Iteration: Each of
Total fitness evaluations:
The experimental data, presented in Table 1, includes parameters representing various warehouse and robot operation factors. These factors are essential for evaluating and optimizing the performance of metaheuristic algorithms in solving the multi-picker robot routing challenge. By varying these parameters, the paper examines the efficiency and effectiveness of the proposed strategies under different operational conditions.

The experiments were performed on a computer featuring an Intel processor with a 1 GHz clock speed and 16 GB of RAM, providing adequate computational resources for testing. The implementation was done in Python, with the environment set up to efficiently support multiple runs, ensuring consistent and reliable results. All parameters of GA-MPRRP and PSO-MPRRP are set in detail in Tables 2 and 3, respectively.


Visualization of the tours is presented in Fig. 5. The fitness improved 29% from generation 0 to 54 (23,344 to 16,608) in the data file

Figure 5: Visualization tours of 2 robots in the data file
Test the optimal parameters for the proposed algorithms: GA-MPRRP and PSO-MPRRP
Fig. 6a shows that the 0.6–0.4 combination (yellow line) achieves the lowest energy consumption, particularly after 20 generations, consistently outperforming other combinations. This indicates that prioritizing the particle’s own experience (c1 = 0.7) over the swarm’s experience (c2 = 0.3) leads to more effective optimization. Thus, c1 = 0.7 and c2 = 0.3 is the optimal setting for minimizing energy consumption. Fig. 6b demonstrates that a mutation rate of 0.6 with a crossover rate of 1 yields optimal performance in the GA-MPRRP. Fig. 6c shows that a swarm size of 20 (blue dotted line) results in the lowest energy consumption after about 30 s and maintains it throughout the simulation. Swarm sizes of 30, 40, and 50 improve energy consumption but do not match the efficiency of size 20. A swarm size of 10 performs well initially but plateaus at a higher energy level. Therefore, a swarm size of 20 is the most effective for minimizing energy consumption in this PSO implementation. Fig. 6d shows that a population size of 20 (blue dotted line) achieves the lowest energy consumption after 20 s and maintains it throughout the simulation. A population size of 10 improves quickly but plateaus at a higher energy level, while larger sizes (30, 40, and 50) lead to higher energy consumption and slower convergence. Thus, a population size of 20 is the most effective for minimizing energy consumption in this GA implementation.

Figure 6: Energy consumption over generations and time for different parameters in the data file
Evaluate the performance of the two proposed algorithms under varying parameter settings
Fig. 7a compares the convergence of GA-MPRRP and PSO-MPRRP in terms of energy consumption over time. While PSO-MPRRP initially converges faster and achieves lower energy consumption, GA-MPRRP surpasses it after about 30 s, ultimately reaching a lower final energy consumption. This indicates that GA-MPRRP offers better long-term optimization, despite PSO-MPRRP’s quicker early convergence. Fig. 7b compares GA-MPRRP and PSO-MPRRP in terms of energy consumption and result consistency across varying robot counts. GA-MPRRP consistently achieves lower energy consumption and exhibits smaller error bars, indicating more stable performance. While PSO-MPRRP shows a trend of reduced energy consumption with more robots, its higher standard deviation indicates greater variability. Thus, GA-MPRRP outperforms PSO-MPRRP in both energy efficiency and robustness.

Figure 7: Energy consumption calculated by PSO-MPRRP and GA-MPRRP algorithm
This paper addresses MPRRP in warehouse operations, a crucial challenge for optimizing modern logistics and supply chains. We proposed a mathematical model and developed two metaheuristic algorithms, GA-MPRRP and PSO-MPRRP, to solve the problem efficiently. These approaches optimize robot routes, reduce operational costs, and minimize execution time. Experimental results validate the effectiveness of the proposed methods, with GA-MPRRP demonstrating superior solution quality through exploration, and PSO-MPRRP achieving faster convergence and computational efficiency. Despite these promising results, several limitations remain. The computational scalability of the algorithms in very large or real-time warehouse environments poses challenges. Moreover, the current evaluation relies on synthetic data, and the energy model used is simplified compared to real-world robotic dynamics. These factors may impact the direct applicability of the approach in highly dynamic or complex warehouse systems. Future work will explore hybridization with other metaheuristics, integration with learning-based methods for adaptive decision-making, and validation using real-world warehouse datasets. These directions aim to enhance the adaptability, robustness, and practical deployment of the proposed algorithms. In summary, GA-MPRRP and PSO-MPRRP represent a significant step forward in solving the MPRRP, offering scalable and intelligent solutions for enhancing the efficiency of automated warehouse operations.
Acknowledgement: This research was sponsored by Hanoi University of Industry, Hanoi, Vietnam.
Funding Statement: This research is funded by Hanoi University of Industry, Hanoi, Vietnam, under contract number 25−2024−RD/HD−DHCN.
Author Contributions: Thi My Binh Nguyen: methodology, supervision, software, writing—original draft, editing, and funding acquisition. Thi Ngoc Huyen Do: methodology, resource, writing—original draft, and editing. Thi Hoa Hue Nguyen: methodology, software, writing—original draft, funding acquisition. All authors reviewed the results and approved the final version of the manuscript.
Availability of Data and Materials: Data will be made available on request.
Ethics Approval: The study does not include human or animal subjects and is approved by the committee.
Conflicts of Interest: The authors declare no conflicts of interest to report regarding the present study.
References
1. Boysen N, De Koster R, Weidinger F. Warehousing in the e-commerce era: a survey. Eur J Oper Res. 2019;277(2):396–411. doi:10.1016/j.ejor.2018.08.023. [Google Scholar] [CrossRef]
2. Casella G, Volpi A, Montanari R, Tebaldi L, Bottani E. Trends in order picking: a 2007–2022 review of the literature. Prod Manuf Res. 2023;11(1):2191115. [Google Scholar]
3. Yu S, Puchinger J, Sun S. Van-based robot hybrid pickup and delivery routing problem. Eur J Oper Res. 2022;298(3):894–914. doi:10.1016/j.ejor.2021.06.009. [Google Scholar] [CrossRef]
4. Zhang S, Zhuge D, Tan Z, Zhen L. Order picking optimization in a robotic mobile fulfillment system. Expert Syst Appl. 2022;209:118338. doi:10.1016/j.eswa.2022.118338. [Google Scholar] [CrossRef]
5. Shah B, Khanzode V. A comprehensive review of warehouse operational issues. Int J Logist Syst Manage. 2017;26(3):346–78. doi:10.1504/ijlsm.2017.081962. [Google Scholar] [CrossRef]
6. Masae M, Glock CH, Grosse EH. Order picker routing in warehouses: a systematic literature review. Int J Prod Econ. 2020;224:107564. doi:10.1016/j.ijpe.2019.107564. [Google Scholar] [CrossRef]
7. Silva A, Coelho LC, Darvish M, Renaud J. Integrating storage location and order picking problems in warehouse planning. Transp Res Part E Logist Transp Rev. 2020;140:102003. doi:10.1016/j.tre.2020.102003. [Google Scholar] [CrossRef]
8. Cano JA, Cortés P, Muñuzuri J, Correa-Espinal A. Solving the picker routing problem in multi-block high-level storage systems using metaheuristics. Flex Serv Manuf J. 2023;35(2):376–415. doi:10.1007/s10696-022-09445-y. [Google Scholar] [CrossRef]
9. Pansart L, Catusse N, Cambazard H. Exact algorithms for the order picking problem. Comput Oper Res. 2018;100:117–27. [Google Scholar]
10. Haouassi M, Kergosien Y, Mendoza JE, Rousseau LM. The picker routing problem in mixed-shelves, multi-block warehouses. Int J Product Res. 2025;63(4):1304–25. doi:10.1080/00207543.2024.2374845. [Google Scholar] [CrossRef]
11. Guo N, Qian B, Na J, Hu R, Mao JL. A three-dimensional ant colony optimization algorithm for multi-compartment vehicle routing problem considering carbon emissions. Appl Soft Comput. 2022;127:109326. doi:10.1016/j.asoc.2022.109326. [Google Scholar] [CrossRef]
12. Duan P, Yu Z, Gao K, Meng L, Han Y, Ye F. Solving the multi-objective path planning problem for mobile robot using an improved NSGA-II algorithm. Swarm Evol Comput. 2024;87:101576. doi:10.1016/j.swevo.2024.101576. [Google Scholar] [CrossRef]
13. Ye F, Duan P, Meng L, Sang H, Gao K. An enhanced artificial bee colony algorithm with self-learning optimization mechanism for multi-objective path planning problem. Eng Appl Artif Intell. 2025;149:110444. doi:10.1016/j.engappai.2025.110444. [Google Scholar] [CrossRef]
14. Binh NTM, Ngoc NH, Binh HTT, Van NK, Yu S. A family system based evolutionary algorithm for obstacle-evasion minimal exposure path problem in Internet of Things. Expert Syst Appl. 2022;200:116943. doi:10.1016/j.eswa.2022.116943. [Google Scholar] [CrossRef]
15. Binh NTM, Hoang Long D, Ngoc N, Thanh Binh HT, Phuong NK. Investigate evolutionary strategies for black-box attacks to deepfake forensic systems. In: Proceedings of the 11th International Symposium on Information and Communication Technology. Hanoi, Vietnam; 2022. p. 126–33. [Google Scholar]
16. Binh NTM, Thien NV, Luong HVD, Ngoc DT. An efficient approach to the k-strong barrier coverage problem under the probabilistic sensing model in wireless multimedia sensor networks. In: Ad hoc networks. Cham: Springer Nature Switzerland; 2024. p. 167–80. [Google Scholar]
17. Huong TT, Van Cuong L, Hai NM, Le NP, Vinh LT, BinhHTT. A bi-level optimized charging algorithm for energy depletion avoidance in wireless rechargeable sensor networks. Appl Intell. 2022;52(6):6812–34. doi:10.1007/s10489-021-02775-8. [Google Scholar] [CrossRef]
18. Hussain K, Mohd Salleh MN, Cheng S, Shi Y. Metaheuristic research: a comprehensive survey. Artif Intell Rev. 2019;52:2191–233. doi:10.1007/s10462-017-9605-z. [Google Scholar] [CrossRef]
Cite This Article
Copyright © 2025 The Author(s). Published by Tech Science Press.This work is licensed under a Creative Commons Attribution 4.0 International License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Submit a Paper
Propose a Special lssue
View Full Text
Download PDF
Downloads
Citation Tools