Open Access
ARTICLE
GPU Usage Time-Based Ordering Management Technique for Tasks Execution to Prevent Running Failures of GPU Tasks in Container Environments
1 Department of Computer Engineering, Jeju National University, Jeju-do, 63243, Republic of Korea
2 Department of Computer Science, Korea National Open University, Seoul, 03087, Republic of Korea
* Corresponding Author: Jihun Kang. Email:
Computers, Materials & Continua 2025, 82(2), 2199-2213. https://doi.org/10.32604/cmc.2025.061182
Received 19 November 2024; Accepted 31 December 2024; Issue published 17 February 2025
Abstract
In a cloud environment, graphics processing units (GPUs) are the primary devices used for high-performance computation, exploiting the flexible resource utilization that is a key advantage of cloud environments. GPUs serve as coprocessors to central processing units (CPUs) and are activated only when a task demands GPU computation, and they are shared by multiple users. In a container environment, where resources can be shared among multiple users, GPU utilization can be increased by minimizing idle time, because the tasks of many users run on a single GPU. However, unlike CPUs and memory, GPUs cannot logically multiplex their resources. Additionally, GPU memory does not support over-utilization: when it runs out, tasks fail. It is therefore necessary to regulate the execution order of concurrently running GPU tasks, both to avoid such task failures and to ensure equitable GPU sharing among users. In this paper, we propose a GPU task execution order management technique that controls GPU usage based on the GPU time consumed by each container. The technique seeks to guarantee users equal GPU time in a container environment while preventing task failures. Specifically, we use a deferred processing method to prevent GPU memory shortages when GPU tasks are executed simultaneously, and we determine the execution order based on accumulated GPU usage time. Because the execution of a GPU task cannot be adjusted arbitrarily from the outside once the task commences, a GPU task is paused indirectly by pausing its container. In addition, because the container pause/unpause decision is based on the available GPU memory capacity, overuse of GPU memory is prevented at the source. Experimental results show that the proposed technique prevents task failures and processes GPU tasks in an appropriate order.
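To illustrate the mechanism summarized above, the following Python sketch shows one plausible realization of the idea: defer a task by pausing its container when free GPU memory is insufficient, and resume the deferred container with the least accumulated GPU usage time once memory becomes available. This is not the authors' implementation; it assumes the pynvml and Docker SDK packages, and names such as REQUIRED_MEM_BYTES and the per-container time accounting are illustrative assumptions.

# Illustrative sketch only: pause/unpause containers based on free GPU
# memory, resuming the deferred container with the least accumulated
# GPU usage time first. REQUIRED_MEM_BYTES is an assumed per-task
# GPU memory estimate, not a value from the paper.
import time
import pynvml
import docker

REQUIRED_MEM_BYTES = 2 * 1024**3   # assumed memory demand of one GPU task

pynvml.nvmlInit()
gpu = pynvml.nvmlDeviceGetHandleByIndex(0)
client = docker.from_env()

gpu_time = {}   # container name -> accumulated GPU usage time (seconds)
deferred = []   # containers paused because of GPU memory pressure

def free_gpu_memory():
    # Query currently available GPU memory via NVML.
    return pynvml.nvmlDeviceGetMemoryInfo(gpu).free

def defer(name):
    # A running GPU task cannot be paused directly from outside,
    # so pause its container to pause the task indirectly.
    client.containers.get(name).pause()
    deferred.append(name)

def resume_next():
    # Unpause the deferred container with the least GPU usage time,
    # but only when enough GPU memory is available.
    if deferred and free_gpu_memory() >= REQUIRED_MEM_BYTES:
        name = min(deferred, key=lambda n: gpu_time.get(n, 0.0))
        deferred.remove(name)
        client.containers.get(name).unpause()

def admit(name):
    # Admit the task if memory allows; otherwise defer its container.
    if free_gpu_memory() < REQUIRED_MEM_BYTES:
        defer(name)
    else:
        start = time.time()
        # ... the GPU task runs inside the container here ...
        gpu_time[name] = gpu_time.get(name, 0.0) + (time.time() - start)
        resume_next()

Basing the unpause decision on free GPU memory, as in resume_next(), is what prevents memory overuse at the source, while ordering resumption by accumulated GPU time approximates fair sharing among containers.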
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.