Open Access
ARTICLE
Efficient Concurrent L1-Minimization Solvers on GPUs
Xinyue Chu1, Jiaquan Gao1,*, Bo Sheng2
1 Jiangsu Key Laboratory for NSLSCS, School of Computer and Electronic Information, Nanjing Normal University, Nanjing 210023, China
2 Department of Computer Science, University of Massachusetts Boston, MA 02125, USA
* Corresponding Author: Jiaquan Gao. Email:
Computer Systems Science and Engineering 2021, 38(3), 305-320. https://doi.org/10.32604/csse.2021.017144
Received 20 January 2021; Accepted 22 February 2021; Issue published 19 May 2021
Abstract
Because the concurrent L1-minimization (L1-min) problem arises in many real applications, we investigate how to solve it in parallel on GPUs. First, we propose a novel self-adaptive warp implementation of the matrix-vector multiplication Ax and a novel self-adaptive thread implementation of the matrix-vector multiplication A^T x on the GPU. Vector-operation and inner-product decision trees are adopted to choose the optimal vector-operation and inner-product kernels for vectors of any size. Second, building on these kernels, we use the iterative shrinkage-thresholding algorithm to construct two concurrent L1-min solvers, one organized around streams and the other around thread blocks on a GPU, and we optimize their performance with newer GPU features such as the shuffle instruction and the read-only data cache. Finally, we design a concurrent L1-min solver for multiple GPUs. Experimental results validate the effectiveness and performance of the proposed methods.
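As a rough illustration of the iterative shrinkage-thresholding algorithm named in the abstract, the following is a minimal sequential sketch in plain Python, not the GPU solvers the paper proposes. It assumes the common unconstrained formulation, minimizing 0.5*||Ax - b||^2 + lam*||x||_1, which the abstract itself does not state. The function names (`soft_threshold`, `matvec`, `matvec_t`, `ista`) and the parameters `lam`, `step`, and `iters` are hypothetical. Each iteration uses one product Ax and one product A^T x, the two operations for which the paper designs dedicated kernels.

```python
def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t (proximal map of t * ||.||_1)."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]


def matvec(A, x):
    """Compute A x for a dense matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def matvec_t(A, y):
    """Compute A^T y without forming the transpose explicitly."""
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]


def ista(A, b, lam, step, iters=500):
    """Assumed objective: 0.5 * ||A x - b||^2 + lam * ||x||_1.

    step should not exceed 1 / L, where L is the largest eigenvalue of A^T A.
    """
    x = [0.0] * len(A[0])
    for _ in range(iters):
        residual = [ri - bi for ri, bi in zip(matvec(A, x), b)]  # A x - b
        grad = matvec_t(A, residual)                             # A^T (A x - b)
        x = soft_threshold([xi - step * gi for xi, gi in zip(x, grad)], step * lam)
    return x
```

In a concurrent setting, many independent instances of `ista` would run at once; the paper maps them to streams or to thread blocks, which this sequential sketch does not attempt to reproduce.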
Keywords
Cite This Article
X. Chu, J. Gao and B. Sheng, "Efficient concurrent L1-minimization solvers on GPUs," Computer Systems Science and Engineering, vol. 38, no. 3, pp. 305–320, 2021. https://doi.org/10.32604/csse.2021.017144