Open Access

ARTICLE

A Dynamic Memory Allocation Optimization Mechanism Based on Spark

Suzhen Wang1, Shanshan Geng1, Zhanfeng Zhang1, Anshan Ye2, Keming Chen2, Zhaosheng Xu2, Huimin Luo2, Gangshan Wu3,*, Lina Xu4, Ning Cao5

1 College of Information Technology, Hebei University of Economics and Business, Shijiazhuang, 050061, China.
2 College of Mathematics and Computer Science, Xinyu University, Xinyu, 338004, China.
3 School of Information Engineering, Jiangsu Polytechnic College of Agriculture and Forestry, Jurong, 212400, China.
4 School of Computer Science, University College Dublin, Dublin 4, Ireland.
5 College of Information Engineering, Sanming University, Sanming, 365004, China.
*Corresponding Author: Gangshan Wu. Email: .

Computers, Materials & Continua 2019, 61(2), 739-757. https://doi.org/10.32604/cmc.2019.06097

Abstract

Spark is an in-memory distributed data processing framework, and memory allocation is a central topic in Spark research. A good memory allocation scheme can effectively improve both task execution efficiency and memory utilization in Spark. To address the memory allocation problem in Spark 2.x, this paper analyzes the Spark memory model, the existing cache replacement algorithms, and the existing memory allocation methods, and then optimizes the memory allocation strategy on the basis of minimizing the storage area and allocating the execution area on demand. The optimization consists of two parts: cache replacement optimization and memory allocation optimization. First, in the storage area, the cache replacement algorithm is optimized according to the characteristics of RDD partitions, combined with PCA dimensionality reduction. Four features of an RDD partition are selected; each time the RDD cache is replaced, PCA dimensionality reduction keeps only the two most important features, which helps the cache replacement strategy generalize. Second, the memory allocation strategy of the execution area is optimized according to the memory requirements of tasks and the memory space of the storage area. A series of experiments in Spark on YARN mode verifies that the optimization algorithm is effective and improves cluster performance.
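As a rough illustration of the feature-reduction step described in the abstract, the following Python sketch ranks four per-partition features by the size of their loading on the first principal component and keeps the top two. It is only one possible reading of "selecting the two most important features by PCA", not the authors' implementation: the sample values, the function names, and the use of the first component's loadings as the importance measure are all assumptions made for this example.

```python
import math

def standardize(rows):
    """Return feature columns scaled to zero mean and unit variance.
    Constant columns carry no information and become all zeros."""
    n = len(rows)
    columns = []
    for col in zip(*rows):
        mean = sum(col) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
        columns.append([(x - mean) / std if std > 0 else 0.0 for x in col])
    return columns

def correlation_matrix(columns):
    """Correlation matrix of already standardized columns."""
    n = len(columns[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in columns]
            for ci in columns]

def first_component(matrix, iterations=200):
    """Approximate the dominant eigenvector by power iteration."""
    v = [1.0] * len(matrix)
    for _ in range(iterations):
        w = [sum(m * x for m, x in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v

def top_two_features(rows):
    """Indices of the two features with the largest absolute loading
    on the first principal component (an assumed importance measure)."""
    loadings = first_component(correlation_matrix(standardize(rows)))
    ranked = sorted(range(len(loadings)),
                    key=lambda i: abs(loadings[i]), reverse=True)
    return sorted(ranked[:2])

# Hypothetical data: one row per cached RDD partition, four made-up features.
partitions = [
    (1, 1, 2, 5),
    (2, -1, 4, 5),
    (3, 1, 6, 5),
    (4, -1, 8, 5),
    (5, 1, 10, 5),
    (6, -1, 12, 5),
]

print(top_two_features(partitions))
```

A cache replacement policy could then score each partition using only the two selected columns and evict the lowest-scoring one; how that score is defined is left open here because the abstract does not specify it.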

Keywords

Memory calculation, memory allocation optimization, cache replacement optimization

Cite This Article

S. Wang, S. Geng, Z. Zhang, A. Ye, K. Chen et al., "A dynamic memory allocation optimization mechanism based on spark," Computers, Materials & Continua, vol. 61, no. 2, pp. 739–757, 2019.



This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.