Open Access

ARTICLE

SMConf: One-Size-Fit-Bunch, Automated Memory Capacity Configuration for In-Memory Data Analytic Platform

Yi Liang1,*, Shaokang Zeng1, Xiaoxian Xu2, Shilu Chang1, Xing Su1

1 Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
2 Department of Informatics, University of Zurich, Zurich, CH-8050, Switzerland

* Corresponding Author: Yi Liang.

Computers, Materials & Continua 2021, 66(2), 1697-1717. https://doi.org/10.32604/cmc.2020.012513

Abstract

Spark is the most popular in-memory processing framework for big data analytics, and memory is a crucial resource for accelerating workloads on Spark. Currently, Spark configures the memory capacity of a workload statically, based on the user's specification. However, without deep knowledge of a workload's system-level characteristics, users in practice often conservatively overestimate the memory utilization of their workloads and ask the resource manager to grant a larger memory share than they actually need, which leads to severe waste of memory resources. To address this issue, SMConf, an automated memory capacity configuration solution for in-memory computing workloads in Spark, is proposed. SMConf is designed based on the observation that, although there is no one-size-fit-all proper configuration, a one-size-fit-bunch configuration can be found for in-memory computing workloads. SMConf classifies typical Spark workloads into categories based on metrics across the layers of the Spark system stack. For each workload category, an individual memory requirement model is learned from the workload's input data size and the strongly correlated configuration parameters. For an ad-hoc workload, SMConf uses small-sized input data to match its memory requirement signature to one of the workload categories and then determines its proper memory capacity configuration with the corresponding memory requirement model. Experimental results demonstrate that, compared to the conservative default configuration, SMConf can reduce the memory provisioned to Spark workloads by up to 69% with only slight performance degradation, and reduce the average turnaround time of Spark workloads by up to 55% in multi-tenant environments.
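
As a rough illustration of the one-size-fit-bunch workflow described in the abstract, the following Python sketch matches a workload signature to the nearest category and then applies that category's memory model. It is not the authors' implementation: the abstract does not specify the classifier or the model form, so nearest-centroid matching and a single-variable linear fit are assumed here. The category names, signature metrics, and profiling numbers are all hypothetical, and the configuration parameters that SMConf's models also take as input are omitted for brevity.

```python
import math


def fit_linear(samples):
    """Least-squares fit of memory_mb = slope * input_gb + intercept."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def nearest_category(signature, centroids):
    """Return the category whose centroid is closest (Euclidean) to the signature."""
    return min(centroids, key=lambda name: math.dist(signature, centroids[name]))


# Hypothetical category centroids over two made-up signature metrics,
# e.g. (cached-data ratio, shuffle ratio) measured on a small-input run.
CENTROIDS = {
    "cache_heavy": (0.8, 0.2),
    "shuffle_heavy": (0.3, 0.7),
}

# Hypothetical profiling data per category: (input size in GB, memory needed in MB).
PROFILES = {
    "cache_heavy": [(1, 1500), (2, 2500), (4, 4500)],
    "shuffle_heavy": [(1, 800), (2, 1100), (4, 1700)],
}

# One memory requirement model per category ("one size fits a bunch").
MODELS = {name: fit_linear(samples) for name, samples in PROFILES.items()}


def recommend_memory_mb(signature, input_gb):
    """Match an ad-hoc workload to a category, then apply that category's model."""
    category = nearest_category(signature, CENTROIDS)
    slope, intercept = MODELS[category]
    return category, slope * input_gb + intercept


if __name__ == "__main__":
    # Signature taken from a hypothetical small-input trial run; full input is 8 GB.
    print(recommend_memory_mb((0.75, 0.25), 8))
```

The point of the sketch is the division of labour: the expensive modelling is done once per category, while an ad-hoc workload only needs a cheap small-input run to obtain its signature.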

Cite This Article

Y. Liang, S. Zeng, X. Xu, S. Chang and X. Su, "SMConf: One-size-fit-bunch, automated memory capacity configuration for in-memory data analytic platform," Computers, Materials & Continua, vol. 66, no. 2, pp. 1697–1717, 2021. https://doi.org/10.32604/cmc.2020.012513



This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.