SMConf: One-Size-Fit-Bunch, Automated Memory Capacity Configuration for In-Memory Data Analytic Platform

Yi Liang; Shaokang Zeng; Xiaoxian Xu; Shilu Chang; Xing Su

doi:10.32604/cmc.2020.012513

Yi Liang¹,*, Shaokang Zeng¹, Xiaoxian Xu², Shilu Chang¹, Xing Su¹

1 Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
2 Department of Informatics, University of Zurich, Zurich, CH-8050, Switzerland

* Corresponding Author: Yi Liang. Email: email

Computers, Materials & Continua 2021, 66(2), 1697-1717. https://doi.org/10.32604/cmc.2020.012513

Received 02 July 2020; Accepted 31 July 2020; Issue published 26 November 2020

Abstract

Spark is the most popular in-memory processing framework for big data analytics. Memory is the crucial resource that allows workloads to achieve performance acceleration on Spark. The existing approach in Spark is to configure memory capacity statically for each workload based on the user’s specification. However, without deep knowledge of a workload’s system-level characteristics, users in practice often conservatively overestimate the memory utilization of their workloads and ask the resource manager to grant a larger memory share than they actually need, which leads to severe waste of memory resources. To address this issue, SMConf, an automated memory capacity configuration solution for in-memory computing workloads in Spark, is proposed. SMConf is designed based on the observation that, although no one-size-fit-all configuration exists, a one-size-fit-bunch configuration can be found for in-memory computing workloads. SMConf classifies typical Spark workloads into categories based on metrics across the layers of the Spark system stack. For each workload category, an individual memory requirement model is learned from the workload’s input data size and the strongly correlated configuration parameters. For an ad-hoc workload, SMConf uses small-sized input data to match its memory requirement signature to one of the workload categories and determines its proper memory capacity configuration with the corresponding memory requirement model. Experimental results demonstrate that, compared to the conservative default configuration, SMConf can reduce the memory provisioned to Spark workloads by up to 69% with only slight performance degradation, and reduce the average turnaround time of Spark workloads by up to 55% in multi-tenant environments.
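The match-then-predict flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the signature metrics, the matching rule, or the form of the memory requirement model, so the nearest-centroid matching, the linear model in input size only, and every name and number below (`Category`, `match_category`, `configure_memory_mb`, the two example categories) are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from math import dist
from typing import Sequence, Tuple


@dataclass
class Category:
    """A hypothetical workload category with its own memory model."""
    name: str
    centroid: Tuple[float, ...]  # assumed signature centroid (made-up metrics)
    base_mb: float               # assumed model intercept, in MB
    mb_per_gb: float             # assumed model slope, MB per GB of input

    def required_memory_mb(self, input_gb: float) -> float:
        # Assumed linear model in input size only; the paper's model also
        # uses strongly correlated configuration parameters.
        return self.base_mb + self.mb_per_gb * input_gb


def match_category(signature: Sequence[float],
                   categories: Sequence[Category]) -> Category:
    # Assumed matching rule: nearest centroid by Euclidean distance.
    return min(categories, key=lambda c: dist(signature, c.centroid))


def configure_memory_mb(signature: Sequence[float], input_gb: float,
                        categories: Sequence[Category]) -> float:
    # Match the signature to a category, then apply that category's model.
    return match_category(signature, categories).required_memory_mb(input_gb)


# Illustrative categories only; the values are invented, not measured.
CATEGORIES = [
    Category("cache-heavy", (0.9, 0.2), 512.0, 900.0),
    Category("shuffle-heavy", (0.3, 0.8), 1024.0, 400.0),
]

if __name__ == "__main__":
    # Signature taken from a hypothetical profiling run on small input data.
    print(configure_memory_mb((0.8, 0.3), 10.0, CATEGORIES))
```

In this sketch the signature would come from a short profiling run on a small input, and the selected category's model is then evaluated at the full input size, which mirrors the one-size-fit-bunch idea of sharing one model per category rather than one per workload.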

Keywords

Spark; memory capacity; automated configuration

Cite This Article

APA Style

Liang, Y., Zeng, S., Xu, X., Chang, S., Su, X. (2021). SMConf: One-Size-Fit-Bunch, Automated Memory Capacity Configuration for In-Memory Data Analytic Platform. Computers, Materials & Continua, 66(2), 1697–1717. https://doi.org/10.32604/cmc.2020.012513

Vancouver Style

Liang Y, Zeng S, Xu X, Chang S, Su X. SMConf: One-Size-Fit-Bunch, Automated Memory Capacity Configuration for In-Memory Data Analytic Platform. Comput Mater Contin. 2021;66(2):1697–1717. https://doi.org/10.32604/cmc.2020.012513

IEEE Style

Y. Liang, S. Zeng, X. Xu, S. Chang, and X. Su, “SMConf: One-Size-Fit-Bunch, Automated Memory Capacity Configuration for In-Memory Data Analytic Platform,” Comput. Mater. Contin., vol. 66, no. 2, pp. 1697–1717, 2021. https://doi.org/10.32604/cmc.2020.012513

Copyright © 2021 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
