Search Results (16)
  • Open Access

    ARTICLE

    SMConf: One-Size-Fit-Bunch, Automated Memory Capacity Configuration for In-Memory Data Analytic Platform

    Yi Liang1,*, Shaokang Zeng1, Xiaoxian Xu2, Shilu Chang1, Xing Su1

    CMC-Computers, Materials & Continua, Vol.66, No.2, pp. 1697-1717, 2021, DOI:10.32604/cmc.2020.012513

    Abstract Spark is the most popular in-memory processing framework for big data analytics. Memory is the crucial resource for workloads to achieve performance acceleration on Spark. The extant memory capacity configuration approach in Spark is to statically configure the memory capacity for workloads based on the user’s specifications. However, without deep knowledge of a workload’s system-level characteristics, users in practice often conservatively overestimate the memory utilization of their workloads and require the resource manager to grant a larger memory share than they actually need, which leads to severe waste of memory resources. To address the above issue, SMConf, an automated memory…

  • Open Access

    ARTICLE

    Case Study: Spark GPU-Enabled Framework to Control COVID-19 Spread Using Cell-Phone Spatio-Temporal Data

    Hussein Shahata Abdallah1, *, Mohamed H. Khafagy1, Fatma A. Omara2

    CMC-Computers, Materials & Continua, Vol.65, No.2, pp. 1303-1320, 2020, DOI:10.32604/cmc.2020.011313

    Abstract Nowadays, the world is fighting a dangerous form of Coronavirus that represents an emerging pandemic. Since its early appearance in Wuhan, China, many countries have imposed strict regulations, including lockdowns and social distancing measures. Unfortunately, these measures have badly impacted the world economy. Detecting and isolating positive/probable virus-infected cases using a tree tracking mechanism constitutes a backbone for containing and resisting such a fast-spreading disease. To support this effort, this research presents an innovative case study based on big data processing techniques to build a complete tracking system able to identify the central areas of infected/suspected people,…

  • Open Access

    ARTICLE

    News Text Topic Clustering Optimized Method Based on TF-IDF Algorithm on Spark

    Zhuo Zhou1, Jiaohua Qin1,*, Xuyu Xiang1, Yun Tan1, Qiang Liu1, Neal N. Xiong2

    CMC-Computers, Materials & Continua, Vol.62, No.1, pp. 217-231, 2020, DOI:10.32604/cmc.2020.06431

    Abstract Due to the slow processing speed of text topic clustering in a stand-alone architecture in the big data setting, this paper takes news text as the research object and proposes an LDA text topic clustering algorithm based on the Spark big data platform. Since the TF-IDF (term frequency-inverse document frequency) algorithm under Spark is irreversible with respect to word mapping, the mapped word indexes cannot be traced back to the original words. In this paper, an optimized TF-IDF method under Spark is proposed to ensure that the text words can be restored. Firstly, the text feature is extracted by the TF-IDF algorithm combined CountVectorizer…

  • Open Access

    ARTICLE

    A Dynamic Memory Allocation Optimization Mechanism Based on Spark

    Suzhen Wang1, Shanshan Geng1, Zhanfeng Zhang1, Anshan Ye2, Keming Chen2, Zhaosheng Xu2, Huimin Luo2, Gangshan Wu3,*, Lina Xu4, Ning Cao5

    CMC-Computers, Materials & Continua, Vol.61, No.2, pp. 739-757, 2019, DOI:10.32604/cmc.2019.06097

    Abstract Spark is a distributed data processing framework based on memory. Memory allocation is a central question in Spark research. A good memory allocation scheme can effectively improve the efficiency of task execution and the memory resource utilization of Spark. Aiming at the memory allocation problem in Spark 2.x, this paper optimizes the memory allocation strategy by analyzing the Spark memory model, the existing cache replacement algorithms and the memory allocation methods, on the basis of minimizing the storage area and allocating the execution area according to demand. It mainly includes two parts: cache replacement optimization and…

  • Open Access

    ARTICLE

    An Improved Memory Cache Management Study Based on Spark

    Suzhen Wang1, Yanpiao Zhang1, Lu Zhang1, Ning Cao2, *, Chaoyi Pang3

    CMC-Computers, Materials & Continua, Vol.56, No.3, pp. 415-431, 2018, DOI: 10.3970/cmc.2018.03716

    Abstract Spark is a fast unified analysis engine for big data and machine learning, in which memory is a crucial resource. Resilient Distributed Datasets (RDDs) are parallel data structures that allow users to explicitly persist intermediate results in memory or on disk, and each one can be divided into several partitions. During task execution, Spark automatically monitors cache usage on each node. When an RDD needs to be stored in a cache with insufficient space, the system drops old data partitions in a least recently used (LRU) fashion to release more space. However,…

  • Open Access

    ARTICLE

    A Spark Scheduling Strategy for Heterogeneous Cluster

    Xuewen Zhang1, Zhonghao Li1, Gongshen Liu1,*, Jiajun Xu1, Tiankai Xie2, Jan Pan Nees1

    CMC-Computers, Materials & Continua, Vol.55, No.3, pp. 405-417, 2018, DOI: 10.3970/cmc.2018.02527

    Abstract As a mainstream distributed computing system, Spark has been used to solve problems with increasingly complex tasks. However, the native scheduling strategy of Spark assumes that it runs on a homogeneous cluster, which is not so effective when it comes to heterogeneous clusters. The aim of this study is to find a more effective strategy to schedule tasks and add it to the source code of Spark. After investigating Spark scheduling principles and mechanisms, we propose a stratifying algorithm and a node scheduling algorithm to optimize the native scheduling strategy of Spark. In this…
