A Parallel Approach to Discords Discovery in Massive Time Series Data

Mikhail Zymbler; Alexander Grents; Yana Kraeva; Sachin Kumar

doi:10.32604/cmc.2020.014232

ARTICLE

Mikhail Zymbler^*, Alexander Grents, Yana Kraeva, Sachin Kumar

Department of Computer Science, South Ural State University, Chelyabinsk, 454080, Russia

* Corresponding Author: Mikhail Zymbler.

Computers, Materials & Continua 2021, 66(2), 1867-1878. https://doi.org/10.32604/cmc.2020.014232

Received 08 September 2020; Accepted 30 September 2020; Issue published 26 November 2020

Abstract

A discord is a refinement of the concept of an anomalous subsequence of a time series. As one of the topical issues in time series mining, discords discovery is applied in a wide range of real-world areas (medicine, astronomy, economics, climate modeling, predictive maintenance, energy consumption, etc.). In this article, we propose a novel parallel algorithm for discords discovery on a high-performance cluster with nodes based on many-core accelerators, for the case when the time series cannot fit in main memory. We assume that the time series is partitioned across the cluster nodes, and we parallelize the computation both among the cluster nodes and within a single node. Within a cluster node, the algorithm employs a set of matrix data structures to store and index the subsequences of the time series and to provide efficient vectorization of computations on the accelerator. At each node, the algorithm processes its own partition in two phases, namely candidate selection and discord refinement, with each phase requiring one linear scan through the partition. The local discords found are then combined into the global candidate set, which is transmitted to each cluster node. Next, each node refines the global candidate set over its own partition, resulting in the local true discord set. Finally, the global true discord set is constructed as the intersection of the local true discord sets. The experimental evaluation on a real computer cluster with real and synthetic time series shows high scalability of the proposed algorithm.
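The combine-and-intersect scheme described in the abstract can be illustrated with a minimal single-process sketch. This is not the authors' implementation. It assumes a range-based definition (a subsequence of length `m` is a discord if no non-overlapping subsequence lies within Euclidean distance `r` of it), simulates cluster nodes by splitting the subsequence start indices into contiguous blocks, and replaces the two-phase linear scan with a brute-force check. The names `subsequences`, `refine`, and `find_discords` are hypothetical.

```python
import numpy as np

def subsequences(ts, m):
    """Matrix whose row i is the length-m window of ts starting at index i."""
    return np.lib.stride_tricks.sliding_window_view(ts, m)

def refine(cands, cand_idx, part, part_idx, m, r):
    """Return start indices of candidates with no non-overlapping
    neighbor closer than r among the subsequences of one partition."""
    keep = set()
    for c, i in zip(cands, cand_idx):
        d = np.linalg.norm(part - c, axis=1)   # vectorized distances to the partition
        non_self = np.abs(part_idx - i) >= m   # skip trivial (overlapping) matches
        if not np.any(d[non_self] < r):
            keep.add(int(i))
    return keep

def find_discords(ts, m, r, nodes):
    subs = subsequences(np.asarray(ts, dtype=float), m)
    # Assumption: each simulated node owns a contiguous block of start indices.
    parts = np.array_split(np.arange(len(subs)), nodes)
    # Local step: each node finds the discords of its own partition.
    local = [refine(subs[p], p, subs[p], p, m, r) for p in parts]
    # The local discords are combined into the global candidate set.
    cand = np.array(sorted(set().union(*local)), dtype=int)
    # Each node refines the global candidates over its own partition.
    local_true = [refine(subs[cand], cand, subs[p], p, m, r) for p in parts]
    # Global true discords: intersection of the local true discord sets.
    return sorted(set.intersection(*local_true))
```

Under these assumptions, a true discord is always a local discord of its own partition, so it enters the global candidate set and survives every node's refinement. A false candidate is dropped by whichever node holds one of its close neighbors. The result therefore does not depend on the number of simulated nodes.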

Keywords

Time series; discords discovery; computer cluster; many-core accelerator; vectorization

Copyright © 2021 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
