    A Hybrid Deep Learning-Based Unsupervised Anomaly Detection in High Dimensional Data

    Amgad Muneer1,2,*, Shakirah Mohd Taib1,2, Suliman Mohamed Fati3, Abdullateef O. Balogun1, Izzatdin Abdul Aziz1,2

    CMC-Computers, Materials & Continua, Vol.70, No.3, pp. 5363-5381, 2022, DOI:10.32604/cmc.2022.021113

    Abstract Anomaly detection in high-dimensional data is a critical research issue with serious implications for real-world problems. Many issues in this field remain unsolved, and several modern anomaly detection methods struggle to maintain adequate accuracy due to the highly descriptive nature of big data. This phenomenon, referred to as the “curse of dimensionality”, affects traditional techniques in terms of both accuracy and performance. Thus, this research proposes a hybrid model based on a Deep Autoencoder Neural Network (DANN) with five layers to reduce the difference between the input and output. The proposed…

  • Open Access


    A Tradeoff Between Accuracy and Speed for K-Means Seed Determination

    Farzaneh Khorasani1, Morteza Mohammadi Zanjireh1,*, Mahdi Bahaghighat1, Qin Xin2

    Computer Systems Science and Engineering, Vol.40, No.3, pp. 1085-1098, 2022, DOI:10.32604/csse.2022.016003

    Abstract With the sharp increase in information volume, analyzing and retrieving this vast amount of data is more essential than ever. One of the main techniques that is beneficial in this regard is clustering. Clustering aims to classify objects so that all objects within a cluster have similar features while objects in different clusters are as distinct as possible. One of the most widely used clustering algorithms, with good and proven performance in different applications, is the k-means algorithm. The main problem of the k-means algorithm is its performance…

  • Open Access


    Cluster Analysis for IR and NIR Spectroscopy: Current Practices to Future Perspectives

    Simon Crase1,2, Benjamin Hall2, Suresh N. Thennadil3,*

    CMC-Computers, Materials & Continua, Vol.69, No.2, pp. 1945-1965, 2021, DOI:10.32604/cmc.2021.018517

    Abstract Supervised machine learning techniques have become well established in the study of spectroscopy data. However, the unsupervised learning technique of cluster analysis has not reached the same level of maturity in chemometric analysis. This paper surveys recent studies that apply cluster analysis to NIR and IR spectroscopy data. In addition, we summarize the current practices in cluster analysis of spectroscopy and contrast these with cluster analysis literature from the machine learning and pattern recognition domain. This includes practices in data pre-processing, feature extraction, clustering distance metrics, clustering algorithms and validation techniques. Special consideration is given to the…

  • Open Access


    Cloud Based Monitoring and Diagnosis of Gas Turbine Generator Based on Unsupervised Learning

    Xian Ma1, Tingyan Lv2,*, Yingqiang Jin2, Rongmin Chen2, Dengxian Dong2, Yingtao Jia2

    Energy Engineering, Vol.118, No.3, pp. 691-705, 2021, DOI:10.32604/EE.2021.012701

    Abstract The large number of gas turbines in large power companies is difficult to manage. Much of the data from the generating units is not mined or utilized for fault analysis. This study focuses on F-class (9F.05) gas turbine generators and uses unsupervised learning and cloud computing technologies to analyse the faults of the gas turbines. Remote monitoring of the operational status is conducted. The study proposes a cloud computing service architecture for large gas turbine objects, which uses unsupervised learning models to monitor the operational state of the gas turbine. Faults such as…

  • Open Access


    A Novel Cardholder Behavior Model for Detecting Credit Card Fraud

    Yiğit Kültür, Mehmet Ufuk Çağlayan

    Intelligent Automation & Soft Computing, Vol.24, No.4, pp. 807-817, 2018, DOI:10.1080/10798587.2017.1342415

    Abstract Because credit card fraud costs the banking sector billions of dollars every year, decreasing the losses incurred from credit card fraud is an important driver for the sector and end-users. In this paper, we focus on analyzing cardholder spending behavior and propose a novel cardholder behavior model for detecting credit card fraud. The model is called the Cardholder Behavior Model (CBM). Two focus points are proposed and evaluated for CBMs. The first focus point is building the behavior model using single-card transactions versus multi-card transactions. As the second focus point, we introduce holiday seasons as…

  • Open Access


    Analysis of Semi-Supervised Text Clustering Algorithm on Marine Data

    Yu Jiang1,2, Dengwen Yu1, Mingzhao Zhao1,2, Hongtao Bai1,2, Chong Wang1,2,3, Lili He1,2,*

    CMC-Computers, Materials & Continua, Vol.64, No.1, pp. 207-216, 2020, DOI:10.32604/cmc.2020.09861

    Abstract Semi-supervised clustering improves learning performance by using a small number of labeled samples to assist the learning of unlabeled samples. This paper implements and compares unsupervised and semi-supervised clustering analysis of BOA-Argo ocean text data. Unsupervised K-Means and Affinity Propagation (AP) are two classical clustering algorithms. The Election-AP algorithm is proposed to handle the final cluster number in AP clustering, as it has proved difficult to keep within a suitable range. The semi-supervised approach samples thermocline data in the BOA-Argo dataset according to the standard thermocline definition and uses this data for semi-supervised…

  • Open Access


    Unsupervised Anomaly Detection via DBSCAN for KPIs Jitters in Network Managements

    Haiwen Chen1, Guang Yu1, Fang Liu2, Zhiping Cai1,*, Anfeng Liu3, Shuhui Chen1, Hongbin Huang1, Chak Fong Cheang4

    CMC-Computers, Materials & Continua, Vol.62, No.2, pp. 917-927, 2020, DOI:10.32604/cmc.2020.05981

    Abstract Many Internet companies generate a huge number of KPIs (e.g., server CPU usage, network usage, business monitoring data) every day. Closely monitoring various KPIs, and then quickly and accurately detecting anomalies in such large volumes of data for troubleshooting and business recovery, is a great challenge, especially for unlabeled data. Anomalies in the generated KPIs can be detected by supervised learning with labeled data, but the current problem is that most KPIs are unlabeled. Labeling anomalies is time-consuming and laborious work for company engineers. Build an unsupervised model to detect unlabeled…

