Insider Threat Detection Based on NLP Word Embedding and Machine Learning

doi:10.32604/iasc.2022.021430

Mohd Anul Haq¹, Mohd Abdul Rahim Khan¹,*, Mohammed Alshehri²

1 Department of Computer Science, College of Computer and Information Sciences, Majmaah University, Al-Majmaah 11952, Saudi Arabia
2 Department of Information Technology, College of Computer and Information Sciences, Majmaah University, Al-Majmaah 11952, Saudi Arabia

* Corresponding Author: Mohd Abdul Rahim Khan. Email: email

(This article belongs to the Special Issue: Humans and Cyber Security Behaviour)

Intelligent Automation & Soft Computing 2022, 33(1), 619-635. https://doi.org/10.32604/iasc.2022.021430

Received 02 July 2021; Accepted 09 November 2021; Issue published 05 January 2022

Abstract

The growth of edge computing, the Internet of Things (IoT), and cloud computing has been accompanied by new security issues in the information security infrastructure. Recent studies suggest that insider attacks cost more than external threats, making insider threat detection an essential aspect of information security for organizations. Efficient insider threat detection requires state-of-the-art Artificial Intelligence models and utilities. Although significant efforts have been made to detect insider threats for more than a decade, many limitations remain, including a lack of real data, low accuracy, and a relatively high false alarm rate, which are major concerns needing further investigation. This paper attempts to fill these gaps. First, two hybrid deep learning (DL) models were developed by integrating LSTM (Long Short-Term Memory) with word embeddings: Google's Word2vec-LSTM and GloVe (Global Vectors for Word Representation)-LSTM. Second, the performance of the two hybrid DL models was compared with state-of-the-art machine learning (ML) models such as XGBoost, AdaBoost, RF (Random Forest), KNN (K-Nearest Neighbor), and LR (Logistic Regression). Third, the present investigation addresses the gaps of using a real dataset, achieving high accuracy, and obtaining a significantly lower false alarm rate. It was found that the ML-based models outperformed the DL-based ones. The results were evaluated against earlier studies and deemed efficient at detecting insider threats using the real dataset.
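The two metrics the abstract emphasizes, accuracy and false alarm rate, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `detection_metrics` and the label convention (1 = insider threat, 0 = benign) are assumptions made for illustration, and the false alarm rate is taken in its usual sense as the false positive rate, FP / (FP + TN).

```python
from typing import Dict, Sequence


def detection_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy and false alarm rate for binary labels.

    Assumed convention (hypothetical, not from the paper):
    1 = insider threat, 0 = benign activity.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four confusion-matrix cells.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    # False alarm rate: share of benign events wrongly flagged as threats.
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return {"accuracy": accuracy, "false_alarm_rate": false_alarm_rate}
```

Because insider threat datasets are typically dominated by benign events, a model can score high accuracy while still raising many false alarms, which is why the two metrics are reported separately.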

Keywords

Natural language processing; insider threats; LSTM; Word2vec; global vectors for word representation

Cite This Article

APA Style

Haq, M.A., Khan, M.A.R., Alshehri, M. (2022). Insider threat detection based on NLP word embedding and machine learning. Intelligent Automation & Soft Computing, 33(1), 619-635. https://doi.org/10.32604/iasc.2022.021430

Vancouver Style

Haq MA, Khan MAR, Alshehri M. Insider threat detection based on NLP word embedding and machine learning. Intell Automat Soft Comput. 2022;33(1):619-635. https://doi.org/10.32604/iasc.2022.021430

IEEE Style

M.A. Haq, M.A.R. Khan, and M. Alshehri, "Insider Threat Detection Based on NLP Word Embedding and Machine Learning," Intell. Automat. Soft Comput., vol. 33, no. 1, pp. 619-635, 2022. https://doi.org/10.32604/iasc.2022.021430

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
