Intrusion detection systems play an important role in defending networks from security breaches. End-to-end machine learning-based intrusion detection systems are being used to achieve high detection accuracy. However, in the case of adversarial attacks, which cause misclassification by introducing imperceptible perturbations to input samples, the performance of machine learning-based intrusion detection systems is greatly affected. Although such problems have been widely discussed in the image processing domain, very few studies have investigated network intrusion detection systems and proposed a corresponding defence. In this paper, we attempt to fill this gap by applying adversarial attacks to standard intrusion detection datasets and then using the adversarial samples to train various machine learning algorithms (adversarial training) to test their defence performance. This is achieved by first creating adversarial samples based on the Jacobian-based Saliency Map Attack (JSMA) and the Fast Gradient Sign Method (FGSM) using the NSLKDD, UNSW-NB15 and CICIDS17 datasets. The study then trains and tests against JSMA- and FGSM-based adversarial examples in seen (where the model has been trained on adversarial samples) and unseen (where the model is unaware of adversarial packets) attacks. The experiments include multiple machine learning classifiers to evaluate their performance against adversarial attacks. The performance parameters include Accuracy, F1-Score and Area Under the receiver operating characteristic Curve (AUC) Score.
Machine learning models are currently being deployed in many domains [
Adversarial attacks can be classified into two types: white-box attacks and black-box attacks. In a white-box attack, the adversary has knowledge of the trained model, the training data, the network architecture, hyperparameters, etc. In a black-box attack, by contrast, the adversary has no access to the training data or the trained model; the adversary acts as a normal user and only knows the output of the model (label or confidence score).
Security concerns in enterprise networks remain a major worry as cyber threats increase day by day [
Researchers have used Machine Learning (ML) in anomaly-based IDS with the hope of improving intrusion detection. The limitations of ML models concerning the security of the model itself have been explored in the literature. Researchers have focused on the image processing domain and investigated it thoroughly [
In this paper, the focus is on adversarial defence. Multiple datasets are used for the generation of adversarial attacks. The models trained on the different datasets are then compared for performance. Models are also trained on adversarial attacks, and their performance is then analyzed. The paper is organized as follows: Section 2 discusses the generation of adversarial attacks. Related literature is reviewed in Section 3. The experimental setup is discussed in Section 4, and results are discussed in Section 5. We conclude in Section 6.
Multi-layer Perceptron (MLP) [
In the MLP network, each perceptron receives inputs from the previous layer, multiplies each input by its corresponding weight and sums the weighted inputs together with a bias term.
The result of this computation is then passed to an activation function.
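The original equations are not reproduced in this extract; for reference, the standard perceptron computation (notation assumed here) is

$$ z = \sum_{i} w_i x_i + b, \qquad a = \phi(z), $$

where $x_i$ are the inputs, $w_i$ the corresponding weights, $b$ the bias and $\phi(\cdot)$ the activation function (e.g., sigmoid or ReLU).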
Jacobian-based Saliency Map Attack (JSMA) was proposed in 2016 [
For the white-box attack category, JSMA is more suitable for an adversary [
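As background, the saliency map on which JSMA is built (the standard formulation from the original JSMA paper; notation assumed here) scores input feature $i$ with respect to a target class $t$ as

$$ S(x, t)[i] = \begin{cases} 0, & \text{if } \dfrac{\partial F_t(x)}{\partial x_i} < 0 \ \text{or} \ \sum_{j \neq t} \dfrac{\partial F_j(x)}{\partial x_i} > 0,\\[2ex] \dfrac{\partial F_t(x)}{\partial x_i} \, \Big| \sum_{j \neq t} \dfrac{\partial F_j(x)}{\partial x_i} \Big|, & \text{otherwise,} \end{cases} $$

where $F_j(x)$ is the model output for class $j$. The features with the highest saliency are perturbed first, which is why JSMA typically modifies only a small number of features.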
Fast Gradient Sign Attack (FGSM) was first proposed in 2014 [
The FGSM based adversarial attack is formulated as given in
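The referenced equation is not reproduced in this extract; for reference, the standard FGSM formulation (notation assumed here) is

$$ x_{\text{adv}} = x + \epsilon \cdot \operatorname{sign}\big(\nabla_x J(\theta, x, y)\big), $$

where $x$ is the input sample, $y$ its true label, $J(\theta, x, y)$ the loss of the model with parameters $\theta$, and $\epsilon$ controls the perturbation magnitude.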
The FGSM attack was initially evaluated on image-related datasets like ImageNet [
The following section details related work on adversarial machine learning training and attacks.
The authors in [
The authors in [
In [
The authors in [
In [
The authors in [
In [
The authors in [
In [
The authors in [
In [
The authors in [
In [
A detailed comparison of adversarial attacks on three benchmark datasets, i.e., NSLKDD, UNSW-NB15 and CICIDS17, is presented. For defence against adversarial attacks, adversarial training is used by including adversarial datasets generated through FGSM and JSMA in the training process. Various machine learning algorithms have been tested against adversarial attacks in seen (where the model is aware of adversarial samples) and unseen (where the model is unaware of adversarial samples) attacks.
Paper | Dataset | Classifier | Attack method | Defence | Attack type | Performance parameter | Limitation |
---|---|---|---|---|---|---|---|
2017 [ | NSLKDD | DT, SVM, RF, Voting | FGSM, JSMA | None | White box | Accuracy, F1-Score and AUC normal | Single dataset, results on JSMA only, No defence |
2018 [ | NSLKDD | MLP | JSMA, FGSM, Deepfool, C&W | None | White box | Confusion matrix, Accuracy, F1-Score, False alarm | Single dataset, No defence, One classifier |
2018 [ | NSLKDD | SVM, NB, MLP, LR, DT, RF, K-NN | GAN | None | Black box | Detection Rate, Evasion increase rate | Single dataset, No defence |
2019 [ | KDDCup 99 | CNN | DoS-WGAN | None | Black box | Accuracy | Single dataset, No defence |
2019 [ | NSLKDD | Naïve Bayes, RF, SVM, Proposed | C&W, ZOO, GAN | None | Black box | Accuracy, Precision, Recall, False alarm, F1-score | Single dataset, No defence |
2018 [ | NSLKDD | Neural Network | FGSM | None | White box | Confusion matrix, Accuracy, Precision | Single dataset, No defence, One classifier |
2019 [ | NSLKDD | DNN, SVM, RF, LR | PGD, MI-FGSM, L-BFGS, SPSA | None | White box | Accuracy, Precision Rate, Recall Rate, F1-Score, Success Rate | Single dataset, No defence |
2020 [ | NSLKDD, UNSW-NB15 | SVM, DT, NB, K-NN, RF, MLP, GB, LR, LDA, QDA, BAG | PSO, GA, GAN | None | White box | Evasion Rate | No defence |
2020 [ | CICIDS17 | ANN, RF, ADABoost, SVM | FGSM, BIM, C&W, PGD | Adversarial training | White box | Accuracy, Precision, Recall, F1-Score | Single dataset, Only seen attacks, Adversarial labelled records in adversarial training |
2019 [ | Botnet [ | RF, MLP and K-NN | Manually crafted | Adversarial training, Feature removal | White box | Accuracy, Precision, Recall, F1-Score | Single dataset |
2019 [ | NSLKDD, CICIDS17 | DT, RF, NB, SVM, NN, DA | FGSM, JSMA, Deepfool, C&W | Adversarial training | White box | AUC | Only AUC was observed. Unknown attack in AT |
2019 [ | KDDCUP 99 | DNN, RF, LR, NB, DT, K-NN, SVM, GB | GAN | Adversarial training | Black box | Accuracy, Precision, Recall and F1-score | Single dataset |
2021 [ | CSE-CIC-IDS2018 | Neural Network | MAT | Adversarial training (MAT and MGAN) | Black box | Accuracy, Precision, Recall and F1-score | Adversarial examples generated and tested on Neural network |
2017 [ | DERBIN [ | Neural Network | Manually crafted | Adversarial training, Defence distillation | White box | False negative rates, misclassification rate, Average | Used static features. Single dataset and neural network |
In this study, the NSLKDD, UNSW-NB15 and CICIDS17 datasets are utilized. For NSLKDD, there are 39 types of attacks and one normal class; all the attacks have been converted into one of four classes ['dos', 'r2l', 'probe' and 'u2r']. For UNSW-NB15, there are 9 attack types and one normal class. For CICIDS17, there are 14 attack types and one normal class. All the datasets are evaluated as multi-class classification problems. The categorical features were one-hot encoded, with 1 for the matching category and 0 for all others. StandardScaler is used to rescale the data so that each feature has a mean of 0 and a standard deviation of 1: it standardizes a feature by subtracting the mean and then scaling to unit variance, using the standard deviation as the scaling factor.
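A minimal preprocessing sketch along these lines is shown below. The file name and column names ("protocol_type", "service", "flag", "label") are NSLKDD-style placeholders, not the exact identifiers used in the study.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("nslkdd_train.csv")                       # hypothetical input file
categorical_cols = ["protocol_type", "service", "flag"]    # placeholder categorical features
numeric_cols = [c for c in df.columns if c not in categorical_cols + ["label"]]

# One-hot encode categorical features: 1 for the matching category, 0 for all others.
X_cat = pd.get_dummies(df[categorical_cols])

# Standardize numeric features to zero mean and unit variance.
scaler = StandardScaler()
X_num = pd.DataFrame(scaler.fit_transform(df[numeric_cols]), columns=numeric_cols)

X = pd.concat([X_num, X_cat], axis=1)                      # final feature matrix
y = df["label"]                                            # multi-class target
```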
The Sklearn library is used for classification and the Cleverhans library [
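The attacks themselves were generated with Cleverhans; as an illustration of what FGSM does at the framework level (a minimal sketch, not the exact Cleverhans call used in the paper), assuming a trained Keras classifier `model` that outputs class probabilities and integer-encoded labels `y`:

```python
import tensorflow as tf

def fgsm_examples(model, x, y, eps=0.1):
    """One-step FGSM: shift every feature by eps in the direction that increases the loss."""
    x = tf.convert_to_tensor(x, dtype=tf.float32)
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
    with tf.GradientTape() as tape:
        tape.watch(x)
        loss = loss_fn(y, model(x))
    grad = tape.gradient(loss, x)
    return (x + eps * tf.sign(grad)).numpy()

# Example usage (names hypothetical):
# X_test_fgsm = fgsm_examples(mlp_model, X_test, y_test, eps=0.1)
```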
The adversarial examples were tested against multiple machine learning classifiers like Decision Tree, Random Forest, Support Vector Machine, K-Nearest Neighbors, Logistic Regression and Naïve Bayes.
To evaluate the performance of the machine learning classifiers, Accuracy, F1-Score and AUC Score are used.
AUC is the area under the Receiver Operating Characteristic (ROC) curve, which is drawn using the false positive rate (FPR) and true positive rate (TPR) metrics.
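A sketch of how these metrics can be computed with scikit-learn, assuming a fitted classifier `clf` that exposes predict_proba and illustrative test arrays `X_test` / `y_test`:

```python
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

y_pred = clf.predict(X_test)
y_proba = clf.predict_proba(X_test)          # shape: (n_samples, n_classes)

acc = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred, average="weighted")          # multi-class F1
auc = roc_auc_score(y_test, y_proba, multi_class="ovr")    # one-vs-rest ROC AUC
```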
The experiments are divided into the following five types:
(i) The baseline performance of each classifier.
(ii) Classifiers tested against JSMA and FGSM based adversarial attacks (without adversarial training).
(iii) Classifiers trained with adversarial samples and tested with the original dataset.
(iv) Performance of classifiers tested against JSMA and FGSM after adversarial training with JSMA.
(v) Performance of classifiers tested against JSMA and FGSM after adversarial training with FGSM.
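An illustrative sketch of these experiment types for a single classifier follows. `X_test_jsma`, `X_train_jsma` and `X_test_fgsm` denote adversarial versions of the features generated beforehand (e.g., via Cleverhans); all names are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

clf = RandomForestClassifier().fit(X_train, y_train)
baseline = accuracy_score(y_test, clf.predict(X_test))           # (i)   baseline
attacked = accuracy_score(y_test, clf.predict(X_test_jsma))      # (ii-a) attack, no defence

# Adversarial training: augment the training set with JSMA samples (original labels kept).
clf_at = RandomForestClassifier().fit(
    np.vstack([X_train, X_train_jsma]),
    np.concatenate([y_train, y_train]),
)
clean  = accuracy_score(y_test, clf_at.predict(X_test))          # (iii-a) back on clean data
seen   = accuracy_score(y_test, clf_at.predict(X_test_jsma))     # (iv-a)  seen attack
unseen = accuracy_score(y_test, clf_at.predict(X_test_fgsm))     # (iv-b)  unseen attack
```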
For the evaluation of experiments (i) and (ii), the model in
NSLKDD | Classifier | Original dataset (i) | Test on adv. dataset (JSMA) (ii-a) | Adv. training (JSMA) test on original dataset (iii-a) | (JSMA) Adv. training and adv. test dataset (iv-a) | Tested on unseen attack (FGSM) after adv. training (JSMA) (iv-b) | Test on adv. dataset (FGSM) (ii-b) | Adv. training (FGSM) test on original dataset (iii-b) | (FGSM) Adv. training and adv. test dataset (v-a) | Tested on unseen attack (JSMA) after adv. training (FGSM) (v-b) |
---|---|---|---|---|---|---|---|---|---|---|
1 | DT | 0.448 | 0.790 | 0.143 | 0.509 | 0.826 | 0.466 | |||
2 | RF | 0.797 | 0.448 | 0.999 | 0.342 | 0.543 | 0.804 | 0.997 | ||
3 | SVM | 0.766 | 0.697 | 0.833 | 0.155 | 0.196 | 0.836 | 0.912 | 0.536 | |
4 | K-NN | 0.778 | 0.578 | 0.778 | 0.996 | 0.861 | 0.980 | 0.570 | ||
5 | LR | 0.805 | 0.315 | 0.787 | 0.974 | 0.145 | 0.261 | 0.946 | ||
6 | NB | 0.560 | ||||||||
1 | DT | 0.569 | 0.128 | 0.567 | 0.112 | 0.294 | 0.653 | 0.168 | ||
2 | RF | 0.534 | 0.123 | 0.532 | 0.999 | 0.196 | 0.313 | 0.577 | ||
3 | SVM | 0.502 | 0.240 | 0.499 | 0.082 | 0.198 | 0.715 | 0.209 | ||
4 | K-NN | 0.546 | 0.545 | 0.958 | 0.732 | 0.931 | 0.242 | |||
5 | LR | 0.096 | 0.849 | 0.138 | 0.323 | 0.881 | 0.252 | |||
6 | NB | 0.437 | 0.614 | |||||||
1 | DT | 0.742 | 0.503 | 0.605 | 0.531 | |||||
2 | RF | 0.857 | 0.999 | 0.584 | 0.770 | 0.893 | 0.609 | |||
3 | SVM | 0.903 | 0.839 | 0.993 | 0.461 | 0.580 | 0.990 | 0.994 | 0.538 | |
4 | K-NN | 0.797 | 0.563 | 0.792 | 0.999 | 0.866 | 0.563 | |||
5 | LR | 0.490 | 0.862 | 0.995 | 0.555 | 0.659 | 0.992 | 0.994 | ||
6 | NB | 0.496 | 0.777 | 0.499 |
UNSW-NB15 | Classifier | Original dataset (i) | Test on adv. dataset (JSMA) (ii-a) | Adv. training (JSMA) test on original dataset (iii-a) | (JSMA) Adv. training and adv. test dataset (iv-a) | Tested on unseen attack (FGSM) after adv. training (JSMA) (iv-b) | Test on adv. dataset (FGSM) (ii-b) | Adv. training (FGSM) test on original dataset (iii-b) | (FGSM) Adv. training and adv. test dataset (v-a) | Tested on unseen attack (JSMA) after adv. training (FGSM) (v-b) |
---|---|---|---|---|---|---|---|---|---|---|
1 | DT | 0.730 | 0.730 | 0.388 | 0.739 | 0.449 | ||||
2 | RF | 0.135 | 0.933 | 0.336 | 0.335 | 0.933 | 0.449 | |||
3 | SVM | 0.620 | 0.507 | 0.645 | 0.830 | 0.384 | 0.277 | 0.635 | 0.707 | 0.526 |
4 | K-NN | 0.663 | 0.083 | 0.662 | 0.883 | 0.665 | 0.828 | 0.068 | ||
5 | LR | 0.634 | 0.636 | 0.835 | 0.455 | 0.380 | 0.674 | 0.738 | ||
6 | NB | 0.449 | 0.449 | |||||||
1 | DT | 0.471 | 0.156 | 0.108 | 0.063 | |||||
2 | RF | 0.023 | 0.475 | 0.753 | 0.138 | 0.134 | 0.451 | 0.751 | 0.060 | |
3 | SVM | 0.290 | 0.115 | 0.297 | 0.390 | 0.134 | 0.120 | 0.298 | 0.326 | |
4 | K-NN | 0.376 | 0.051 | 0.374 | 0.494 | 0.379 | 0.456 | |||
5 | LR | 0.300 | 0.299 | 0.399 | 0.198 | 0.171 | 0.319 | 0.341 | 0.119 | |
6 | NB | 0.062 | ||||||||
1 | DT | 0.817 | 0.500 | 0.825 | 0.552 | 0.528 | 0.809 | 0.501 | ||
2 | RF | 0.991 | 0.584 | 0.545 | 0.913 | 0.991 | ||||
3 | SVM | 0.895 | 0.488 | 0.888 | 0.925 | 0.614 | 0.581 | 0.914 | 0.911 | 0.531 |
4 | K-NN | 0.799 | 0.476 | 0.801 | 0.966 | 0.686 | 0.962 | 0.458 | ||
5 | LR | 0.882 | 0.883 | 0.934 | 0.676 | 0.922 | 0.515 | |||
6 | NB | 0.500 | 0.817 |
CICIDS17 | Classifier | Original dataset (i) | Test on adv. dataset (JSMA) (ii-a) | Adv. training (JSMA) test on original dataset (iii-a) | (JSMA) Adv. training and adv. test dataset (iv-a) | Tested on unseen attack (FGSM) after adv. training (JSMA) (iv-b) | Test on adv. dataset (FGSM) (ii-b) | Adv. training (FGSM) test on original dataset (iii-b) | (FGSM) Adv. training and adv. test dataset (v-a) | Tested on unseen attack (JSMA) after adv. training (FGSM) (v-b) |
---|---|---|---|---|---|---|---|---|---|---|
1 | DT | 0.843 | 0.710 | 0.584 | 0.846 | |||||
2 | RF | 0.829 | 0.817 | |||||||
3 | SVM | 0.803 | 0.803 | 0.804 | 0.803 | 0.751 | 0.705 | 0.803 | 0.803 | 0.803 |
4 | K-NN | 0.993 | 0.844 | 0.994 | 0.857 | 0.994 | 0.993 | 0.844 | ||
5 | LR | 0.967 | 0.831 | 0.955 | 0.841 | 0.960 | 0.975 | 0.836 | ||
6 | NB | 0.802 | 0.803 | |||||||
1 | DT | 0.247 | 0.127 | 0.142 | 0.284 | |||||
2 | RF | 0.834 | 0.886 | 0.425 | 0.103 | 0.086 | 0.845 | 0.973 | ||
3 | SVM | 0.062 | ||||||||
4 | K-NN | 0.759 | 0.230 | 0.759 | 0.327 | 0.767 | 0.768 | 0.230 | ||
5 | LR | 0.356 | 0.172 | 0.388 | 0.193 | 0.071 | 0.063 | 0.420 | 0.544 | 0.179 |
6 | NB | 0.457 | 0.149 | 0.140 | 0.077 | 0.059 | 0.257 | 0.395 | 0.105 | |
1 | DT | 0.564 | 0.949 | 0.534 | 0.530 | 0.575 | ||||
2 | RF | 0.733 | 0.653 | 0.595 | 0.617 | |||||
3 | SVM | 0.966 | 0.587 | 0.969 | 0.627 | 0.672 | 0.989 | 0.993 | 0.619 | |
4 | K-NN | 0.971 | 0.568 | 0.971 | 0.644 | 0.695 | 0.975 | |||
5 | LR | 0.952 | 0.633 | 0.679 | 0.661 | 0.973 | 0.974 | |||
6 | NB | 0.971 | 0.942 | 0.669 | 0.942 | 0.645 |
The baseline performance obtained in (i) drops after the classifiers are tested against JSMA and FGSM based adversarial attacks in (ii-a) and (ii-b) (without adversarial training) on the original datasets. The classifiers trained with adversarial samples and tested with the original dataset in (iii-a) and (iii-b) show a similar trend to that observed in (i). The performance of classifiers after adversarial training in (iv-a) and (v-a) shows improved results for the seen attack, whereas in (iv-b) and (v-b) a drop in accuracy can be observed for the unseen attack even after adversarial training.
Referring to experiment (ii), the impact of an adversarial attack with either JSMA or FGSM, without any adversarial training, is an average drop in accuracy of around 25% to 30% across all classifiers and datasets. On the other hand, K-NN shows the best performance in Accuracy, F1-Score and AUC among all the classifiers when tested against FGSM. Similarly, Random Forest performs better on the CICIDS17 dataset when tested against JSMA.
In experiment types (iv-a) and (iv-b), models are trained on JSMA and tested against the JSMA and FGSM attacks, respectively. The results of this type of experiment are better than those of type (ii) for all the classifiers, as these classifiers are trained on the adversarial examples on which they are tested. Experiment (iv-a) shows better results for the Decision Tree for all the datasets. Similarly, for experiment (iv-b), K-NN performs better against FGSM than the other classifiers trained on JSMA-based adversarial examples for all datasets. In experiment (v-b), where classifiers are trained on FGSM and tested against JSMA-based adversarial examples, Logistic Regression performs well in accuracy for the NSLKDD and UNSW-NB15 datasets. For CICIDS17, Random Forest is better in accuracy among all classifiers. Considering accuracy across all the datasets, Naïve Bayes performs worst among all classifiers, with the exception of a few results.
Analyzing the AUC Score for experiment types (iv-a) and (v-a), the observed results lie above 90%, except for the CICIDS17 dataset in type (iv-a). Another interesting aspect of experiment type (iv-b) for NSLKDD is that the obtained accuracy results are not up to the mark; a similar pattern is reflected in the F1 and AUC Scores, which are also low. The results of (v-a), in contrast, are good for each classifier and are also supported by the F1 and AUC Scores. These patterns validate our work according to the performance parameters included in our study.
We have tested FGSM and JSMA based adversarial examples against multiple classifiers. The experiments have been conducted in five different scenarios. Initially, classifiers were tested on clean data to provide a baseline for comparison with the other experiments. The classifiers were then tested against adversarial examples with and without adversarial training. The behaviour of the classifiers on multiple datasets has been observed using the selected performance parameters. The performance of NB is observed to be the worst overall, whereas K-NN performs better on the NSLKDD and UNSW-NB15 datasets. For CICIDS17, the Random Forest classifier gives better results.
Future work includes adversarial training of the classifiers with multiple adversarial datasets to increase their robustness. An ensemble of classifiers can also be created to increase overall performance against adversarial attacks.