Machine Learning Approach for Improvement in Kitsune NID

Network intrusion detection is a pressing need for every communication network, and many network intrusion detection systems (NIDS) have been proposed in the literature to address it. Among them, Kitsune, a plug-and-play NIDS proposed in 2018, has been widely appreciated. In this study, the Kitsune datasets were divided into a 70% training set and a 30% testing set for the machine learning algorithms. Our previous study reported that variants of Tree algorithms such as Simple Tree, Medium Tree, Coarse Tree, RUS Boosted, and Bagged Tree showed similar effectiveness but with slight variation in efficiency. To extend this investigation, we explored the performance of these Tree algorithm variants on the other datasets provided by Kitsune, such as Active Wiretap, ARP MitM, Fuzzing, OS Scan, SSDP Flood, SYN DoS, SSL renegotiation, Mirai, and Video Injection. This investigation ascertains the likely performance of these Tree algorithm variants. After a deep and rigorous analysis, the Fine Tree is highly recommended for the improved version of the Kitsune tool.

This study first presents, as a literature survey, a comprehensive view of the existing work on applying machine learning and deep learning algorithms to Kitsune. Second, we bridge the research gap by presenting, as simulation results, a parametric empirical comparison of machine learning algorithms to find the best candidate for Kitsune. Finally, we provide a rationale for selecting the best machine learning algorithm for the improved version of Kitsune.

Literature Review
This section presents a comprehensive literature review of recent work on the Kitsune dataset, specifically work applying machine learning and deep learning algorithms to it. In late 2018, Mirsky et al. contributed a robust plug-and-play NIDS named Kitsune. Kitsune can detect a large variety of network attacks without substantial supervision. Its principal algorithm is KitNET, which uses an ensemble of artificial neural networks. This arrangement helps to detect traces of abnormal traffic patterns within bursts of legitimate network traffic. The authors also presented a benchmark dataset for NIDS, comprising Active Wiretap, ARP MitM, Fuzzing, OS Scan, SSDP Flood, SYN DoS, SSL renegotiation, Mirai, and Video Injection. The dataset is massive in volume and rich in content [20]. Fig. 1 illustrates the system overview of Kitsune. First, the network packet is fetched by the packet capturer. After capture, the packet is parsed into units. These units are fed to feature extraction and mapping. Finally, the packet is labeled as Benign or Malicious. The dataset generated from this model also facilitates the training of machine learning and deep learning models.
Peng et al. [25] discussed in their study that classical signature-based network intrusion detection systems are deficient in handling new, disjoint network threats. Specifically, threats with unknown signatures are significantly less susceptible to detection and tracking. This opens the door to employing machine learning for adaptive learning. The scenario looks beneficial. However, it is also well established in the literature that machine learning is itself prone to adversarial attacks. Hashemi et al., in their study, evaluated anomaly-based NIDS on test inputs. Specifically, they trained a neural network model to handle adversarial information. They chose Kitsune as the benchmark NIDS to train the network and test it against adversarial attacks. The scope of this work is limited to a single machine learning algorithm; other machine learning algorithms were not investigated in this study [25].
In 2019, Qiu et al. [26] proposed a novel adversarial network attack to see whether deep learning-based IDS are equally prone to adversarial attacks. They reported that the accuracy of Kitsune is compromised by this adversarial attack. In this work, only two types of attack were targeted, i.e., the Mirai botnet attack and video injection. The scope of this research could be extended by investigating the other datasets as well. Moreover, the proposed adversarial attack could be validated against other well-known variants of deep learning models [26].
In the same year, Hashemi et al. [27] confirmed the vulnerability of neural network-based anomaly NIDS to adversarial attacks. In addition, the authors highlighted that the focus of the work had been on an older version of the dataset that did not truly mimic the variety of network attacks. The research proposed Reconstruction from Partial Observation (RePO) as a novel approach to building a NIDS. Functionally, it uses a denoising autoencoder to detect different types of network attacks in a low false alert setting. It also defends against adversarial example attacks. They chose the Kitsune dataset to validate their approach. Later, Zhong et al. [28] also chose the Kitsune dataset as the benchmark to validate their proposed work. The scope of their work was threefold, i.e., practical, generic, and explainable. 'Practical' means that the suggested attack can repeatedly transform original traffic with minimal information and overhead while keeping the functionality stable. 'Generic' means that the proposed attack is effective for evaluating the robustness of various NIDS. Finally, 'explainable' means proposing a reasoning method for the robustness of ML-based NIDSs.
The same researchers as in [28] extended their work by proposing a novel anomaly detection framework. The proposed framework integrates and hybridizes multiple deep learning architectures. It first uses the Damped Incremental Statistics algorithm to extract features from organic network traffic. Subsequently, the autoencoder is trained with a small amount of labeled data. The dataset with abnormal scores is then fed to an LSTM. Finally, a weightage ranking approach is used to determine the abnormal score. Again, the results of this work were benchmarked on the Kitsune dataset. This study targeted only the Mirai botnet attack dataset and ignored the other available Kitsune datasets, such as Active Wiretap, ARP MitM, Fuzzing, OS Scan, SSDP Flood, SYN DoS, SSL renegotiation, and Video Injection [29]. The authors of [28,29] presented their allied work in 2020, in which they published a practical traffic-space evasion attack on learning-based NIDSs. As in [28], the scope of their work is threefold, i.e., practical, generic, and explainable. 'Practical' refers to providing a new framework to mutate malicious traffic with extremely limited information while keeping the functionality stable. 'Generic' means that the proposed attack is effective for any ML classifier and for non-payload-based features. Finally, 'explainable' means proposing a feature-based interpretation method to measure the robustness of targeted systems against such attacks [30].
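To make the feature extraction step more concrete, the following is a minimal sketch of a damped incremental statistic. It assumes the commonly described formulation in which a damped count, linear sum, and squared sum are all multiplied by a decay factor 2^(-lambda * dt) before each new value is added. The class name, method names, and the decay_rate parameter are illustrative and are not taken from [29].

```python
class DampedIncStat:
    """Damped incremental statistic over a stream of (timestamp, value) pairs.

    Illustrative sketch only: names and structure are assumptions.
    """

    def __init__(self, decay_rate):
        self.decay_rate = decay_rate  # lambda: larger values forget old data faster
        self.w = 0.0                  # damped count of observations
        self.ls = 0.0                 # damped linear sum of values
        self.ss = 0.0                 # damped sum of squared values
        self.t_last = None            # timestamp of the previous update

    def update(self, t, x):
        """Decay the stored sums for the elapsed time, then add value x."""
        if self.t_last is not None:
            gamma = 2.0 ** (-self.decay_rate * (t - self.t_last))
            self.w *= gamma
            self.ls *= gamma
            self.ss *= gamma
        self.t_last = t
        self.w += 1.0
        self.ls += x
        self.ss += x * x

    def mean(self):
        return self.ls / self.w if self.w > 0 else 0.0

    def variance(self):
        if self.w <= 0:
            return 0.0
        m = self.mean()
        return abs(self.ss / self.w - m * m)


# Hypothetical packet sizes observed one second apart.
stat = DampedIncStat(decay_rate=1.0)
stat.update(0.0, 10.0)
stat.update(1.0, 20.0)
print(stat.mean(), stat.variance())
```

Because only three running sums and one timestamp are stored per stream, a statistic of this kind can be updated in constant time per packet, which is the property that makes it attractive for online traffic analysis.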
Another group of researchers [31] presented a comparative analysis of FGSM, JSMA, C&W, and ENM over the Kitsune dataset in the same year. Bai et al. set up a new approach called FastFE. This high-speed feature extractor leverages the ability of next-generation programmable switches to deliver the desired traffic features flexibly and efficiently. The authors demonstrated the advantages of FastFE and its low overhead on the Kitsune dataset [32]. Leevy et al. [33] presented a new IDS called AE-IDS, inspired by Kitsune. However, the two differ in feature selection and feature mapping. Kitsune uses a damping window and dynamic feature retrieval, while AE-IDS uses Random Forest to determine the optimal features. In addition, Kitsune maps 'n' features to 'k' small subsets using agglomerative hierarchical clustering; conversely, AE-IDS uses AP clustering to group features according to their degree of similarity. Referring to research gaps 3 and 4, the dataset provided by Kitsune was used for training and testing the machine learning algorithms. Our previous study reported that variants of tree algorithms such as Simple Tree, Medium Tree, Coarse Tree, RUS Boosted, and Bagged Tree showed similar effectiveness but with slight variation in efficiency. To extend this investigation, we explored the performance of these Tree algorithm variants on the other datasets provided by Kitsune, such as Active Wiretap, ARP MitM, Fuzzing, OS Scan, SSDP Flood, SYN DoS, SSL renegotiation, Mirai, and Video Injection. This investigation ascertains the likely performance of these tree algorithm variants. After a deep and rigorous investigation, the Fine Tree is highly recommended for the improved version of the Kitsune tool.

Simulation Setup and Procedure
The proposed study started from the benchmark Kitsune Network Attack Dataset, publicly available on the UCI Machine Learning Repository. The attributes of the dataset were understood and analyzed using the dataset description. Afterwards, each dataset was divided into 70% training and 30% testing samples, respectively, using a random permutation. The dataset was then converted from .csv to .mat for Matlab. The simulation setup was established on a high-performance computing machine preloaded with Matlab 2020. Following are the specifications of the high-performance machine: The algorithms were tested in the Classification Learner App of Matlab 2020, where the performance of each algorithm was assessed in terms of the confusion matrix, TPR, FNR, average training accuracy, test accuracy, misclassification cost, prediction time, and training time. After training, each model was exported as a corresponding .mat file. The 'predict function' of each exported model was used to evaluate the test accuracy on the testing samples of the dataset.
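The 70%/30% division by random permutation described above can be sketched as follows. This is a minimal Python illustration rather than the Matlab procedure used in the study; the function name, the seed argument, and the rounding rule for the training-set size are assumptions.

```python
import random


def split_70_30(n_samples, train_fraction=0.7, seed=0):
    """Return disjoint (train_indices, test_indices) from a random permutation.

    The seed and the rounding rule are illustrative assumptions.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # random permutation of row indices
    n_train = round(train_fraction * n_samples)   # size of the training partition
    return indices[:n_train], indices[n_train:]


# Hypothetical toy dataset with 10 rows.
train_idx, test_idx = split_70_30(10)
print(len(train_idx), len(test_idx))
```

Splitting on a permutation of row indices guarantees that the training and testing samples are disjoint and together cover the whole dataset.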

Simulation Results
This section presents the investigation performed on eight datasets of Kitsune, namely Active Wiretap, ARP MitM, Fuzzing, OS Scan, SSDP Flood, SYN DoS, SSL renegotiation, and Video Injection. The investigation on the ninth dataset, Mirai, was accomplished in our previous work [35]. Each dataset is divided into a 70% training sample and a disjoint 30% testing sample, selected by random permutation.
Active wiretapping [36] refers to adding false signals to, or tampering with, communications or devices. It can be carried out on both guided and unguided media. The Kitsune Active Wiretap dataset has 1595082 training samples and 683607 testing samples. Every Kitsune dataset has 115 input attributes and one output attribute. It is primarily a binary classification domain, where '0' represents 'No Attack' and '1' represents 'Attack'.
ARP spoofing [37], also known as ARP poisoning, is a man-in-the-middle (MitM) attack. It allows attackers to intercept communication between network devices. The Kitsune ARP MitM dataset has 1752987 training samples and 751280 testing samples. Fuzz testing (fuzzing) is a quality assurance technique used to detect coding errors and security vulnerabilities in software, operating systems, or networks [38]. It involves feeding massive quantities of random data, called fuzz, to the system under test in an attempt to make it crash. In Kitsune, this dataset has 1752987 training samples and 751280 testing samples. The OS scan works by using the TCP/IP stack fingerprinting method [39]. Service analysis works by using the nmap-service-probes database to identify the services running on a targeted host. In Kitsune, this dataset has 1188496 training samples and 509355 testing samples.
A Simple Service Discovery Protocol (SSDP) [40] attack is a reflection-based distributed denial-of-service (DDoS) attack. It uses Universal Plug and Play (UPnP) networking protocols to send an amplified amount of traffic to a targeted victim. In Kitsune, this dataset has 2854086 training samples and 1223180 testing samples. SSL renegotiation messages (including ciphers and encryption keys) are encrypted and then sent over the existing SSL connection. In Kitsune, this dataset has 1545300 training samples and 662271 testing samples. A SYN flood is a form of DoS attack in which an attacker sends a succession of SYN requests to a target's system [41] in an attempt to consume enough server resources to make the system unresponsive to legitimate traffic. In Kitsune, this dataset has 1939893 training samples and 831383 testing samples.
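As a quick arithmetic check, the sample counts reported above can be compared against the stated 70%/30% split. The sketch below uses only the counts given in the text; the dictionary and function names are illustrative.

```python
# (training samples, testing samples) exactly as reported in the text.
reported = {
    "Active Wiretap": (1595082, 683607),
    "ARP MitM": (1752987, 751280),
    "Fuzzing": (1752987, 751280),
    "OS Scan": (1188496, 509355),
    "SSDP Flood": (2854086, 1223180),
    "SSL renegotiation": (1545300, 662271),
    "SYN DoS": (1939893, 831383),
}


def train_share(n_train, n_test):
    """Fraction of all samples that falls in the training partition."""
    return n_train / (n_train + n_test)


for name, (n_train, n_test) in reported.items():
    total = n_train + n_test
    print(f"{name}: total={total}, train share={train_share(n_train, n_test):.4f}")
```

For example, Active Wiretap has 1595082 + 683607 = 2278689 samples in total, and 0.7 × 2278689 ≈ 1595082, which matches the reported training count.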
Tabs. 1-8 present the comprehensive empirical evaluation of the eight Kitsune datasets. It can be observed from these tables that all the variants of Tree algorithms yield a TPR of approximately 100%, with a correspondingly negligible FNR. The class-wise accuracy is evident from the confusion matrix of each algorithm for each dataset. Therefore, the net training accuracy of almost every variant of the Tree algorithm in this study appears to be identical. Likewise, the misclassification pattern is very consistent: for Active Wiretap, ARP MitM, and Video Injection, Coarse Tree reported the maximum misclassification cost. However, Boosted Tree performed worst, with the maximum misclassification cost, for Fuzzing, OS Scan, SSDP Flood, SSL Renegotiation, and SYN DoS. For every dataset, Bagged Tree reported no misclassification cost. However, its prediction speed is on average 9.5 times lower than that of Fine Tree. The Fine Tree shows less than a 1% compromise on test accuracy and misclassification cost.
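The measures discussed above all follow from the binary confusion matrix. The sketch below shows one way to derive TPR, FNR, accuracy, and a total misclassification cost, assuming a unit cost for every misclassified sample. The function name, the cost parameters, and the example counts are hypothetical and are not values from Tabs. 1-8.

```python
def binary_metrics(tp, fn, fp, tn, cost_fn=1.0, cost_fp=1.0):
    """Derive TPR, FNR, accuracy, and total cost from confusion-matrix counts.

    tp/fn count 'Attack' samples predicted as attack/benign;
    fp/tn count 'No Attack' samples predicted as attack/benign.
    Unit costs per error are an assumption of this sketch.
    """
    positives = tp + fn
    total = tp + fn + fp + tn
    if positives == 0 or total == 0:
        raise ValueError("confusion matrix must contain positive samples")
    return {
        "tpr": tp / positives,                 # true positive rate
        "fnr": fn / positives,                 # false negative rate = 1 - TPR
        "accuracy": (tp + tn) / total,
        "cost": cost_fn * fn + cost_fp * fp,   # total misclassification cost
    }


# Hypothetical counts, for illustration only.
metrics = binary_metrics(tp=990, fn=10, fp=5, tn=995)
print(metrics)
```

Since TPR and FNR are computed over the same set of attack samples, they always sum to one, so a TPR near 100% implies an FNR near 0%.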
Similarly, the training time of Bagged Tree is approximately four times that of Fine Tree and Medium Tree. It is inferred that Fine Tree and Medium Tree are the most suitable classifiers for unified network attack detection on Kitsune. It is highly recommended that the improved version of Kitsune use either Fine Tree or Medium Tree as a universal classifier. Because training accuracy, misclassification cost, prediction speed, and testing accuracy behave almost identically across datasets, only the performance measures on Active Wiretap are illustrated pictorially, in Figs. 2-5. The amplitude of variation for the other datasets can be derived from Tabs. 1-8.
Fig. 3 illustrates the total misclassification cost on Active Wiretap for the set of algorithms. In this figure, the x-axis represents the total cost count and the y-axis the algorithm. It is inferred from the figure that Coarse Tree has a significantly higher misclassification cost than the other variants of the machine learning algorithm. This argues strongly against the utility of Coarse Tree in real-time network attack detection.
In Fig. 4, the x-axis represents the prediction speed (Obs/s) and the y-axis the algorithm. It can be inferred that, other than RUS Boosted and Bagged Tree, the rest of the algorithms show significantly improved prediction speed. Therefore, Boosted Tree, Coarse Tree, Medium Tree, and Fine Tree lead in terms of prediction speed.
Fig. 5 illustrates the test accuracy on Active Wiretap for the set of algorithms. In this figure, the x-axis represents the test accuracy and the y-axis the algorithm. It is inferred from this figure that, other than Coarse Tree, the rest of the algorithms are efficient in terms of testing accuracy.
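Prediction speed in observations per second (Obs/s), as compared above, is the number of classified samples divided by the elapsed prediction time. The following sketch shows one way to measure it and to express the gap between two classifiers as a ratio. The function names and the toy threshold rule are hypothetical stand-ins and do not reproduce the Matlab measurement used in the study.

```python
import time


def prediction_speed(predict, samples):
    """Return observations per second for calling predict on every sample."""
    start = time.perf_counter()
    predictions = [predict(x) for x in samples]
    elapsed = time.perf_counter() - start
    return len(predictions) / elapsed if elapsed > 0 else float("inf")


def relative_slowdown(fast_obs_per_s, slow_obs_per_s):
    """How many times slower the second classifier is than the first."""
    return fast_obs_per_s / slow_obs_per_s


def threshold_rule(x):
    """Toy stand-in for a trained classifier: 1 = Attack, 0 = No Attack."""
    return 1 if x[0] > 0.5 else 0


# Hypothetical single-feature samples, for illustration only.
samples = [(i / 1000.0,) for i in range(1000)]
speed = prediction_speed(threshold_rule, samples)
print(f"{speed:.0f} obs/s")
```

Under this definition, a classifier that is 9.5 times slower than another processes 1000 Obs/s for every 9500 Obs/s processed by the faster one.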

Analysis
The above parametric assessment and evaluation established the response of machine learning algorithms for network intrusion detection given the adversarial nature of test inputs. It is important to notice that the variants of tree algorithms delivered competitive performance during the training process. This is evident from the confusion matrix of each assessment, and the training accuracy supports the same conclusion. However, an abrupt degradation in testing accuracy and misclassification cost was noted. It is mainly due to the adversarial nature of the data. Given that the dataset has binary classes, the class-level accuracy of certain algorithms is badly hit by adversarial attacks.
At the same time, the computational efficiency of the algorithms was also recorded to be in close vicinity. This is primarily because adversarial attacks in the testing dataset do not notably affect the computation cost; rather, they badly hit the class-level accuracy of the testing results. It is very hard to limit adversarial traces in bursty traffic such as network traffic. Preventive measures to detect or mitigate adversarial traces not only compromise network efficiency but are also nearly impractical for real-time communication such as network traffic, video surveillance, etc. This study helps to identify the most suitable machine learning algorithm for Kitsune, one that is not highly prone to adversarial attacks in the network intrusion dataset. It is also important to note that this finding may vary for other datasets due to structural differences in application and dataset. Kitsune, being an important network intrusion detection system, has a pressing need for empirical justification when selecting a machine learning algorithm, provided that the choice does not adversely affect computational efficiency. After rigorous simulation and parametric evaluation, it is concluded that the Fine Tree is the most suitable machine learning algorithm for the improved version of Kitsune.

Conclusion
Although Kitsune is a machine learning-based NIDS, to the best of our knowledge, no researcher has presented a comprehensive comparative analysis of machine learning algorithms on Kitsune. Many researchers have investigated only the Mirai botnet dataset of Kitsune, and none has investigated the complete dataset of Kitsune, including Active Wiretap, ARP MitM, Fuzzing, OS Scan, SSDP Flood, SYN DoS, SSL renegotiation, Mirai, and Video Injection. Our previous study reported that variants of tree algorithms such as Simple Tree, Medium Tree, Coarse Tree, RUS Boosted, and Bagged Tree showed similar effectiveness but with slight variation in efficiency. To extend this investigation, we explored the performance of these tree algorithm variants on the other datasets provided by Kitsune, such as OS Scan, Fuzzing, Video Injection, ARP MitM, Active Wiretap, SSDP Flood, SYN DoS, and SSL Renegotiation. This study helps to identify the most suitable machine learning algorithm for Kitsune, one that is not highly prone to adversarial attacks in the network intrusion dataset. It is also important to note that this finding may vary for other datasets due to structural differences in application and dataset. This investigation ascertains the likely performance of these tree algorithm variants. After a deep and rigorous investigation, the Fine Tree is highly recommended for the improved version of the Kitsune tool.