Internet of Things (IoT) devices operate mainly over wireless media, so they require Intrusion Detection System (IDS) solutions that leverage 802.11 header information for intrusion detection. Wireless-specific traffic features with high information gain are found primarily in the data link layer, rather than in the application layer as in wired networks. This survey investigates some of the complexities and challenges of deploying wireless IDS in terms of data collection methods, IDS techniques, IDS placement strategies, and traffic data analysis techniques. The paper's main finding is the lack of available network traces for training modern machine-learning models against IoT-specific intrusions. Specifically, the Knowledge Discovery in Databases (KDD) Cup dataset is reviewed to highlight the design challenges of wireless intrusion detection based on its current data attributes, and several guidelines are proposed to future-proof subsequent traffic capture methods in the wireless network (WN).
The paper starts with a review of various intrusion detection techniques, data collection methods and placement methods. The main goal of this paper is to study the design challenges of deploying an intrusion detection system in a wireless environment. Deploying an IDS in a wireless environment is not as straightforward as in a wired network because of the architectural complexities. This paper therefore reviews the traditional wired intrusion detection deployment methods, discusses how these techniques could be adopted in the wireless environment, and highlights the design challenges there. The main wireless environments considered are Wireless Sensor Networks (WSN), Mobile
Computer systems must ensure confidentiality and integrity against network security attacks. Jhanjhi et al. [
Tartakovsky et al. [
Since an IDS consists of many components and features, a detailed analysis of these in a complete survey could contribute to the literature. This survey paper differs from others in the literature by providing a comparison between wired networks and WNs and by providing the taxonomy of IDS in wired networks and WNs as shown in
Frequent change of topology in
Open protocol which is vulnerable to many attacks: Since there is no fixed wired connection to the nodes in
Hard to detect by just looking at MAC address (wireless IDS survey): Although there can be mechanisms to detect the MAC address of attacking nodes, a node can easily change its MAC address, since MAC address configuration in wireless networks is done purely in software.
Nodes are always mobile: It is very hard to deploy a centralized IDS since the nodes in
Resource limitation of wireless nodes and wireless channels: Most of the time, nodes in
Very high false positive and false negative rates: The dynamic organization of the network in
The wired, or standard, IDS architecture connects all devices by cable. The IDS console monitors and analyzes the network traffic. When traffic arrives from the internet, the router passes the data to the IDS server, which performs the traffic collection and machine learning processes. The IDS does not drop any packets, since its job is to collect and analyze the data. The wired IDS requires more components and devices for the network setup, mainly a router, switch, IDS console, IDS server, and other end devices.
The wireless IDS architecture resembles the wired IDS architecture; the difference is the use of a wireless access point for network connectivity. The wireless IDS architecture is more convenient to the
Although an IDS is used to detect intrusions, it has its own downsides. The main issue with IDS technologies is detection accuracy, which can be measured by two parameters: false positives (FP) and false negatives (FN). An FP is generated when the system flags an intrusion that did not actually occur. An FN is generated when the system does not detect any intrusion although one in fact happened; in other words, the system fails to detect the intrusion. With the security of the system in mind, many security system administrators choose to decrease FNs at the cost of more FPs [
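The FP/FN trade-off described above can be illustrated with a minimal sketch. The anomaly scores, labels and thresholds below are hypothetical values chosen for illustration only, and the helper names `classify` and `error_rates` are assumptions, not part of any surveyed system. Lowering the alert threshold removes the missed intrusion (FN) but introduces a false alarm (FP).

```python
def classify(scores, threshold):
    """Flag a sample as an intrusion when its anomaly score reaches the threshold."""
    return [1 if score >= threshold else 0 for score in scores]


def error_rates(y_true, y_pred):
    """Return (false positive rate, false negative rate) for binary labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred and truth:
            tp += 1
        elif pred and not truth:
            fp += 1  # alert raised, but no intrusion occurred
        elif truth:
            fn += 1  # intrusion occurred, but no alert raised
        else:
            tn += 1
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


# Hypothetical anomaly scores and ground truth (1 = intrusion, 0 = normal).
scores = [0.10, 0.40, 0.35, 0.80, 0.65, 0.20]
labels = [0, 0, 1, 1, 1, 0]

for threshold in (0.5, 0.3):
    fpr, fnr = error_rates(labels, classify(scores, threshold))
    print(f"threshold={threshold}: FPR={fpr:.2f}, FNR={fnr:.2f}")
```

In this toy data the stricter threshold misses one intrusion, while the looser threshold catches all intrusions and raises one false alarm, which mirrors the administrators' preference described above.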
So, in order to propose and implement a suitable IDS in wireless
Data for intrusion detection in wired and wireless networks can be collected through either behavior-based analysis or traffic analysis. Behavior-based data collection normally focuses on the performance of the system, using sources such as Windows error reporting, web server performance, and console log files [
A large amount of traffic analysis data, consisting of both attack and non-attack data, is already available. The KDD Cup dataset is one such example; it contains about 22 different attacks related to the network and transport layers and is discussed in detail in Section 3.4. Other related datasets include Predict 2014, Caida 2014, Kyoto Dataset 2014, ICS Attack Dataset 2014 and Adfa intrusion detection datasets 2014 [
This section describes the commonly used intrusion detection techniques
Based on the survey, Liao et al. [
In Anomaly based intrusion as shown in
In a statistical anomaly-based IDS, a network traffic behavior profile is created while the network is running under normal conditions and is kept as a reference. The IDS then continually compares newly observed profile data with this reference. When the new profile deviates significantly from the reference, the traffic is flagged as abnormal. In a knowledge-based anomaly IDS, by contrast, the intrusion is detected from knowledge of the current network traffic or data, whether under normal or abnormal conditions. Knowledge-based detection can be performed using expert systems, description languages such as Unified Modelling Language (UML), Finite State Machines (FSM) and clustering algorithms [
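The statistical approach can be sketched as follows. This is a minimal illustration, assuming the profile is a single traffic feature (packets per second) summarized by its mean and standard deviation, and that "significant mismatch" means a z-score above a chosen threshold. The function names, sample values and the threshold of 3.0 are hypothetical.

```python
from statistics import mean, stdev


def build_profile(normal_samples):
    """Create the reference profile from traffic observed under normal conditions."""
    return mean(normal_samples), stdev(normal_samples)


def is_anomalous(value, profile, z_threshold=3.0):
    """Flag a new observation that deviates significantly from the reference profile."""
    mu, sigma = profile
    if sigma == 0:
        # Degenerate profile: any change from the constant baseline is a deviation.
        return value != mu
    return abs(value - mu) / sigma > z_threshold


# Hypothetical packets-per-second readings captured during normal operation.
baseline = [100, 104, 98, 101, 97, 102, 99, 103]
profile = build_profile(baseline)

for observed in (101, 250):
    print(observed, "abnormal" if is_anomalous(observed, profile) else "normal")
```

A real deployment would track many features at once and refresh the profile periodically, since a stale reference is one source of the false positives discussed in this survey.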
Machine learning-based detection is more automatic, in the sense that the system is able to learn the network profile and use it to detect intrusive activities in the network. Because of its popularity, machine learning-based IDS is discussed in further detail in the following section. In 1959, Arthur Samuel defined Machine learning (ML) as “field of study that gives computers the ability to learn without being explicitly programmed”. Essentially, ML does two things: it classifies and predicts data based on the properties of the data that it learns during the training phase. ML also requires an objective. The three main learning approaches in ML are unsupervised, semi-supervised, and supervised. A common method among these approaches is the support vector machine, which is presented in Bhatti et al. [
An artificial neural network (ANN) is designed to work like the human brain, which has made it much more capable than conventional machine learning models. A neural network consists of artificial neurons, called units, arranged in layers. Each unit in a layer is connected to every unit in the next layer. An ANN has at least three layers: the input layer, one or more hidden layers, and the output layer. The input layer is how the ANN receives information, and the output units respond accordingly after the information has been processed and learned. The hidden layers are located between the input layer and the output layer; there may be one or several of them, and they make up most of the artificial brain. Another important feature of the ANN is the weight: every connection in an ANN has a weight, whose value can be either positive or negative. The main objective of an ANN is to learn and retrain the information in accordance with the input data and the output data [
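The layered structure described above can be made concrete with a small forward-pass sketch. It assumes fully connected layers, a sigmoid activation and a bias per unit, none of which are specified in the text, and the weights and the two-feature input are hypothetical numbers rather than a trained model.

```python
import math


def sigmoid(x):
    """Squash a unit's weighted input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Each unit sums its weighted inputs (weights may be positive or negative)."""
    return [
        sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)) + bias)
        for unit_weights, bias in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the input through every layer in turn and return the output units."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical network: 2 inputs -> 3 hidden units -> 1 output unit.
hidden = ([[0.5, -0.4], [-0.3, 0.8], [0.2, 0.1]], [0.0, 0.1, -0.1])
output = ([[0.7, -0.6, 0.9]], [0.05])

score = forward([0.9, 0.2], [hidden, output])[0]
print(f"intrusion score: {score:.3f}")
```

Training, which adjusts the weights so that the output matches labeled examples, is omitted here; the sketch only shows how information flows from the input layer to the output layer.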
Several ANN-based IDSs have been developed, and a survey comparing the different ANN models was carried out in Shah et al. [
Many frameworks have been used for deep learning networks, whereby [
Anomaly-based IDS is very good at detecting new and unanticipated vulnerabilities and is less dependent on the operating system. However, it can produce low detection accuracy because observed activity changes constantly, and it is normally unavailable while new profiles are being built. One key advantage of anomaly-based IDS is that it does not look for any specific activities, which means it does not need all attack vectors to be fully specified and does not require a fully up-to-date dictionary. However, this can also cause more false positive signals. The system can also be vulnerable during the testing or profiling phase. In anomaly-based detection, the normal behavior profile must be updated regularly since network behavior changes frequently [
Specification-based intrusion detection focuses on anomalies at the system level, whereas anomaly-based IDS looks for anomalies in user profiles or data flows. It works in a similar way, however: it defines the normal behaviors and detects an anomaly when the system deviates from them. This IDS produces fewer false positives than anomaly-based IDS, since only the legitimate behaviors defined by an expert are classified as normal and everything else is classified as abnormal. In other words, the system works well only against bad behaviors that violate the specifications defined in the system. The system is also effective in that no training phase is required, which makes it available immediately. The main disadvantage is that a lot of effort is required to define the formal specifications. This kind of IDS is effective in detecting insider attacks, as it looks for abnormal behaviors in the system, mainly system disruptions. On the other hand, it is not effective in detecting outsider attacks, because it mainly considers actions performed by insiders and is very application-centric. It is a kind of anomaly detection without specific user, group or data profiles. The legitimate behaviors are defined by humans, and any node that deviates from them is classified as misbehaving. This kind of IDS is suitable for resource-constrained nodes on which user, group or data profiles cannot be stored [
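One way to picture a specification-based detector is as a table of expert-defined legitimate transitions, with anything outside the table treated as a violation. The sketch below assumes a finite-state encoding of the specification; the states and event names are hypothetical and do not correspond to any particular routing protocol or to a system from the surveyed literature.

```python
# Hypothetical expert-written specification: (current state, event) -> next state.
ALLOWED_TRANSITIONS = {
    ("idle", "route_request"): "awaiting_reply",
    ("awaiting_reply", "route_reply"): "forwarding",
    ("forwarding", "data"): "forwarding",
    ("forwarding", "teardown"): "idle",
}


def check_trace(events, start="idle"):
    """Return the index of the first event that violates the specification, or None."""
    state = start
    for index, event in enumerate(events):
        next_state = ALLOWED_TRANSITIONS.get((state, event))
        if next_state is None:
            return index  # behavior not defined as legitimate: flag it
        state = next_state
    return None


legitimate = ["route_request", "route_reply", "data", "data", "teardown"]
suspicious = ["route_request", "data"]  # data sent before any route reply

print(check_trace(legitimate))
print(check_trace(suspicious))
```

No training phase is needed and only the small transition table has to be stored, which matches the suitability for resource-constrained nodes noted above; the cost is the manual effort of writing the table.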
This kind of IDS differs from those discussed earlier in that it normally looks for selfish nodes rather than malicious nodes. When a misbehaving node is detected, the reputation manager has to find ways to guard the network in order to maintain its reputation. One main challenge in this system is the distribution of the challenge score. Examples of challenge scores include packets sourced over packets destined, packets forwarded over packets sourced, and many more. This kind of approach is suitable for large networks where
This section describes the architecture of IDS in
As we know, IDS can be deployed in every single node for monitoring node performances which is known as Host-based IDS (HIDS) (
A Distributed IDS (DIDS) is essentially an IDS that contains multiple IDSs, such as HIDS and NIDS. It is most likely to be deployed in a large network that requires different types of IDS to monitor network traffic for intrusive events. A DIDS uses detection components and correlation managers to connect and combine the information gathered from those IDSs. A DIDS can make use of both anomaly-based and signature-based intrusion detection, giving it the ability to detect both known and unknown attacks from hackers [
Standalone architecture is very similar in concept to NIDS, whereby an IDS runs in every single node. Decisions are made based on the information collected by each independent node. Nodes do not communicate or cooperate in order to make IDS decisions, so no information is exchanged between them. In this kind of IDS, nodes within the same network have no information about activities on other nodes, as no alert information is shared. This approach may not be viable unless each node can run independently without any limitations in processing and storage capacity. Moreover, this approach is more suitable for a flat architecture than for a hierarchical one. This IDS is not a suitable solution for MANET and IoT, as the information collected by each node is not sufficient to detect malicious events [
A collaborative IDS is a combination of several HIDS and NIDS deployed over a large network that communicate with each other or with a centralized system for network monitoring purposes. In a collaborative IDS, each individual system can collect intrusion data and analyze and respond to it by itself, or the data can be sent to a central system or distributed among multiple systems. A collaborative IDS can therefore be centralized, distributed or hierarchical. Collaborative IDS systems are useful because, by combining NIDS and HIDS, they can detect both known and unknown attacks [
In centralized IDS as shown in
Distributed IDS as presented in
Since centralized IDS is not scalable, hierarchical architecture has been proposed. In a hierarchical architecture, nodes join groups that share characteristics such as geography, administrative control, similarity in software platforms and types of intrusions. In a hierarchical IDS, the entire system is divided into clusters, each with a single node as cluster head. All other nodes report to the cluster head of their respective cluster. Every node is equipped with an IDS agent responsible for monitoring and deciding on intrusions at its local node. A cluster head is responsible for its own local node and, globally, for collecting intrusion data from its member nodes and deciding on the response. In some cases, the analysis from the cluster heads is sent on to higher-level nodes for further processing [
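The division of work between local agents and a cluster head can be sketched as below. The local detection rule (a count of suspicious events against a threshold) and the cluster-level decision rule (respond when a quorum of members raise alerts) are assumptions made for illustration; the text does not prescribe either rule, and all class and parameter names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NodeAgent:
    """IDS agent running on a single node and deciding on local intrusions."""
    node_id: str
    local_threshold: int = 5  # hypothetical per-node alert threshold

    def detect(self, suspicious_events: int) -> bool:
        return suspicious_events >= self.local_threshold


@dataclass
class ClusterHead:
    """Collects local decisions from member nodes and decides on the response."""
    members: list
    quorum: float = 0.5  # hypothetical fraction of alerting members needed to respond

    def decide(self, observations: dict) -> dict:
        alerts = [
            member.node_id
            for member in self.members
            if member.detect(observations.get(member.node_id, 0))
        ]
        respond = bool(self.members) and len(alerts) / len(self.members) >= self.quorum
        return {"alerting_nodes": alerts, "respond": respond}


head = ClusterHead([NodeAgent("a"), NodeAgent("b"), NodeAgent("c"), NodeAgent("d")])
print(head.decide({"a": 9, "b": 7, "c": 1}))  # two of four members alert
print(head.decide({"a": 9}))                  # a single member alerts
```

The dictionary returned by `decide` is the kind of summary a cluster head could pass on to a higher-level node for further processing.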
MANET also introduces mobile agents (MA) in its IDS deployment. Some nodes are deployed as mobile agents to perform one specific task, whereas all other nodes perform more functions. Owing to their mobile nature, one or more mobile agents are distributed in each network. Mobile agent-based IDS (
One of the weaknesses of traditional wired IDS is that it does not generally detect network intrusions from internal hosts of the network. Although it is possible to protect an organization's internal network from wireless attackers by ensuring that there is only one link between the WNs and the main network, such a network IDS will not cover all of the traffic on the WN [
The number of IoT-connected devices in 2018 was 23.14 billion, and each IoT device exhibits different network traffic behaviors and characteristics, so one should select the devices that are commonly used. Next, one must collect a huge amount of data on the IoT network in order to understand the IoT network traffic behavior and characteristics. The lack of large real-world datasets for IoT has become a challenge for IDS in IoT. Finally, one needs to select a set of algorithms for analyzing the IoT network traffic with the KDD Cup dataset. Khammassi et al. [
Chen et al. [
Bridges et al. [
Mobile
To address some of these challenges, the wireless IDS should be distributed and collaborative in nature, as shown in the work by Mohammadi et al. [
This paper presents a survey on re-architecting IDS design to accommodate IoT and MANET characteristics. We presented the review holistically, from basic IDS deployment strategies to traffic analysis and wireless network recommendations. The overall survey gives the reader a complete understanding of what an IDS is, the data or traffic involved in detecting an anomaly, and some design challenges in adapting wired IDS design to wireless IDS. The paper's main focus is the design challenges of IDS in wireless networks such as MANET, IoT, and VANET. Based on existing research gaps, traffic headers specific to wireless networks (from 802.11 frames and the data link layer) should be weighted more heavily in the network analysis. The paper concludes with several recommendations and guidelines for IDS design that are mainly effective against intrusions in the wireless space.