A Blockchain-Based Architecture for Enabling Cybersecurity in the Internet-of-Critical Infrastructures

Abstract: Critical infrastructures such as nuclear plants, industrial control systems (ICS), and transportation networks are growing rapidly in number, and their increasing interconnection with other networks makes them major targets of cyberattacks. Several research works have focused on the design of intrusion detection systems (IDS) using machine learning (ML) and deep learning (DL) models. At the same time, blockchain (BC) technology can be applied to improve the security level. To resolve the security issues that exist in critical infrastructures and ICS, this study designs a novel BC with deep learning empowered cyber-attack detection (BDLE-CAD) technique. The proposed BDLE-CAD technique aims to identify the existence of intrusions in the network. An enhanced chimp optimization based feature selection (ECOA-FS) technique is applied to select an optimal subset of features. An optimal deep neural network (DNN) tuned with the search and rescue (SAR) optimizer is then applied for the detection and classification of intrusions. Furthermore, a BC enabled integrity checking scheme (BEICS) is presented to defend against misrouting attacks. The BDLE-CAD technique is validated experimentally and the results are inspected under varying aspects. The simulation analysis points out the superiority of the BDLE-CAD technique over recent state-of-the-art techniques, with an accuracy of 92.63% on the multiclass dataset.


Introduction
Critical infrastructure systems underpin the functions of an economy and society. They range from conventionally defined physical assets to a broader set of assets in the fields of agriculture, gas, transportation, water supply, electricity, telecommunication, public health, security services, and so on [1]. This transformation is mainly due to the extensive use of the Internet of Things (IoT) and its considerable support for critical infrastructure systems in Industry 4.0 [2]. IoT systems have become an essential part of critical infrastructure in Industry 4.0, enabling smart services such as smart grids and offering numerous benefits in efficiency and cost savings. The International Data Corporation (IDC) has predicted that there will be an estimated 41.6 billion interconnected IoT devices generating 79.4 zettabytes (ZB) of data by 2025 [3]. The industrial control system (ICS) is the core of a critical infrastructure system [4]. It is largely responsible for supervisory control and data acquisition (SCADA), which monitors the control flows and data processes in industry. The possible application areas of critical infrastructure with IoT are shown in Fig. 1 [5]. The wider adoption of Internet-connected IoT devices poses several challenges to critical infrastructure. First, ICS were originally developed for closed, proprietary infrastructures without taking security into account, since conventional critical infrastructure was largely isolated and therefore not exposed to cyber-attacks. Now that this infrastructure is connected to the Internet via IoT systems, a wide range of cyberattacks, including malware, man-in-the-middle, distributed denial-of-service (DDoS), brute force, breach, and phishing attacks, threaten the operation of ICS [6,7]. An ICS compromised by cyber attackers risks the loss of information [8].
Second, scalability is a challenge that ICS were not originally designed to address. Given the dramatic growth in the number of IoT devices and in the volume of data they collect and analyze, the centralized approach to data collection and analysis has become a bottleneck of ICS. A decentralized approach is crucially needed to satisfy the evolving needs of ICS. Blockchain (BC) and artificial intelligence (AI) each have their own benefits, but both also have drawbacks. BC has problems relating to scalability, security, energy consumption, efficiency, and privacy, whereas AI systems face problems such as effectiveness and interpretability. Although they are two distinct research directions, they can be combined and integrate naturally. Both techniques share requirements for data trust, analysis, and security, and they can empower one another [9]. For example, AI relies on three key components: computing power, data, and algorithms. Owing to its specific features of immutability, anonymization, and decentralization, BC can break up data islands and enable the flow of data resources, algorithms, and computing power. Additionally, BC can ensure the audit traceability and credibility of AI as well as the credibility of the original data. Furthermore, BC can record the decision-making of AI, which assists in analyzing and understanding the behaviour of AI and eventually improves its decision-making, making it more explainable, trustworthy, and transparent. In turn, AI can improve BC construction, making it more efficient, secure, and energy-saving [10].
Gumaei et al. [11] presented an architecture that integrates BC with a deep recurrent neural network (DRNN) and edge computing for 5G-enabled drone identification and flight mode detection. In this approach, raw RF signals of different drones under various flight modes are sensed remotely and collected on a cloud framework to train a DRNN model, and the trained models are distributed to edge devices to detect drones and their flight modes. BC is used in this architecture to secure data transmission and integrity. Alkadi et al. [12] presented a deep blockchain framework (DBF) that provides security-based distributed intrusion detection and privacy-based BC with smart contracts in IoT networks. The IDS employs a bidirectional long short-term memory (BiLSTM) deep learning method to handle sequential network data and is evaluated on network datasets. The smart contract and privacy-based BC methods are designed using the Ethereum library to secure the distributed IDS engine.
Singh et al. [13] introduced a DL-based IoT framework for a secure smart city, in which BC provides a distributed platform at the communication stage of cyber-physical systems (CPS) and software defined networking (SDN) establishes the protocol for forwarding information. A DL-based cloud is employed at the application layer to resolve communication latency, centralization, and scalability. Zhang et al. [14] presented an edge intelligence and BC enabled industrial IoT architecture that attains secure and flexible edge service management. They then developed a credit-differentiated edge transaction approval method and presented a cross-domain sharing inspired edge resource scheduling system.
This study designs a novel BC with deep learning empowered cyber-attack detection (BDLE-CAD) technique for critical infrastructures and ICS. The proposed BDLE-CAD technique aims to identify the existence of intrusions in the network. An enhanced chimp optimization based feature selection (ECOA-FS) technique is applied to select an optimal subset of features. An optimal deep neural network (DNN) tuned with the search and rescue (SAR) optimizer is then applied for the detection and classification of intrusions. Furthermore, a BC enabled integrity checking scheme (BEICS) is presented to defend against misrouting attacks. The BDLE-CAD technique is validated experimentally and the results are inspected under varying aspects.

The Proposed Model
In this study, a new BDLE-CAD technique has been developed to identify the existence of intrusions in critical infrastructures. The proposed BDLE-CAD technique uses the ECOA-FS technique to select an optimal subset of features. The DNN tuned with the SAR optimizer is then used as a classifier, and the BEICS is presented to defend against misrouting attacks. The BDLE-CAD technique is validated experimentally and the results are inspected under varying aspects.

ECOA Based Feature Selection
First, the ECOA-FS technique is executed to choose the optimal subset of features. The chimp optimization algorithm (COA) is a mathematical model based on intelligent diversity [15]. The hunt consists of four activities (driving, chasing, blocking, and attacking), which are carried out by four kinds of chimps: the driver, the chaser, the barrier, and the attacker. The four hunting activities are completed in two phases: an exploration phase, which comprises driving, blocking, and chasing the prey, and an exploitation phase, in which the prey is attacked. Driving and chasing are modelled by Eqs. (1) and (2):

$$d = \left| c \cdot x_{prey}(t) - m \cdot x_{chimp}(t) \right| \tag{1}$$

$$x_{chimp}(t+1) = x_{prey}(t) - a \cdot d \tag{2}$$

where $x_{prey}$ is the position vector of the prey, $x_{chimp}$ is the position vector of the chimp, $t$ is the current iteration number, and $a$, $c$, and $m$ are coefficient vectors obtained with Eqs. (3)-(5):

$$a = 2 \cdot f \cdot r_1 - f \tag{3}$$

$$c = 2 \cdot r_2 \tag{4}$$

$$m = \text{Chaotic\_value} \tag{5}$$

where $f$ decreases non-linearly from 2.5 to 0, $r_1$ and $r_2$ are random numbers between zero and one, and $m$ is a chaotic vector. The dynamic coefficient $f$ is chosen with different curves and slopes, so that the chimps use different capabilities when searching for the prey. The chimps update their positions based on the positions of the other chimps, as expressed by Eqs. (6)-(8):

$$d_{Attacker} = \left| c_1 x_{Attacker} - m_1 x \right|, \quad d_{Barrier} = \left| c_2 x_{Barrier} - m_2 x \right|, \quad d_{Chaser} = \left| c_3 x_{Chaser} - m_3 x \right|, \quad d_{Driver} = \left| c_4 x_{Driver} - m_4 x \right| \tag{6}$$

$$x_1 = x_{Attacker} - a_1 d_{Attacker}, \quad x_2 = x_{Barrier} - a_2 d_{Barrier}, \quad x_3 = x_{Chaser} - a_3 d_{Chaser}, \quad x_4 = x_{Driver} - a_4 d_{Driver} \tag{7}$$

$$x(t+1) = \frac{x_1 + x_2 + x_3 + x_4}{4} \tag{8}$$
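The position update described above can be sketched in a few lines of Python. This is a minimal sketch, assuming the commonly published form of the chimp optimization update: a = 2·f·r1 − f, c = 2·r2, a distance term d = |c·x_prey − m·x_chimp|, and a final average over the moves suggested by the four leading chimps. The function names, the scalar chaotic value `m`, and the quadratic schedule used for `f` are illustrative assumptions, not the authors' implementation.

```python
import random


def f_schedule(t, max_iter, f_max=2.5):
    """Assumed non-linear decay of f from 2.5 (t = 0) to 0 (t = max_iter)."""
    return f_max * (1.0 - (t / max_iter) ** 2)


def choa_coefficients(f, rng):
    """Draw the coefficients a = 2*f*r1 - f and c = 2*r2 for one dimension."""
    r1, r2 = rng.random(), rng.random()
    return 2.0 * f * r1 - f, 2.0 * r2


def chase_step(x_prey, x_chimp, f, m, rng):
    """Move one chimp relative to a prey (or leader) position."""
    new_position = []
    for xp, xc in zip(x_prey, x_chimp):
        a, c = choa_coefficients(f, rng)
        d = abs(c * xp - m * xc)        # distance term
        new_position.append(xp - a * d)  # position update
    return new_position


def group_update(x, leaders, f, m, rng):
    """Average the moves suggested by the leading chimps
    (attacker, barrier, chaser, driver)."""
    moves = [chase_step(leader, x, f, m, rng) for leader in leaders]
    return [sum(values) / len(moves) for values in zip(*moves)]


if __name__ == "__main__":
    rng = random.Random(42)
    leaders = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
    f = f_schedule(t=10, max_iter=100)
    print(group_update([0.0, 0.0], leaders, f, m=0.7, rng=rng))
```

When `f` reaches 0 the coefficient `a` vanishes, so each suggested move coincides with the leader position and the update reduces to the mean of the four leaders, which is the pure exploitation limit of the scheme.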
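In ECOA, highly disruptive polynomial mutation (HDPM), an enhanced version of the polynomial mutation technique [16], is employed. It overcomes the limitation that polynomial mutation can become trapped in a local optimum when a variable is close to a boundary. Eqs. (9)-(12) describe how HDPM modifies $x_i$:

$$\delta_1 = \frac{x_i - lb}{ub - lb} \tag{9}$$

$$\delta_2 = \frac{ub - x_i}{ub - lb} \tag{10}$$

$$\delta_k = \begin{cases} \left[ 2r + (1 - 2r)(1 - \delta_1)^{\eta_m + 1} \right]^{\frac{1}{\eta_m + 1}} - 1, & r \le 0.5 \\ 1 - \left[ 2(1 - r) + 2(r - 0.5)(1 - \delta_2)^{\eta_m + 1} \right]^{\frac{1}{\eta_m + 1}}, & \text{otherwise} \end{cases} \tag{11}$$

$$x_i \leftarrow x_i + \delta_k \cdot (ub - lb) \tag{12}$$

where $ub$ and $lb$ are the upper and lower boundaries of the search space, $r$ is a random number between zero and one, and $\eta_m$ is the distribution index, a non-negative number.
As these formulas show, HDPM can explore the entire search space.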
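The mutation step can be illustrated with a short Python sketch. It assumes the usual published form of highly disruptive polynomial mutation, in which the normalized distances of the variable to both bounds drive a perturbation that can reach any point between `lb` and `ub`. The function name `hdpm`, the default `eta_m = 20`, and the final clamp are illustrative assumptions.

```python
import random


def hdpm(x, lb, ub, eta_m=20.0, rng=random):
    """Mutate one scalar variable x, which must lie in [lb, ub]."""
    r = rng.random()
    span = ub - lb
    delta1 = (x - lb) / span          # normalized distance to the lower bound
    delta2 = (ub - x) / span          # normalized distance to the upper bound
    power = 1.0 / (eta_m + 1.0)
    if r <= 0.5:
        base = 2.0 * r + (1.0 - 2.0 * r) * (1.0 - delta1) ** (eta_m + 1.0)
        delta_k = base ** power - 1.0
    else:
        base = 2.0 * (1.0 - r) + 2.0 * (r - 0.5) * (1.0 - delta2) ** (eta_m + 1.0)
        delta_k = 1.0 - base ** power
    # The clamp only absorbs floating-point round-off at the boundaries.
    return min(ub, max(lb, x + delta_k * span))


if __name__ == "__main__":
    rng = random.Random(7)
    # A variable sitting exactly on the lower bound can still move inward.
    print([round(hdpm(0.0, 0.0, 10.0, rng=rng), 3) for _ in range(5)])
```

For r = 0 the perturbation equals −δ1 and the variable lands on `lb`; for r = 1 it equals δ2 and the variable lands on `ub`. Intermediate draws fall in between, which is why a variable close to a boundary is not stuck there.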
The classical ECOA updates solutions in a continuous search space. In the binary ECOA (BECOA), by contrast, the search space is an n-dimensional Boolean lattice, and solutions are updated by moving across the corners of a hypercube. For feature selection, a value of 1 denotes that a feature is selected and a value of 0 that it is not. The BECOA uses a fitness function that maintains a trade-off between two objectives, as given in Eq. (13):

$$Fitness = \alpha \, \Delta_R(D) + \beta \, \frac{|Y|}{|T|} \tag{13}$$

where $\Delta_R(D)$ denotes the error of the classifier, $|Y|$ represents the size of the selected subset, and $|T|$ indicates the total number of features in the dataset. Besides, $\alpha \in [0, 1]$ is the weight of the classification error, and $\beta = 1 - \alpha$ is the weight of the feature reduction.
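The trade-off between classification error and subset size can be made concrete with a small example. The sketch below assumes the weighted-sum form suggested by the symbol definitions, that is, alpha times the classifier error plus beta times the fraction of selected features. The function name `fitness`, the default `alpha = 0.99`, and the penalty for an empty subset are assumptions for illustration.

```python
def fitness(error_rate, mask, alpha=0.99):
    """Weighted sum of classifier error and relative subset size.

    error_rate: classification error obtained with the selected features.
    mask:       binary list, 1 = feature selected, 0 = feature dropped.
    alpha:      weight of the classification error; beta = 1 - alpha.
    """
    total = len(mask)
    selected = sum(mask)
    if selected == 0:
        # Assumption: an empty subset cannot be evaluated, so it is penalized.
        return float("inf")
    beta = 1.0 - alpha
    return alpha * error_rate + beta * selected / total


if __name__ == "__main__":
    # Same error, fewer features: the smaller subset gets the lower fitness.
    print(fitness(0.10, [1, 0, 1, 0], alpha=0.9))
    print(fitness(0.10, [1, 1, 1, 1], alpha=0.9))
```

Because lower values are better, two subsets with the same error are ranked by size, while a large `alpha` ensures that a real gain in accuracy outweighs the cost of keeping a few extra features.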

Optimal DNN Based Intrusion Detection and Classification
At this stage, the chosen features are passed into the DNN model for intrusion classification. The DNN is an artificial neural network (ANN) that consists of input, hidden, and output layers. The hidden layer applies a set of non-linear functions and can be expressed as follows [17]:

$$h(x) = \mathrm{sig}(W x + bias)$$

where $x$ is the input to the nodes, $W$ and $bias$ are the weight and bias vectors, respectively, and $\mathrm{sig}$ is the sigmoid activation function, $\mathrm{sig}(x) = \frac{1}{1 + e^{-x}}$. In the presented optimized DNN, two hidden layers are used, and the weight matrices must be selected optimally to minimize the mean absolute error (MAE) of the DNN; the SAR optimizer is employed for this purpose. The search and rescue operation has two main phases, the social phase and the individual phase. During the search, the group members collect clues. The clues left behind by group members during the search are stored in the memory matrix (O), while the human positions are stored in the position matrix (W). The clue matrix B, of size N × D, comprises the left clues and the human positions. The two phases of the human search are as follows.
i) Social phase: the search direction is given by $SD_i = (W_i - B_k)$, where $k \ne i$, and a new solution is created along this direction.
ii) Individual phase: each human determines a new position based on its current position [18].
Every solution must lie within the solution space; when a new position falls outside the solution space, it is corrected using $W_j^{max}$ and $W_j^{min}$, the maximum and minimum thresholds of the j-th dimension. The ability to find the global optimum is improved by memory update rules, in which $ME_n$ is the n-th saved clue position in the memory matrix and $n$ is a random integer in the range [1, N].
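The objective that the optimizer evaluates for each candidate weight set can be sketched as follows. This is a minimal pure-Python sketch, assuming fully connected layers with sigmoid activations throughout and a plain MAE over all outputs. The layer representation (a list of `(weights, biases)` pairs) and the function names are hypothetical choices for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable sigmoid 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dense(inputs, weights, biases):
    """One fully connected layer: sig(W x + bias)."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(features, layers):
    """Propagate a feature vector through all layers in order."""
    activations = features
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return activations


def mae(layers, samples):
    """Mean absolute error of the network over (features, targets) pairs."""
    total, count = 0.0, 0
    for features, targets in samples:
        outputs = forward(features, layers)
        total += sum(abs(o - t) for o, t in zip(outputs, targets))
        count += len(targets)
    return total / count


if __name__ == "__main__":
    # Two hidden layers of two units each, followed by one output unit.
    layers = [
        ([[0.5, -0.2], [0.1, 0.3]], [0.0, 0.1]),
        ([[0.4, 0.4], [-0.3, 0.2]], [0.0, 0.0]),
        ([[1.0, -1.0]], [0.0]),
    ]
    print(mae(layers, [([1.0, 0.0], [1.0]), ([0.0, 1.0], [0.0])]))
```

A metaheuristic such as SAR would treat the flattened weights and biases as one position vector and call `mae` as its cost function, so no gradient information is required.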
During the clue search, if no better clue is found around the current position after a certain number of searches, the human moves to a new position. To model this, the unsuccessful search number (USN) is first set to 0 for every human. When the USN value exceeds the maximum unsuccessful search number, the human moves to a random position in the search space using Eq. (21), and $USN_i$ is reset to 0 for that human:

$$W_{i,j} = W_j^{min} + r_4 \left( W_j^{max} - W_j^{min} \right) \tag{21}$$

where $r_4$ is a random number in the interval between zero and one.
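The restart mechanism can be sketched as a small bookkeeping routine. The sketch assumes a minimization problem (such as the MAE of the DNN), that the counter grows by one after each search that fails to improve the cost, and that a restart draws every coordinate uniformly between its minimum and maximum thresholds. The names `update_usn`, `random_position`, and `max_usn` are hypothetical.

```python
import random


def random_position(w_min, w_max, rng):
    """Uniform random point: w_min_j + r4 * (w_max_j - w_min_j) per dimension."""
    return [lo + rng.random() * (hi - lo) for lo, hi in zip(w_min, w_max)]


def update_usn(position, candidate, usn, cost, w_min, w_max, max_usn, rng):
    """Return the next (position, usn) pair for one human.

    A candidate that lowers the cost is accepted and resets the counter.
    Otherwise the counter grows, and once it exceeds max_usn the human
    restarts from a random position with the counter back at 0.
    """
    if cost(candidate) < cost(position):
        return candidate, 0
    usn += 1
    if usn > max_usn:
        return random_position(w_min, w_max, rng), 0
    return position, usn


if __name__ == "__main__":
    rng = random.Random(3)
    cost = lambda p: sum(v * v for v in p)  # toy cost, assumed for the demo
    pos, usn = [1.0, 1.0], 0
    for _ in range(4):
        pos, usn = update_usn(pos, [2.0, 2.0], usn, cost,
                              [-1.0, -1.0], [1.0, 1.0], max_usn=2, rng=rng)
        print(pos, usn)
```

In the demo the rejected candidate raises the counter on each call until it exceeds `max_usn`, at which point the position is redrawn inside the bounds and the counter returns to 0.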

Process Involved in BEICS
The BC [19] is a major component of the integrity verification system. The main idea is to offer a solution in which every flow produced by the controller is saved in a verifiable and immutable data store. The BC consists of a series of blocks linked to one another via hash values. In the BC network, each user holds a pair of keys: a private key for signing BC transactions and a public key that serves as a unique address. A client signs a transaction with its private key and transmits it to the other nodes in the network for verification. Once the broadcast block is verified, it is added to the BC. Once saved, the data in a given block cannot be modified without changing all succeeding blocks. Besides, the data is held on many hosts concurrently, so modifications can be rejected by the peer hosts. Here, a private BC is used rather than a public BC. A private BC decides who can participate in the network and defines the actions and permissions assigned to identifiable participants. It therefore reduces the need for consensus mechanisms such as Proof of Work. Fig. 2 shows the structure of BC.
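The hash-linking property that makes stored flows tamper-evident can be illustrated with a short Python sketch. It is not the BEICS implementation: transaction signing, peer replication, and access control are omitted, and the block layout and the flow rule fields (`switch`, `match`, `out_port`) are hypothetical. Only the SHA-256 chaining idea is shown.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder predecessor of the first block


def block_hash(index, prev_hash, flow_rule):
    """SHA-256 digest over the block contents in a canonical JSON form."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "flow_rule": flow_rule},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_flow(chain, flow_rule):
    """Append a block that records one flow rule issued by the controller."""
    index = len(chain)
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    block = {
        "index": index,
        "prev_hash": prev_hash,
        "flow_rule": flow_rule,
        "hash": block_hash(index, prev_hash, flow_rule),
    }
    chain.append(block)
    return block


def verify_chain(chain):
    """Check every block's own hash and its link to the previous block."""
    prev_hash = GENESIS_HASH
    for position, block in enumerate(chain):
        if block["index"] != position or block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["prev_hash"], block["flow_rule"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


def flow_is_recorded(chain, flow_rule):
    """A flow is trusted only if the chain is intact and contains it."""
    return verify_chain(chain) and any(b["flow_rule"] == flow_rule for b in chain)


if __name__ == "__main__":
    chain = []
    rule = {"switch": "s1", "match": "10.0.0.2", "out_port": 2}
    append_flow(chain, rule)
    append_flow(chain, {"switch": "s2", "match": "10.0.0.2", "out_port": 1})
    print(flow_is_recorded(chain, rule))
    chain[0]["flow_rule"]["out_port"] = 9  # simulated misrouting attempt
    print(verify_chain(chain))
```

A switch that compares its installed rule against the recorded one can detect a misrouting attempt, because altering any stored rule changes the block digest and breaks the link to every succeeding block.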

Experimental Validation
In this section, the performance of the BDLE-CAD technique is validated using a benchmark dataset [20], which comprises thousands of events of different classes. The dataset contains binary (Natural and Attack) and multiclass (No event, Natural, and Attack) labels. Tab. 1 provides a detailed result analysis of the BDLE-CAD technique on the binary class dataset. Fig. 3 offers a brief precision and recall analysis of the BDLE-CAD technique under distinct subdata on the binary class dataset. The figure reveals that the BDLE-CAD technique attained high values of precision and recall. For instance, with subdata-1, the BDLE-CAD technique offered precision and recall of 96.89% and 97.47%, respectively. Meanwhile, with subdata-10, the BDLE-CAD technique provided precision and recall of 97.80% and 97.97%, respectively. Finally, with subdata-15, the BDLE-CAD technique demonstrated precision and recall of 97.76% and 97.04%, respectively. In terms of accuracy, the BDLE-CAD technique attained its lowest value of 98.30% on subdata-7 and its highest value of 98.91% on subdata-14. These results indicate that the BDLE-CAD technique classifies the binary classes effectively. Fig. 6 showcases a brief precision and recall analysis of the BDLE-CAD technique under distinct subdata on the multiclass dataset. The figure shows that the BDLE-CAD technique attained high values of precision and recall. For instance, with subdata-1, the BDLE-CAD technique presented precision and recall of 79% and 92.09%, respectively. Meanwhile, with subdata-10, the BDLE-CAD technique delivered precision and recall of 83.37% and 92.39%, respectively. Finally, with subdata-15, the BDLE-CAD technique demonstrated precision and recall of 78.62% and 89.25%, respectively. The BDLE-CAD technique also achieved high values of specificity and F-score on the multiclass dataset. For instance, with subdata-1, the BDLE-CAD technique demonstrated specificity and F-score of 93.46% and 80.55%, respectively.
Moreover, with subdata-10, the BDLE-CAD technique gained specificity and F-score of 93.87% and 80.67%, respectively. Furthermore, with subdata-15, the BDLE-CAD technique reached specificity and F-score of 93.75% and 82.46%, respectively. Tab. 3 offers a detailed comparative study of the BDLE-CAD technique with recent methods [21]. A comparative classification result analysis of the BDLE-CAD technique on the binary class dataset is depicted in Fig. 9. The results show that the Nearest Neighbor (NN), random forest (RF), and SVM models obtained lower accuracy of 71.56%, 80.61%, and 78.84%, respectively. At the same time, the KNN, Adaboost+JRip, and JRip models obtained moderate accuracy values of 95.49%, 95.56%, and 90.10%, respectively. However, the BDLE-CAD technique achieved a higher accuracy of 98.63%. A detailed performance analysis of the BDLE-CAD technique on the multiclass dataset is offered in Fig. 10. The experimental values show that the Nearest Neighbor (NN), random forest (RF), and SVM models gained reduced accuracy of 77.33%, 80.51%, and 78.56%, respectively. Moreover, the KNN, Adaboost+JRip, and JRip models obtained moderate accuracy values of 87.66%, 91.44%, and 90.09%, respectively. However, the BDLE-CAD technique accomplished a superior accuracy of 92.63%. These results indicate that the BDLE-CAD technique outperforms the other compared methods.

Conclusion
In this study, a new BDLE-CAD technique has been developed to identify the existence of intrusions in critical infrastructures. The proposed BDLE-CAD technique uses the ECOA-FS technique to select an optimal subset of features. The DNN tuned with the SAR optimizer is then used as a classifier, and the BEICS is presented to defend against misrouting attacks. The BDLE-CAD technique is validated experimentally and the results are inspected under varying aspects.
The simulation analysis points out the superiority of the BDLE-CAD technique over recent state-of-the-art techniques, with an accuracy of 92.63% on the multiclass dataset. Therefore, the BDLE-CAD technique can be used as a proficient tool to detect intrusions in the network. In future, clustering and outlier detection approaches can be designed to boost the detection performance.