Replacing or recharging batteries in the sensor nodes of a wireless sensor network (WSN) is a significant challenge. Therefore, efficient power utilization by sensors is a critical requirement, and it is closely related to the life span of the network. Once a sensor node consumes all its energy, it will no longer function properly. Therefore, various protocols have been proposed to minimize the energy consumption of sensors and thus prolong the network operation. Recently, clustering algorithms combined with artificial intelligence have been proposed for this purpose. In particular, various protocols employ the K-means clustering algorithm, which is a machine learning method. The number of clustering configurations required by the K-means clustering algorithm is greater than that required by the hierarchical algorithm. Further, the selection of the cluster heads considers only the residual energy of the nodes without accounting for the transmission distance to the base station. In terms of energy consumption, the residual energy of each node, the transmission distance, the cluster head location, and the central relative position within the cluster should be considered simultaneously. In this paper, we propose the KOCED (K-means with Optimal clustering for WSN considering Centrality, Energy, and Distance) protocol, which considers the residual energy of nodes as well as the distances to the central point of the cluster and the base station. A performance comparison shows that the KOCED protocol outperforms the LEACH protocol by 259% (223 rounds) for first node dead (FND) and 164% (280 rounds) with 80% alive nodes.
The sensor nodes in a wireless sensor network (WSN) are often installed in large numbers, especially in places that are not easily accessible. This makes it is difficult to replace or recharge their batteries. Consequently, efficient utilization of the limited energy of sensor nodes is a critical requirement [
The sensor nodes of a WSN are usually installed in places that are difficult to access. Therefore, sensor network protocols should have self-configuring functionality whereby the sensor nodes work in cooperation with each other. Sensor nodes that are deployed on surveillance missions in specific areas detect data that are internally converted into high-level information through data processing procedures of the network. Then, the data are sent to remote administrators. In the routing process, the raw or processed data are transmitted to the final destination over a single hop or multiple hops. These hopping mechanisms are among the many issues that need to be addressed so that WSNs can be employed in various applications. A number of routing protocol studies are under way for this purpose [
The LEACH protocol is a hierarchical clustering algorithm for efficient energy utilization. It uses a T(n) critical expression for cluster head selection. However, this does not always guarantee that an optimal cluster will be formed. In addition, the number of nodes in a cluster is changeable, i.e., it may be small or large. As the rounds progress, the first node dies. However, if a node with low residual energy is selected as the cluster head (CH), the data transmission is more likely to fail. To prevent nodes with low residual energy from being selected as CHs, this study considers the CH selection probability critical expression.
Wireless sensor protocols that use K-means clustering do not configure clusters after selecting the CHs. The main advantage of this approach is that clusters are uniformly constructed and most of the member nodes of a cluster are evenly distributed in their affiliated clusters. After the clusters are configured, this method selects nodes with high residual energy or nodes that are close to the central point of the cluster as CHs. However, a disadvantage of this approach is that it takes a longer time to form clusters than the LEACH algorithm because the clustering configuration process repeatedly selects the central point to appoint the CHs. Protocols based on K-means clustering consider the residual energy of nodes and the central point to select CHs. Distances for data transmission that involves high energy consumption, especially from CH to BS, are not considered [
Consequently, using the K-means clustering algorithm in the WSN protocol ensures that the member nodes in the cluster are uniformly distributed. However, it takes a longer time to configure the clusters compared to traditional hierarchical algorithms. In addition, there is a problem of selecting consecutive CHs on the same node because the CH selection considers only the residual energy of the node closest to the central point of the cluster. This drawback consequently reduces the life span of the entire network. Moreover, the overall energy efficiency of the network cannot be considered because the distance to the base station is not taken into account when selecting the CH. Further, the number of clusters (k) is not optimized.
This paper proposes KOCED (K-means with Optimal clustering for WSN considering Centrality, Energy, and Distance), a routing protocol that improves the energy efficiency of the entire network while addressing the above-mentioned problems. A performance comparison shows that the KOCED protocol outperforms the LEACH protocol by 259% (223 rounds) for first node dead (FND) and 164% (280 rounds) with 80% alive nodes.
The remainder of this paper is organized as follows. Section 2 reviews related algorithms. Section 3 presents the KOCED protocol. Section 4 describes the simulations conducted in this study. Finally, Section 5 concludes the paper.
The LEACH (Low-Energy Adaptive Clustering Hierarchy) protocol is a representative clustering-based protocol proposed by Wendy B. Heinzelman [
where G is a collection of nodes that have not been selected as CHs until the current round and the previous round. Each node generates a random value between 0 and 1 and compares it to
The process of forming a cluster is summarized as follows. After the CHs are first selected on the basis of
K-means clustering [
Another method is to monitor the results by increasing the number of clusters sequentially using the elbow method, as adding another cluster does not improve the modeling of the data significantly. If adding clusters does not produce better results, the number of clusters is set to an optimal value k. Once the K-means clustering has determined the number of clusters k, the initial cluster centroid should be established. This is called the initialization technique, and K-means clustering usually employs the Forgy algorithm for this purpose. The Forgy algorithm tends to spread the center of gravity of each cluster from its center, because the initial cluster is set by any k point(s). Because of these characteristics, the Forgy algorithm is preferred. The EM algorithm consists of the expectation step and the maximization step, and K-means clustering operates repeatedly until the EM algorithm converges. K-means clustering employs EM algorithms because the location of each cluster center point and the cluster to which each node belongs must be found at the same time.
—Step 1: Randomly extract the locations of k nodes from the sensor space and set this node location as the center point of each cluster (set initial value).
—Step 2: Each node in the sensor space obtains the distance from each of the k cluster center points and calculates which one is the closest. The cluster is then constructed using the closest center point (configure clusters).
—Step 3: Move the center point of the cluster by calculating the location average of the nodes in the cluster, i.e., the average of the node locations in the cluster configured in Step 2 is calculated and the center point is changed.
—Step 4: Repeat Steps 2 and 3 until the center point position does not change.
The K-means clustering algorithm can obtain an objective function J, which is the sum of the distances between the nodes of the cluster and the center point of the cluster when a cluster is established. The smaller the sum, the lower is the value that can be obtained. The objective function J is given by
In
The KCE (K-means Centrality with Energy) protocol [
The advantages of the KCE protocol are that the cluster size is uniform and the nodes close to the cluster center point can be selected as CHs to increase the energy efficiency within the cluster. The disadvantage is that the optimization of the number of clusters is empirically achieved when the CHs are selected. This is because it does not take into account the transmission distance to the base station, which involves high energy consumption. Therefore, this paper proposes the KOCED protocol to overcome the above-mentioned problems.
The use of the K-means clustering algorithm in the WSN protocol makes the member nodes in each cluster as uniform as possible; however, cluster configuration takes a longer time compared to conventional hierarchical algorithms. In addition, CH selection involves the problem of successive CH selection on the same node by considering only the residual energy of the node or nodes that are close to the center point of the cluster. This reduces the life span of the entire network. Further, when the CH is selected, the distance to the base station is not considered and the number of clusters (k) is not optimized.
This paper proposes KOCED, a protocol that improves the energy efficiency of the entire network while addressing the above-mentioned problems. It involves alternative clustering for WSNs by considering the energy and distance.
First, to compensate for the shortcoming of long clustering time, the KOCED protocol limits the initial starting conditions and FND of the system. Thus, the clustering process is not performed in each round.
Second, we want to troubleshoot optimization of the number of clusters (k). Typically, k is set to the number of clusters or close to the number of sensor nodes in the K-means clustering algorithm. However, this study defines and finds the optimal number of clusters (k_opt).
Third, when selecting a CH, we consider the distance from the center point of the cluster and the residual energy of the node at the same time to overcome the problem of not considering transmission distances to the base station that involve high energy consumption. To solve these problems, the CH is selected by taking into account the residual energy of the node and the distance to the cluster center point or base station.
To compensate for the shortcoming of long clustering time, the KOCED protocol is limited to the initial start of the system and the death of the node. This can compensate for the disadvantage of longer time by not performing the clustering process in each round. The procedures for K-means clustering are as follows.
Step 1: The initial center point is designated as an arbitrary sensor node location by the optimal number of clusters (k_opt).
Step 2: Each sensor node is assigned to the nearest cluster center point. Each sensor node is assigned such that the sum (
Step 3: The center of the cluster is updated by calculating the average of the sensor nodes within each cluster. This process is shown on the right side of
The K-means clustering algorithm requires considerable computation and time to construct a cluster compared to the LEACH protocol. To address these problems, it is assumed that no new clusters are to be configured for each round. In other words, the clustering conditions are as follows:
—Perform clustering for the initial round of the sensor space
—Perform clustering only if additional dead nodes occur
Thus, clustering is not performed in each round, which can result in fewer calculations and time savings by the K-means clustering algorithm.
To solve the problem of optimizing the number of clusters (k), usually, the K-means clustering algorithm sets k to the number of clusters or close to the number of sensor nodes. However, to define and induce the optimal number of clusters,
In an
The total energy (
where
Assume that each cluster has the same number of nodes and the number of member nodes per cluster is
If the total energy
In
Substituting
When selecting CH, we solve the problem of not considering transmissions to the base station by simultaneously considering the distance from the cluster center point or base station and the residual energy of the node.
In general, when considering only the residual energy of nodes in the cluster, the node with the highest residual energy is simply selected as the candidate for the CH. However, considering the residual energy and the distance to the base station, selecting a CH by simple ranking causes energy efficiency issues.
For example, for two nodes A and B, assume that the residual energy in node A is slightly greater than that in node B and the distance to the base station of node A is greater than that of node B. In this case, owing to the high energy consumption over the distance to the base station, node A is selected as the CH. Considering only the residual energy, energy is consumed rapidly and the network life span is reduced.
By modifying the residual energy and distance ranking, the residual energy and distance Score operation for nodes within the cluster is proposed. The CH is selected by giving higher scoring operations on the member nodes that have higher residual energy than the average residual energy for all the nodes in the cluster, i.e., satisfying
The Score is divided into
Node number | |||
---|---|---|---|
1 | 0.9 | 0.9 | 0 |
2 | 0.8 | 0.7 | 0.1 |
3 | 0.9 | 0.5 | 0.4 |
4 | 0.9 | 0.4 | 0.5 |
5 | 0.7 | 0.3 | 0.4 |
6 | 0.8 | 0.6 | 0.2 |
8 | 0.7 | 0.2 | 0.5 |
9 | 0.6 | 0.1 | 0.5 |
10 | 0.5 | 0.4 | 0.1 |
The results from
For the member nodes satisfying
If a node in the cluster has a maximum Score value for the distance to both the base station and the cluster center point, it is selected as the CH. However, if a node with a value of MAX (
To predict the energy consumption of two CH candidate nodes, the sum of the distance from the member nodes in the cluster and the distance from the base station, i.e. the total transmission distance, is calculated.
where the first term is the distance (
The LEACH, KCE, and KOCED approaches are compared in
LEACH protocol | KCE protocol | Proposed (KOCED) | |
---|---|---|---|
Clustering method | Random | K-means clustering | |
Number of clusters | Optimum number |
||
Cluster point of view | At every round | First round; in case of additional dead nodes, next round | |
Cluster head selection | T(n) threshold | Nodes with high residual energy remaining close to the center point | A suitable node close to the residual energy and the center point or base station. |
Reason for selection | Because cluster heads consume a large amount of energy, select nodes with high residual energy first. | Select the appropriate node through a Score operation to solve the problem of increasing the transmission distance if energy is prioritized and reducing the energy consumption if the transmission distance is prioritized. | |
Method of selection | 1. The probability critical expression used to select the cluster head is given by T(n) threshold and it takes a value between 0 and 1. | 1. Obtain the set of nodes with the highest residual energy among the nodes2. Select the node that is closest to the center of the group | 1. Obtain a set of more nodes than the cluster average energy2. Nodes in a set perform |
To verify the energy efficiency of the proposed routing protocol, we compared it with the LEACH protocol and the KCE protocol using the MATLAB simulator.
The parameters for the performance simulation were set as follows. We assumed that after 100 sensor nodes are randomly created in a simulated environment, all the nodes are fixed and have the same initial energy value. Further, the base station is located outside the sensor field. The specific simulation parameters are defined in
Parameter | Value |
---|---|
Number of sensor nodes | 100 |
Size of sensor field | |
Location of base station | (50, 150)/(100, 300)/(200, 600) |
Initial energy of sensor node | 0.5 J |
The following are the results of timing adjustments to overcome the shortcoming of the clustering configuration time. The average clustering configuration time per round for the KCE protocol is around 22 s and that for the KOCED protocol is around 7 s. The total sum of the clustering configuration times is approximately 210 min for the KCE protocol and 75 min for the KOCED protocol.
Elapsed time | KCE protocol | KOCED protocol |
---|---|---|
Average elapsed timeper round | 22 s | 7 s |
Elapsed time of 100 rounds | 36 min | 11 min |
Elapsed time of 500 rounds | 183 min | 58 min |
Elapsed time of final round | 210 min | 75 min |
The simulation environment is
Protocol | FND | 80% Alive |
---|---|---|
LEACH protocol | 86 | 171 |
KCE protocol | 184 | 217 |
Proposed protocol | 223 | 280 |
Protocol | FNDimprovement rate (%) | 80% alive improvement rate (%) |
---|---|---|
LEACH protocol | 100 | 100 |
KCE protocol | 214 | 127 |
Proposed protocol | 259 | 164 |
The LEACH protocol considers a probability critical expression for cluster selection and formation. Its disadvantages are that it does not provide the density information for each node and it does not optimize the number of cluster heads in a specific network. The disadvantages of the K-means clustering algorithm are that it requires a long clustering formation time and it does not consider the transmission distances from the sensors to the base stations. To overcome these drawbacks, this paper proposed, KOCED, a routing protocol that improves the energy efficiency of the entire network by simultaneously considering the centrality, residual energy, and transmission distances. KOCED achieves efficiency improvement in three steps. First, by reducing the clustering configuration time, it starts cluster formation only when a sensor node is dead in the initial round. This can compensate for unnecessary clustering. Second, it determines the optimal value in a randomly distributed sensor field. This value in normal K-means clustering algorithms is set as the same number of clusters or cluster heads. However, we define the optimal number of clusters, k_opt, on the basis of