As coronavirus disease (COVID-19) remains an ongoing global outbreak, countries around the world continue to take precautions and measures to control the spread of the pandemic. Because of the excessive number of infected patients and the resulting shortage of testing kits in hospitals, rapid, reliable, and automatic detection of COVID-19 is urgently needed to curb the number of infections. In this work, a novel metaheuristic approach based on hybrid dipper throated and particle swarm optimizers is proposed for analyzing COVID-19 chest X-ray images. The lung region is segmented from the original chest X-ray images and augmented using various transformation operations. The augmented images are then fed into the VGG19 deep network for feature extraction. A feature selection method is proposed to select the most significant features that can boost the classification results. Finally, the selected features are input into a neural network, optimized using the proposed hybrid optimizer, for detection. The experimental results show that the proposed method achieves 99.88% accuracy, outperforming existing COVID-19 detection models. In addition, an in-depth statistical analysis is performed to study the performance and stability of the proposed optimizer. The results confirm the effectiveness and superiority of the proposed approach.
Coronavirus disease (COVID-19) has swept the globe, resulting in millions of confirmed cases and millions of fatalities in 192 nations and territories [
Screening using chest computed tomography (CT) performed better than PCR in terms of sensitivity and accuracy at the time of initial patient presentation. However, as the number of patients suspected of having COVID-19 increased, the limited CT capacity became even more overwhelmed. As a result, the chest X-ray (CXR) is increasingly important in identifying COVID-19 characteristics. However, the early chest X-ray results are less accurate than PCR [
According to previous publications, the field of COVID-19 detection is primarily focusing on the classification of raw chest X-ray images using classic machine learning methods. However, authors in [
There have been several achievements in COVID-19 detection thanks to the deep convolution neural network method’s unique advantage in image processing. Authors in [
It can be noted that the performance of the previously mentioned methods is satisfactory; however, practically, a significant amount of background noise (such as skeleton, lines, shoulder, etc.) is usually contained in the images of the original CXR as depicted in
In this study, we propose a novel method for classifying COVID-19 in CXR images using a hybrid meta-heuristic that combines two powerful optimizers for the purpose of feature selection and optimization of the machine learning model. The features are extracted from a deep neural network using transfer learning. The proposed hybrid optimizer comprises the dipper throated and particle swarm optimizers. In addition, we introduce a new feature selection method that automatically selects the significant features based on the proposed hybrid optimizer. The main contributions of this study are as follows.
- A novel hybrid optimizer based on the dipper throated and the particle swarm optimizers.
- Efficient optimization of a neural network using the proposed hybrid optimizer.
- Accurate classification of COVID-19 cases using the proposed method.
- Comparison with other competing optimization and feature selection methods.
The rest of this paper is organized as follows. Section 2 presents the related research efforts. The proposed method is discussed in Section 3. The experimental results are presented and discussed in Section 4. Finally, Section 5 contains the conclusions.
Some of the most recent COVID-19 research calls for deep learning techniques. Most researchers have been obliged to employ transfer learning due to the novelty of COVID-19 and the corresponding lack of big data sets. The performance of latent CNN architectures employed in medical image categorization in recent years is evaluated by authors [
The problem of limited COVID-19 test kits available in public hospitals was handled by the authors in [
Authors in [
The authors of [
More studies have identified COVID-19 using CT scans rather than chest X-ray images. The CT imaging properties of this novel virus, for example, are different from those of existing kinds of viral pneumonia, according to the authors of [
Authors in [
The dataset employed in this work was collected by researchers from Qatar University, Doha, Qatar, and the University of Dhaka, Bangladesh, along with a team of doctors. It comprises chest X-ray images of COVID-19-positive patients along with normal cases. The total number of images is 2850, of which 1468 are positive COVID-19 cases and 1382 are normal cases. To boost the performance of the proposed model, this dataset is augmented by randomly applying various transformation operations. The size of the dataset after augmentation is 8550 images.
Training deep learning models requires an extensive database, which is a problem for medical image datasets. To cope with this problem, data augmentation is employed to expand the size of the X-ray image datasets used during training. Data augmentation has several benefits, including mitigating overfitting and increasing the DCNN model’s scalability. Each detected patch is rotated by angles of 0°, 90°, 180°, and 270°, and the four resulting images are then flipped from left to right to obtain eight images for each patch, as illustrated in
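The rotation-and-flip scheme described above can be sketched in a few lines. This is a minimal illustration only: the nested-list image representation and the function names `rotate90`, `flip_left_right`, and `augment_patch` are assumptions made here, and the rotation direction (clockwise) is an arbitrary choice, since the set of eight outputs is the same either way.

```python
from typing import List

Image = List[List[int]]  # assumed representation: a 2D grayscale patch as nested lists


def rotate90(img: Image) -> Image:
    """Rotate a 2D image by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def flip_left_right(img: Image) -> Image:
    """Mirror a 2D image horizontally."""
    return [row[::-1] for row in img]


def augment_patch(img: Image) -> List[Image]:
    """Return the four rotations (0, 90, 180, 270 degrees) followed by their mirrored copies."""
    rotations = [[list(row) for row in img]]  # 0 degrees: a copy of the input
    for _ in range(3):
        rotations.append(rotate90(rotations[-1]))
    flipped = [flip_left_right(r) for r in rotations]
    return rotations + flipped


# Hypothetical 2x2 patch used only to show the shape of the output.
patch = [[1, 2],
         [3, 4]]
augmented = augment_patch(patch)  # eight images per input patch
```

For real CXR patches, an array library would normally replace the nested lists, but the eight-fold structure stays the same.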
An important step in pre-processing CXR images for better classification is the segmentation step. The segmentation algorithm presented in [
The segmented lung images are input into three deep networks, VGG16, ResNet-50, and InceptionV3, which are trained and evaluated using transfer learning to generate a collection of deep features. The target features are taken from the final pooling layer of the model that produces the best results. The extracted feature set, however, may include features that are strongly correlated with one another and can therefore have a detrimental impact on classification performance. Such redundant features are unnecessary and should be discarded. A feature selection technique is therefore used to retain only the most relevant features, those that have a substantial impact on the classification results. The VGG16 deep network performed best on the dataset used in this study, and as a result, this model is used for feature extraction.
The dipper throated bird, renowned for its bobbing or dipping motions while perched, belongs to the Cinclidae family. Its ability to dive, swim, and hunt beneath the surface sets it apart from other passerines. Its small, flexible wings allow it to fly straight and rapidly without pauses or glides. The dipper throated bird has a distinct hunting style, quick bowing motions, and a white breast. It rushes headlong into the water to catch its prey, regardless of how turbulent or fast-flowing the water is. As it descends, it picks up pebbles and stones in search of aquatic invertebrates, aquatic insects, and tiny fish. The bird moves along the bottom of the water on its feet. By holding its body at an angle and walking along the bottom with its head lowered, it can locate prey. It can also dive into the water and submerge itself, using its wings to propel itself and stay submerged for an extended period.

The Dipper-Throated Optimization (DTO) method assumes that a flock of birds is swimming and looking for food. The following matrices can represent the location (P) and velocities (V) of the birds. As previously indicated, the binary DTO is used to select features. The continuous DTO, on the other hand, is used to optimize the parameters of the classification neural network.
In particle swarm optimization (PSO), candidate solutions, termed particles, are flown through the search space of the problem, mimicking the intelligence of bird swarms in nature. The velocity of a particle is the rate at which it changes location, so the particles’ positions change over time. Throughout the flight, a particle’s velocity is stochastically accelerated toward its previous best location and toward a neighborhood best solution. The following equations are utilized for this update. The location vector
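As a rough illustration of the velocity and position updates described above, the following sketch implements the textbook global-best PSO rule. It is not taken from the paper: the function name `pso_minimize`, the acceleration coefficients `c1 = c2 = 2.0`, the clamping of positions to the search domain, and the toy objective are all assumptions. The number of agents (10), number of iterations (80), inertia factor (0.1), and search domain [0, 1] follow the configuration values reported in this paper.

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso_minimize(
    objective: Callable[[Sequence[float]], float],
    dim: int,
    n_particles: int = 10,
    n_iters: int = 80,
    w: float = 0.1,                       # inertia factor
    c1: float = 2.0,                      # assumed cognitive coefficient
    c2: float = 2.0,                      # assumed social coefficient
    bounds: Tuple[float, float] = (0.0, 1.0),
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Minimize `objective` with a basic global-best PSO; returns (best position, best value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # each particle's best position so far
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # best position found by the whole swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Stochastic acceleration toward the personal best and the global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and keep it inside the search domain.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x: Sequence[float]) -> float:
    """Toy objective (hypothetical): squared distance from the point (0.5, ..., 0.5)."""
    return sum((xi - 0.5) ** 2 for xi in x)


best, best_val = pso_minimize(sphere, dim=3)
```

Because the global best is replaced only when a strictly better value is found, the returned value can never be worse than the best individual in the initial random population.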
Finding the global optimum is a difficult task. The proposed method combines two efficient algorithms. The first is PSO, in which individuals are moved based on their local and global best positions. The global best position is the best position found by the whole population, whereas the local best position is an individual’s best position thus far. This social behavior, which is inspired by flocks of birds and schools of fish in nature, allows individuals in PSO to converge toward the global optimum. We chose PSO for our proposed hybrid optimizer due to its simplicity, dependability, and strength. The second optimizer in the proposed hybrid approach is DTO, a swarm-based meta-heuristic optimizer that mimics the social hierarchy and foraging behavior of dipper throated birds. In DTO, the positions and velocities of the bird agents determine how individuals move. In the proposed hybrid optimizer, the optimization process starts with a group of random individuals, which represent candidate solutions to the problem being solved. The fitness function is computed for all individuals for the initial solution and at each iteration. As previously mentioned, the population is divided into two groups, the first of which follows the PSO process and the second of which follows the DTO process. Consequently, the search space is thoroughly explored for promising regions, which are then exploited using the strong DTO and PSO algorithms. The steps of the proposed DTPSO algorithm are shown as pseudo-code in
Feature selection is a distinct kind of problem because the search space is confined to two binary values, 0 and 1. As a result, we employ a sigmoid function to convert the continuous output of a standard optimizer so that it performs effectively for this task. In this section, we show how the proposed hybrid DTPSO is used to transform this output. To select features, we start with an initial population of random vectors and then calculate the fitness function. A neural network (NN) is trained on the training instances, and we explain how the DTPSO hybrid is used to pick the best features.
The continuous output of the proposed hybrid DTPSO is converted to binary using the following equations:
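A minimal sketch of such a sigmoid-based conversion is given here for orientation. The steepness of 10, the midpoint of 0.5, and the threshold of 0.5 are assumptions chosen to suit a [0, 1] search domain and are not confirmed by the text; the function names are likewise hypothetical.

```python
import math
from typing import List, Sequence


def sigmoid(x: float, steepness: float = 10.0, midpoint: float = 0.5) -> float:
    """Scaled logistic function; steepness and midpoint are assumed values."""
    return 1.0 / (1.0 + math.exp(-steepness * (x - midpoint)))


def binarize(position: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Map a continuous position vector to a 0/1 feature mask (1 = feature selected)."""
    return [1 if sigmoid(x) >= threshold else 0 for x in position]


# Hypothetical continuous solution over four features.
mask = binarize([0.1, 0.9, 0.5, 0.49])
```

Each entry of the resulting mask marks whether the corresponding feature is kept, so the length of the position vector equals the number of extracted features.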
The quality of each hybrid DTPSO solution is assessed using a fitness function, which depends on the classification error rate and the number of selected features. A solution is regarded as good if it selects a subset of features that yields a lower classification error rate with a smaller number of selected features. The following equation is used to determine the quality of each solution:
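One common way to combine these two quantities is a weighted sum of the error rate and the fraction of selected features, sketched below. The weighted-sum form itself is an assumption, as is the pairing of the two fitness parameters reported in the configuration table (0.99 and 0.01) with the error term and the feature-count term, respectively; the names `fitness`, `alpha`, and `beta` are hypothetical.

```python
def fitness(error_rate: float, n_selected: int, n_total: int,
            alpha: float = 0.99, beta: float = 0.01) -> float:
    """Weighted sum of classification error and selected-feature ratio (lower is better).

    alpha weights the error rate and beta weights the feature ratio; this
    assignment of 0.99 and 0.01 is an assumption, not stated in the text.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return alpha * error_rate + beta * (n_selected / n_total)


# Hypothetical solution: 10% error using 10 of 100 features.
score = fitness(error_rate=0.10, n_selected=10, n_total=100)
```

With weights of this kind, the error rate dominates the score and the feature count mainly breaks ties between solutions of similar accuracy.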
The conducted experiments are presented and discussed in this section. The first step after preprocessing the dataset is to segment the regions of interest to be considered for further processing. The results are then fed to the feature extraction and feature selection to select the most significant features for further processing and classification.
The employed dataset is split into three parts of equal sizes whose contents are selected randomly from the dataset; these parts are referred to as training, test, and validation sets. During the learning phase, the NN classifier is adopted to be trained using the proposed DTPSO algorithm for classifying the CXR images. The configuration parameters of the proposed algorithm are listed in
Parameter | Value |
---|---|
No. of repetitions of runs | 20 |
No. of iterations | 80 |
No. of search agents | 10 |
Search domain | [0, 1] |
Problem dimension | Number of features |
PSO inertia factor | 0.1 |
Fitness parameter | 0.99 |
Fitness parameter | 0.01 |
To evaluate the efficiency of the proposed algorithm, several experiments were conducted to assess its performance. The results of conducted experiments are assessed using the evaluation criteria presented in
Metrics | Equation | |
---|---|---|
Average error | = | |
Best fitness | = | |
Worst fitness | = | |
Average fitness size | = | |
Mean | = | |
STD (Standard deviation) | = | |
Accuracy | = | |
Recall | = | |
Specificity | = | |
Precision | = | |
F1-score | = |
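For the classification criteria in this table, the standard confusion-matrix definitions can be sketched as follows. The function name and the example counts are hypothetical; the optimizer-level criteria (average error, fitness statistics, and selection size) are not covered by this sketch.

```python
from typing import Dict


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = _ratio(tp, tp + fp)          # positive predictive value (PPV)
    recall = _ratio(tp, tp + fn)             # sensitivity, true positive rate
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "recall": recall,
        "specificity": _ratio(tn, tn + fp),  # true negative rate
        "precision": precision,
        "npv": _ratio(tn, tn + fn),          # negative predictive value
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical counts, used only to exercise the formulas.
metrics = classification_metrics(tp=8, tn=9, fp=1, fn=2)
```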
A set of seven feature selectors (namely, binary grey wolf optimizer (bGWO) [
Method | Average error | Avg select size | Avg fitness | Best fitness | Worst fitness | STD |
---|---|---|---|---|---|---|
bGWO | 0.9225 | 1.0581 | 0.9847 | 0.9051 | 0.9719 | 0.7955 |
bGWO_PSO | 0.9618 | 1.1914 | 0.9931 | 0.9465 | 1.0565 | 0.8137 |
bPSO | 0.9563 | 1.0581 | 0.9831 | 0.9634 | 1.0311 | 0.7949 |
bWOA | 0.9561 | 1.2215 | 0.9909 | 0.9551 | 1.0311 | 0.7971 |
bSBO | 0.9646 | 1.2284 | 1.0228 | 0.9659 | 1.0456 | 0.8558 |
bFA | 0.9547 | 1.0926 | 1.0351 | 0.9537 | 1.0513 | 0.8317 |
bGA | 0.9361 | 1.0005 | 0.9961 | 0.8994 | 1.0145 | 0.7971 |
On the other hand, the proposed DTPSO algorithm is used to optimize the parameters of NN with the target of boosting the classification accuracy of COVID-19 cases. The results achieved by the proposed approach and other approaches based on optimizing NN using different optimizers are presented in
DTPSO + NN | WOA + NN | GWO + NN | GA + NN | PSO + NN | |
---|---|---|---|---|---|
Accuracy | 0.970564837 | 0.97668557 | 0.986964618 | 0.979060555 | |
Sensitivity (TPR) | 0.961538462 | 0.963488844 | 0.975609756 | 0.964391691 | |
Specificity (TNR) | 0.996884735 | 0.998336106 | 0.998735777 | 0.998677249 | |
Precision (PPV) | 0.998890122 | 0.998948475 | 0.998751561 | 0.99897541 | |
N value (NPV) | 0.898876404 | 0.943396226 | 0.975308642 | 0.95448799 | |
F1-score | 0.979858465 | 0.980898296 | 0.987045034 | 0.981378963 |
The
DTPSO + NN | WOA + NN | GWO + NN | GA + NN | PSO + NN | |
---|---|---|---|---|---|
Number of values | 19 | 19 | 19 | 19 | |
Actual median | 0.971 | 0.976 | 0.989 | 0.981 | |
Theoretical median | 0 | 0 | 0 | 0 | |
Wilcoxon signed rank test | |||||
Sum of negative ranks | 0 | 0 | 0 | 0 | |
Sum of positive ranks | 190 | 190 | 190 | 190 | |
Sum of signed ranks (W) | 190 | 190 | 190 | 190 | |
P value (two tailed) | <0.0001 | <0.0001 | <0.0001 | <0.0001 | |
P value summary | **** | **** | **** | **** | |
Exact or estimate? | Exact | Exact | Exact | Exact | |
Discrepancy | 0.971 | 0.976 | 0.989 | 0.981 | |
Significant (alpha = 0.05)? | Yes | Yes | Yes | Yes |
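The rank sums in this table can be checked with a small sketch. With 19 values that all lie above the theoretical median of 0, the sum of positive ranks is 19 × 20 / 2 = 190, and the exact two-tailed P value for such an all-positive sample is 2 / 2^19, which is below 0.0001. The function name and the sample values below are hypothetical placeholders, not the data behind the table.

```python
from typing import List, Sequence, Tuple


def signed_rank_sums(values: Sequence[float],
                     hypothesized_median: float = 0.0) -> Tuple[float, float, float]:
    """Return (sum of positive ranks, sum of negative ranks, signed-rank sum W)."""
    # Differences from the hypothesized median; zero differences are dropped.
    diffs = [v - hypothesized_median for v in values if v != hypothesized_median]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks: List[float] = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Find the run of tied absolute differences and give each the average rank.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    positive = sum(r for r, d in zip(ranks, diffs) if d > 0)
    negative = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return positive, negative, positive - negative


# Placeholder accuracies (NOT the paper's data): 19 values, all above 0.
sample = [0.95 + 0.002 * k for k in range(19)]
pos_sum, neg_sum, w_stat = signed_rank_sums(sample)

# Exact two-tailed P value when all n differences share the same sign.
p_extreme = 2 * 0.5 ** len(sample)
```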
The statistical difference between the proposed DTPSO and the competing algorithms is also investigated using a one-way analysis of variance (ANOVA) test. This test involves two primary hypotheses, the null and the alternative. Under the null hypothesis, denoted by H0, the mean values of the algorithms are equal (i.e., DTPSO + NN = WOA + NN = GWO + NN = GA + NN = PSO + NN). Under the alternative hypothesis, H1, the algorithms’ means are not all equal.
Source of variation | SS | DF | MS | F (DFn, DFd) | |
---|---|---|---|---|---|
Residual (within columns) | 0.00226 | 90 | 2.51E-05 | ||
Treatment (between columns) | 0.009131 | 4 | 0.002283 | F (4, 90) = 90.91 | |
Total | 0.01139 | 94 |
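The entries of the ANOVA table are related by simple arithmetic: each mean square is the sum of squares divided by its degrees of freedom, and the F statistic is the ratio of the treatment mean square to the residual mean square. The sketch below recomputes these quantities from the SS and DF values reported in the table; the function name is hypothetical, and small differences from the tabulated values come from rounding of the inputs.

```python
from typing import Tuple


def one_way_anova_f(ss_between: float, df_between: int,
                    ss_within: float, df_within: int) -> Tuple[float, float, float]:
    """Return (MS between, MS within, F) from sums of squares and degrees of freedom."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# SS and DF values as reported in the ANOVA table.
ms_treatment, ms_residual, f_stat = one_way_anova_f(0.009131, 4, 0.00226, 90)
```

The degrees of freedom are consistent with five groups of 19 values each: 5 − 1 = 4 between groups and 95 − 5 = 90 within groups.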
DTPSO + NN | WOA + NN | GWO + NN | GA + NN | PSO + NN | |
---|---|---|---|---|---|
Number of values | 19 | 19 | 19 | 19 | |
Mean | 0.9702 | 0.9765 | 0.9864 | 0.9792 | |
Minimum | 0.9551 | 0.956 | 0.969 | 0.9641 | |
Maximum | 0.981 | 0.986 | 0.989 | 0.981 | |
Range | 0.0259 | 0.03 | 0.02 | 0.0169 | |
25% percentile | 0.971 | 0.976 | 0.989 | 0.981 | |
Median | 0.971 | 0.976 | 0.989 | 0.981 | |
75% percentile | 0.971 | 0.976 | 0.989 | 0.981 | |
Std. error of mean | 0.001367 | 0.001425 | 0.001289 | 0.001015 | |
Std. deviation | 0.005961 | 0.006213 | 0.00562 | 0.004426 | |
Coefficient of variation | 0.6144% | 0.6362% | 0.5697% | 0.4520% |
A visual representation of the achieved results using the proposed approach is shown in
In this paper, we proposed a hybrid DTPSO optimizer that is used in conjunction with an NN classifier to choose the best subset of features for the COVID-19 classification problem. We employed DTO in the hybrid optimizer to achieve greater search space exploration over longer iterations and to balance exploration and exploitation. The proposed approach promotes population diversity and maximizes search efficiency, while the PSO method is used to explore for a larger number of iterations. A COVID-19 dataset is used to evaluate the consistency of the proposed optimizer and to verify that the proposed solution is dependable and stable, allowing its quality and efficacy to be assessed. The proposed approach could substantially improve COVID-19 diagnosis. We created a deep transfer learning system that analyzes chest X-ray images of patients with and without COVID-19 to diagnose the disease automatically. With the proposed classification model, COVID-19 can be identified with an accuracy of 99.88%. In the future, we intend to test the proposed approach on more complex datasets to see how it performs. We also plan to develop a parallel version of the DTPSO model.
Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R104), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia