Deep Convolutional Neural Network Approach for COVID-19 Detection

Coronavirus disease 2019 (Covid-19) is a life-threatening infectious disease caused by a newly discovered strain of coronavirus. As of the end of 2020, Covid-19 is still not fully understood, but as with other similar viruses, the main mode of transmission is believed to be through droplets from the coughs and sneezes of infected persons. The accurate detection of Covid-19 cases remains a challenge for scientists and physicians. The two main kinds of tests available for Covid-19 are viral tests, which tell whether a person is currently infected, and antibody tests, which tell whether a person has been infected previously. A routine Covid-19 test can take up to 2 days to complete, and serial testing is used to reduce the chance of false negative results. Medical image processing using chest X-ray images and Computed Tomography (CT) can help radiologists detect the virus, because imaging can reveal certain characteristic changes in the lung associated with Covid-19. In this paper, a deep learning model based on the Convolutional Neural Network is proposed to detect Covid-19 more accurately from chest X-ray scans by identifying structural abnormalities in the images. The proposed model is organized into three stages: dataset collection, data pre-processing, and training and classification.


Introduction
Since December 2019, the world has witnessed the tragic deaths of more than 2,158,761 people out of more than 100,200,107 registered cases of COVID-19 [1]. The disease is caused by a virus named SARS-CoV-2, or novel coronavirus, which belongs to the SARS-CoV family [2]. Initially registered in Wuhan, Hubei province, China, the exponential growth of positive cases worldwide in a short period of time, despite limited testing, shows the highly contagious nature of this virus [3]. It has also been noted that this infection can be transmitted to humans by vertebrates such as bats [4]. The infection causes severe damage to the lungs, resulting in pneumonia with symptoms such as sore throat, dry cough, sneezing and high temperature [5]. In addition, some patients have no symptoms; the fact that they are carriers of the virus is of concern to the World Health Organization (WHO), which declared a global health emergency and characterized the coronavirus outbreak as a pandemic [6,7]. While considerable effort is being made to find an effective cure for COVID-19, the main means of protection remain social distancing and lockdown. On the other hand, lockdown affects a nation's GDP and has a negative psychological impact on the well-being of individuals.
Rapid detection of the virus is therefore essential to make quick decisions and to take care of patients. This detection is carried out by means of the RT-PCR (Reverse Transcription Polymerase Chain Reaction) test, but also through the analysis of chest X-ray and CT (Computed Tomography) images by a specialist. These methods are time consuming and sometimes fail to detect patients early. Artificial Intelligence and neural networks offer many ways to help in the analysis and detection of patients with COVID-19. Several approaches have been used to address these problems, but the results show that considerable room for improvement remains in the quality of detection, in order to meet the crucial need for rapid and effective detection of patients suffering from this disease.
The current way of detecting COVID-19 is the Reverse Transcription Polymerase Chain Reaction (RT-PCR) test, for which lower and upper respiratory specimens such as nasal, sputum or nasal aspirate samples are collected from the person suspected of being infected with COVID-19. Unfortunately, these tests can fail and affect the accuracy of the diagnosis, which is a major inconvenience. In addition, the testing process is very time consuming and expensive, and the detection rate is very low. Because of these problems, repeated testing must be performed to obtain an accurate diagnosis [8,9].
In this paper, we aim to propose a Convolutional Neural Network approach to reduce the time and effort required to analyse the CT scans and X-rays of COVID-19-positive patients. This paper is divided into 6 sections. Section 2 presents the related works; the methodology used for the proposed framework is discussed in Section 3. Section 4 describes the experiments; Section 5 shows the experimental results and comparative analysis. The conclusion and future work are discussed in Section 6.

Related Works
Studies in the field of medicine have shown that X-ray images can be used to diagnose patients infected with COVID-19 [10]. However, it takes a radiologist to read and analyse these images in order to make decisions. On the other hand, several problems in medical science have already been tackled with the help of artificial intelligence, with convincing results for prediction as well as image-based disease classification [11][12][13][14]. It is therefore reasonable to expect that machine learning techniques could help in the decision-making process, which is crucial and must be rapid and accurate in the context of the COVID-19 pandemic.
Many works on the detection and diagnosis of COVID-19 infected patients using Artificial Intelligence algorithms have been proposed by researchers, using either X-ray or CT images. Wang et al. [15] proposed a deep learning-based AI diagnostic model using CT images and obtained an accuracy of 79.3%. Xu et al. [16] proposed a model that classifies normal, COVID-19 and Influenza-A viral pneumonia cases using CT images and reports an accuracy of 86.7%. These studies use non-public datasets for testing their AI-based models. Wang et al. [17] proposed an approach using a public dataset of SARS-CoV-2 X-ray images and achieved an accuracy of 92.4%. Other research using X-ray images has reached similar accuracy [18][19][20][21]. Another approach, using a deep learning model designed with Bayes optimization, was proposed by Ucar et al. [22]. To reduce data imbalance, the public dataset used was pre-augmented [23].

Methodology
A Convolutional Neural Network (CNN) is a technique that, like other deep learning techniques, reveals features that are hidden in the original data. It has been used effectively on images and video and has led to great advances in the medical field. Unlike a simple deep learning model, a CNN adds further layers to explore in depth the features embedded in the data. A CNN generally consists of convolution, activation and pooling steps, which can be repeated and interleaved, and a final classification step called the fully connected layer. Convolution steps use filters, small square windows that slide over the input and act as feature detectors, whose values are learned during the training process. The activation steps, generally placed between the convolution and pooling steps, pass the values obtained from the convolution through an activation function that emphasizes the important elements and produces the activation map as output. The pooling step, using a window, reduces the dimensionality of each feature map by discarding the less useful information. Finally, the fully connected layers are traditional Multi-Layer Perceptron (classification) steps that determine the class of the input [24].
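To make the convolution, activation and pooling steps described above concrete, the following is a minimal pure-Python sketch on a toy input. It is illustrative only and is not the network used in this paper: the 5 × 5 image, the 2 × 2 filter values, the choice of ReLU as activation function and the 2 × 2 max-pooling window are all assumptions made for the example. In a real CNN the filter values are learned during training.

```python
# Toy illustration of convolution -> activation -> pooling.
# All values below are made up for the example (assumptions).

def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    As in most deep learning libraries, the kernel is not flipped,
    so this is technically a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu(feature_map):
    """Activation step: negative responses are set to zero."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool(feature_map, size=2):
    """Pooling step: keep the maximum of each non-overlapping window."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + a][j * size + b]
                for a in range(size) for b in range(size))
            for j in range(cols)
        ]
        for i in range(rows)
    ]

image = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 0],
    [2, 1, 0, 1, 1],
    [0, 3, 1, 0, 2],
]
kernel = [[1, -1], [-1, 1]]  # hypothetical filter, normally learned

feature_map = convolve2d(image, kernel)   # 4 x 4 map of filter responses
activation_map = relu(feature_map)        # same size, negatives removed
pooled = max_pool(activation_map)         # 2 x 2 map after pooling
print(pooled)
```

The pooled map would then be flattened and passed to the fully connected layers for classification.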
Our proposed deep learning-based COVID-19 detection approach comprises several phases, as illustrated in Fig. 1. The phases are summarized in the following four steps:
Step 1: Collect the chest X-ray images of COVID-19 patients and healthy persons.
Step 2: Use data augmentation to generate 3 times more chest X-ray images per class.
Step 3: For each class, divide the data into three sets: training set (81%), validation set (9%) and testing set (10%).
Step 4: Evaluate the performance of the model with accuracy, precision, recall and F-measure.
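The 81% / 9% / 10% proportions of Step 3 are what results from two successive 90/10 splits (0.9 × 0.9 = 0.81 and 0.9 × 0.1 = 0.09). The paper does not state that the split was produced this way, so the sketch below is an assumption; the function name `split_dataset`, its parameters and the fixed seed are hypothetical and chosen only for illustration.

```python
import random

def split_dataset(items, test_frac=0.10, val_frac=0.10, seed=42):
    """Split one class into training, validation and testing sets.

    Assumption: test_frac of the data is held out for testing first,
    then val_frac of the remainder is held out for validation.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle

    n_test = round(len(shuffled) * test_frac)
    test_set, rest = shuffled[:n_test], shuffled[n_test:]

    n_val = round(len(rest) * val_frac)
    val_set, train_set = rest[:n_val], rest[n_val:]
    return train_set, val_set, test_set

# Example with 100 placeholder image identifiers for a single class.
train_set, val_set, test_set = split_dataset(range(100))
print(len(train_set), len(val_set), len(test_set))
```

Applying the function to each class separately keeps the class proportions the same in the three sets.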

Dataset Description
For our experiments, we mainly used the public dataset obtained from a worldwide collection by Cohen et al. [25]. This dataset consists of chest X-ray and CT images of patients who are positive for or suspected of COVID-19 (Figs. 3, 4) or other viral and bacterial pneumonias (MERS, SARS and ARDS). Data are collected from public sources as well as through indirect collection from hospitals and physicians [26]. The dataset currently contains around 504 X-ray images of COVID-19 positive patients and 866 images in total. The GitHub repository associated with the work collects images from websites such as radiopaedia.org, sirm.org, eurorad.org and coronacases.org. It is also open for contributions, and all new images are rigorously annotated according to the Posterior Anterior (PA), Anterior Posterior (AP) and Anterior Posterior Supine (AP Supine) views of the lungs. Additional patient information is also provided, such as: patient id, number of days since the start of symptoms, sex, age, type of pneumonia, RT_PCR_positive, survival, whether the patient was intubated, whether the patient was in the ICU (intensive care unit) or CCU (critical care unit), whether the patient needed supplemental O2, whether the patient was successfully extubated, temperature, pO2 saturation, leukocyte_count, neutrophil_count, lymphocyte_count, modality (CT, X-ray, or something else), date on which the image was acquired, 'Hospital name, city, state, country', the filename, Digital Object Identifier (DOI) of the research article, URL of the paper or website where the image came from, the license of the image such as

Data Pre-Processing
In AI, data augmentation is used to increase the size and diversity of labelled training sets by generating different iterations of the samples in a data set [27]. In machine learning, data augmentation is used to address class imbalance, reduce overfitting in deep learning and improve convergence [28]. For this purpose, we resized the images from an average of 2437 × 2806 × 3 pixels to 224 × 224 × 3 pixels. We also flipped the images and then applied the listed augmentation techniques to both the original and the flipped images.
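The resizing and flipping operations described above can be sketched as follows. This is a simplified pure-Python illustration, not the pipeline used in the paper (which relies on Keras' ImageDataGenerator, whose exact settings are not given here). The nearest-neighbour interpolation, the horizontal direction of the flip, the single-channel toy image and the function names are all assumptions.

```python
def resize_nearest(image, new_h, new_w):
    """Resize a 2-D image with nearest-neighbour sampling (assumed method)."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]

def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]

# Toy 4 x 4 single-channel image standing in for a chest X-ray.
original = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]

small = resize_nearest(original, 2, 2)  # stands in for the 224 x 224 resize
flipped = flip_horizontal(small)

# Both versions are kept, so further augmentations can be applied to each.
augmented = [small, flipped]
print(augmented)
```

For real images, the same two operations would be applied to each of the three colour channels.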

Evaluation Metrics
The results of the experiments were evaluated using precision, recall, false alarm rate, true negative rate, F-measure and accuracy. The mathematical equations for these measures are given in Eqs. (1)-(6), respectively.
Figure 3: X-ray images without COVID-19
Figure 4: X-ray images with COVID-19
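As an illustration, the six evaluation measures named in this section can be computed from the four cells of a binary confusion matrix using their standard definitions. The sketch below assumes those standard definitions; the function name and the counts in the example are hypothetical and are not results from this study.

```python
def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a denominator is zero."""
    return numerator / denominator if denominator else 0.0

def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.

    tp/fp: true/false positives, tn/fn: true/false negatives.
    """
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "false_alarm_rate": safe_div(fp, fp + tn),
        "true_negative_rate": safe_div(tn, tn + fp),
        "f_measure": safe_div(2 * precision * recall, precision + recall),
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
    }

# Hypothetical counts, chosen only to exercise the function.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that the false alarm rate and the true negative rate always sum to 1, since both are computed over the negative class.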

Environment
All the experiments were performed in a Python environment running on a Linux Manjaro Mikah workstation with 16 GB of RAM, an Intel® Core™ i7-8550U CPU @ 1.80GHz × 8 processor and a GeForce MX130 GPU with 2048 MB of memory, using the following tools: ImageDataGenerator, Keras, TensorFlow, Matplotlib, Pandas and Scikit.

Results
Experimental results show that the proposed model performs well on training, as shown in Fig. 5. Despite the results shown in Tab. 4, we believe that accuracy can be further improved by pre-cleaning the data set. The confusion matrix obtained is given in Tab. 3. Reading the confusion matrix, all 30 COVID-19 patients are correctly classified, with 0 misclassified; for the normal patients, 28 are classified correctly and 2 are misclassified.

Conclusion
The contribution of this study is the development of a machine-learning model for the rapid, real-time detection of coronavirus-infected persons. This model can work as an automated tool to assist medical professionals in improving the accuracy of COVID-19 diagnosis. Moreover, the method could be applied to other medical problems based on image analysis. Based on the experimental results, it can be concluded that the proposed CNN model compares favourably with the other models, given the good results it produces (accuracy in the 85-95% range). This result supports the application of Deep Neural Networks to the detection of COVID-19; the results could be much improved with a larger amount of data that would not need to be artificially augmented. Our future work will aim at improving feature selection after the convolutional layers by using Particle Swarm Optimization (PSO) on the one hand, and at diversifying our data sources on the other hand.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.