A Novel Technique for Early Detection of COVID-19

Abstract: COVID-19 is a global pandemic disease caused by a dangerous coronavirus, and it spreads aggressively through close contact with infected people and contaminated objects. So far, there is no prescribed line of treatment for COVID-19 patients. Measures to control the disease are very limited, partly due to a lack of knowledge about technologies that could be used effectively for early detection and control of the disease. Early detection of positive cases is critical for preventing further spread, achieving herd immunity, and saving lives. Unfortunately, we do not yet have effective toolkits for detecting the disease at a very early stage. Recent research findings have suggested that radiology images, such as X-rays, contain significant information for detecting the presence of the COVID-19 virus in its early stages. However, detecting the disease at a very early stage from X-ray images with the naked eye is not possible. Artificial Intelligence (AI) techniques, machine learning in particular, are known to be very helpful in accurately diagnosing many diseases from radiology images. This paper proposes an automatic technique to classify COVID-19 patients from their computerized tomography (CT) scan images. The technique, known as the Advanced Inception based Recurrent Residual Convolution Neural Network (AIRRCNN), uses machine learning techniques for classifying data. We focus on the AIRRCNN because we have not found it used in the literature. We also apply principal component analysis for dimensionality reduction. Experimental results show that our method achieves an accuracy of about 99%, indicating that it is highly effective.

For several years, computed tomography (CT) scans, also known as computerized axial tomography (CAT) scans, have been assisting physicians in diagnosing different strains of coronavirus [6,7]. The rapid spread of coronavirus has created a shortage of physicians and radiologists. It is therefore critical to close this widening gap by developing automatic methods for detecting the virus. Disease prognosis from CT scans is a complex and challenging task. Its complexity increases if the patient's lungs are inflamed, which requires manual visualization that is prone to errors. Some researchers have used Machine Learning (ML) techniques to detect coronavirus from CT scan images of patients' chests. The study in [8] used Logistic Regression (LR) to classify coronavirus from clinical laboratory features. Other researchers [9,10] used a Random Forest (RF) classifier for coronavirus classification from handcrafted features. Deep Learning Techniques (DLT) have also been proposed for diagnosing coronavirus. In [11], features of CT images were represented using DLT and then classified using decision trees and AdaBoost (AB) based on the DLT's learning. The study in [12] mapped CT scans to a label space using an end-to-end network for classifying coronavirus.
In this paper, we propose to use the Advanced Inception based Recurrent Residual Convolution Neural Network (AIRRCNN) for classifying coronavirus from CT scan images after preprocessing the patients' images. This method uses multiple ML techniques for efficient classification of coronavirus. We shall also demonstrate the effectiveness of the proposed method.
AI techniques provide a means of early detection of the virus, and hence contribute to saving precious lives and preventing further spread of the virus [13]. The use of AI and data analytics tools in early prediction creates challenges for researchers [14]. Virus spread is also checked by observing social distancing [15]. There are many software startups addressing the problems of pandemic outbreaks in countries such as Canada, Australia, the USA, and some European countries. In all these developments and implementations, the use of AI is central to the response to the COVID-19 pandemic [16]. AI is the enabling field for developing robotic solutions, natural language processing, sensory solutions, expert systems, decision-making tools, and traffic management. Due to these extraordinary features, AI methods are preferred in technological solutions. This includes business development, which receives many innovations from AI applications [17].
Unfortunately, major companies have still not invested in AI-based tools for corona prediction. To fill this gap, this article contributes methods for detecting coronavirus automatically with a Recurrent Residual CNN algorithm, which uses AI. The details are as follows.
1. First, the input images are processed using multiple feature selection techniques.
2. Then, the advanced recurrent based CNN is applied to detect the infection in the X-ray images.
3. After that, a recurrent operation is performed using kernels in each convolution layer.
The kernel evolution yields more accurate prediction and classification of infected images.
In Section 2, we review the literature on detecting diseases using X-ray images and neural networks. In Section 3, we define our research problem, and in Section 4, we present the research methodology and experiments. Section 5 provides the evaluation results, and the conclusions are given in Section 6.

Literature Review
This section describes studies on coronavirus identification. Convolutional Neural Networks (CNN) were used in [18] to automate the identification of coronavirus features for pathogenic classification. Their results showed an accuracy of 83% on a dataset of more than 200 cases. Another study [19] proposed an automated framework in which a CNN extracts coronavirus features and differentiates them from those of pneumonia. It achieved an overall accuracy of 96% on a dataset of 400 CT scan images. The model proposed in [12] is also able to distinguish influenza-A viral pneumonia from CTs of healthy lungs, with an overall accuracy of 87% on a dataset of 618 images. A dataset of 272 images, mixing pneumonia, corona and healthy lung images, was used in [20]. Their model classified coronavirus and pneumonia infections with 95% accuracy.
A coronavirus diagnostic tool using DLT was proposed in [21], where a pre-trained UNet created 3D segments from CT lung areas. A DNN was then used to detect coronavirus from the segmented 3D images. Working on 499 images with weak labels, the tool achieved an area of 0.959 under the ROC curve and 0.976 under the Precision-Recall curve. The study in [22] investigated CTs for corona using DLTs. The tool's objective was to expedite clinical investigations of coronavirus, and it demonstrated an accuracy of 89.5%, a specificity of 0.88 and a sensitivity of 0.87 when validated. When AI and ML techniques are combined to analyse radiographic images, the results can be very efficient and accurate for coronavirus identification [23].
Authors in [24] have developed a tool with AI capabilities that may be used to predict, on initial presentation, which patients are at risk of more severe cases of COVID-19. The predictive model is said to have achieved 70% to 80% accuracy in predicting cases. Research in [25] has identified seven significant applications of AI for the COVID-19 pandemic. The study in [26] discusses the importance of CT scan and X-ray images for detecting coronavirus in its early stages, and proposes an automatic diagnosis system for the purpose. Different experiments have returned accuracy levels between 95 and 98.2 percent.
Authors in [27] have analysed the achievements of state-of-the-art neural network architectures in medical image recognition, and have proposed a model for the purpose. Their model is said to have achieved 96.73% accuracy. In [28], the authors have provided two techniques, namely Support Vector Machine and Conventional Neural Networks, with accuracies on chest X-rays of 84% and 75%, respectively. Research in [29] has proposed a method for detecting coronavirus at an early stage using an artificial neural network (ANN) and a support vector machine (SVM). The method is claimed to have an accuracy of 98.2 percent in experiments.
Although many proposals have been offered for early detection of the virus by means of ML techniques, their implementations are yet to be benchmarked for reliability and complexity in detecting the coronavirus. Our research in this paper attempts to address gaps in the benchmarking of coronavirus identification. We shall provide an automated model for diagnosing coronavirus from CTs.

Research Problem Definition
Within six months of the first recorded incidence of coronavirus in Wuhan, China [30], there was a shortage of hospital beds and ambulances. Healthcare management could not handle the growing number of cases. Even social distancing, mask wearing, and regular sanitization could not arrest the spread of the virus, mainly due to a delayed action plan. Many patients as well as healthcare workers lost their lives for a range of reasons (shortage of drugs, oxygen, and hospital beds). Fig. 1 shows the spread of the virus under different conditions, which was indeed at its maximum without protective measures.

Figure 1: Graphical representation of coronavirus treatment
Coronavirus outbreak and its subsequent spreads can also be approximated with a mathematical model using four parameters.
• Fundamental reproductive number (N_o), representing the new infections caused by one infected person
• Fatality rate (F_r), representing the proportion of people with symptoms who die
• Incubation period (p), representing the time between infection and symptoms
• Duration of the disease (x), representing the time until recovery or death
Corona cases can be predicted using N_o, the number of people infected by one infected person, and p, the incubation period. Assume that one incubation period (p) produces N_o new cases; then the cumulative number of cases would be 1 + N_o. Two incubation periods (2p) would result in 1 + N_o + N_o^2 cases, taking into account the N_o^2 cases generated by the previous N_o cases. Assuming that the number of cases predicted on a day (day_{t.p}) is Q_{t.p}, the total number of cases can be expressed as Eq. (1).
In (1), Q represents the total number of predicted cases, Q_{t.p} the number of cases on day_{t.p}, and t the number of incubation periods elapsed. Tab. 1 provides the results of computing Eq. (1) with N_o = 3 and p = 5. To predict coronavirus related fatalities, let us assume that after x days the affected people have either died or recovered. Let F_r be the percentage of deaths, 1 − F_r the percentage of recoveries, and day_{t.p+x} the day on which the deaths are counted. Then deaths can be predicted using Eqs. (2) and (3).
In Eqs. (2) and (3), M_{t.p+x} represents the number of deaths predicted on day_{t.p+x}, M the total number of predicted deaths, and t the number of incubation periods elapsed. Tab. 2 lists the values computed using Eqs. (2) and (3) with N_o = 3, p = 5 days, F_r = 10% and x = 14 days.
Tab. 2 can be interpreted as follows. Day 0 cases are resolved after 15 days. The single case on day 0 would lead to 0.10 deaths (1 × 10%) and 0.90 recoveries from the disease. The three cases of corona on day 5 will result in 0.3 deaths and 2.7 recoveries on day 19. The number x is obtained from patients' epidemiological reports. The optimal value of F_r would be attained if N_o, p and x were optimal, and it can be determined with combinations of F_r, N_o and p. Thus x can give the predicted deaths. It can also be inferred that coronavirus would spread very fast in the absence of interventions. Fig. 2 represents how fast the spread could be when early, accurate prediction is lacking. We also know that not all patients may be correctly identified, due to technical error. Accurate prediction is therefore necessary. If we mistakenly miss any COVID-19 patients, not only might they suffer, but they might also cause suffering and death to others. So, the role of technological development in accurate prediction is very important in diagnosing the coronavirus.
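The growth model behind Eqs. (1) to (3) can be sketched in a few lines of Python. This is a minimal illustration, not the computation used in the study. It assumes that incubation period t contributes N_o^t new cases (the 1 + N_o + N_o^2 pattern described above) and that a fraction F_r of each cohort dies x days after it appears. The function names `predict_cases` and `predict_deaths` are hypothetical.

```python
def predict_cases(n_o, periods):
    """Return (new cases per incubation period, cumulative total).

    Assumption: period t contributes n_o ** t new cases, which matches
    the 1 + N_o + N_o^2 + ... pattern described in the text.
    """
    per_period = [n_o ** t for t in range(periods + 1)]
    return per_period, sum(per_period)


def predict_deaths(n_o, periods, p, x, fatality_rate):
    """Return ({day: predicted deaths}, total predicted deaths).

    Assumption: cases appearing on day t * p resolve x days later, with a
    fraction fatality_rate dying and the remainder recovering.
    """
    per_period, _ = predict_cases(n_o, periods)
    deaths_by_day = {t * p + x: fatality_rate * q
                     for t, q in enumerate(per_period)}
    return deaths_by_day, sum(deaths_by_day.values())
```

Calling `predict_deaths(3, 2, 5, 14, 0.10)` assigns 0.3 predicted deaths to day 19, which agrees with the interpretation of Tab. 2 given above for the three cases of day 5.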

Research Methodology and Experiments
Here we present and describe the main contributions of our research. The flow of the research is shown in Fig. 3. It begins with an input X-ray image, which is pre-processed with fuzzy-based pre-processing and dimensionality reduction. The data is then fed to a recurrent based artificial neural network for prediction.

Fuzzy Based Pre-Processing and Stacking
Incoherent Color (IC) algorithms play a significant role in image analysis. Their outcomes depend entirely on the similarity/difference functions used for separating colours and on the degree of uncertainty. In these techniques, an input image is treated as three colour channels, namely red, green, and blue (RGB), and a single variable is produced as output. The inputs and outputs are produced based on training data [31]. An IC technique separates the original image into blurred windows, where each image pixel has a degree of membership in a window, calculated from the distance between the pixel and the window. In the final output, the image of each window is blurred according to its weight. The images are then summed, and an output image is generated from the average values with respect to the membership degrees [32]. This work has used Python to recreate the original using the IC technique [33]. Fig. 4 shows the sample structure of the pre-processing step for the input image. The pixel-level output, obtained after eliminating colour intensity, is stacked with the help of an image processing technique that combines different focal distances with the original image. Stacking improves image quality and thereby eliminates noise. The two pre-processed images with the lowest noise level are compared with the original image in the same row (the 2nd row of the input image). Finally, the input image is divided into two images, a background image and an overlay image, where the former is processed and the latter is overlaid. After these settings, the contrast, opacity, brightness, and combination ratio of the two images are compared. Accurate ratios result in accurate noise reduction [34]. The final output is a high quality image.

Principal Component Analysis Based Dimensionality Reduction
Any image processing technique processes a large number of pixels, that is, high dimensional data. Dimensionality reduction helps in processing the important image features in a short span of time. Our method uses the doc2vec technique for feature generation, and Principal Component Analysis (PCA) then reduces the image features (dimensionality reduction) [35]. PCA reduces dimensionality linearly and extracts only the dominant features. This results in a lower-dimensional representation of the image that retains maximum variance.
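As an illustration of how PCA extracts the dominant directions, the following minimal sketch projects feature vectors onto their top k principal components. It is written in plain Python and uses power iteration with deflation, which is an illustrative choice and not necessarily the implementation used in this work. The function name `pca` and its parameters are hypothetical, and the input is assumed to be a list of equal-length numeric feature vectors.

```python
import math
import random


def pca(rows, k, iters=200, seed=0):
    """Project rows onto their top-k principal components.

    rows: list of equal-length lists of floats (at least two rows).
    Returns (projected rows, list of unit-length component vectors).
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    rng = random.Random(seed)
    components = []
    for _ in range(k):
        # Random unit-length start vector for power iteration.
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(val * val for val in v))
        v = [val / norm for val in v]
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(val * val for val in w))
            if norm < 1e-12:  # no variance left in the deflated matrix
                break
            v = [val / norm for val in w]
        eigenvalue = sum(v[a] * cov[a][b] * v[b]
                         for a in range(d) for b in range(d))
        components.append(v)
        # Deflation: remove the variance explained by this component.
        cov = [[cov[a][b] - eigenvalue * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    projected = [[sum(c[j] * comp[j] for j in range(d)) for comp in components]
                 for c in centered]
    return projected, components
```

For points that lie on the line y = 2x, the first component returned by `pca(rows, 1)` is parallel to (1, 2), so a single coordinate per sample retains all of the variance.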

Feature Selection Based on Multilayered Feature Subset Selection Technique
Assume a dataset X of size (N × M), where N is the number of samples and M is the number of features. The feature set is represented by F = {f_1, f_2, ..., f_M}, and the set of class labels is represented by C, where C = {−1, 1}. This method uses the Multilayered Feature Subset Selection Technique (MLFSST) for feature selection, in which feature weights are calculated by grouping features into subsets, and a termination condition is defined. Within a subset, certain features might be discriminative. In order to increase accuracy, their weights are increased for the next subset selection. The weight of a feature f on layer l is computed using Eqs. (4) and (5).
In Eq. (4), ft_{l,m} denotes the mth subset on the lth layer, the accuracy term is the classification accuracy obtained with the features of ft_{l,m} on layer l, and w_{l−1,f} is the weight of feature f on the previous layer l − 1. The number of subsets generated on layer l is equal to the number of features M.
In (5), each subset ft_{l,m} (1 ≤ m ≤ M) includes ls features (the subset length). The first-layer subsets do not contain duplicate features, and all features occur with equal frequency. Subsequent layers construct M subsets using the revised feature weights. Also, the α and p values help to prevent the program from getting stuck in a local optimum, and learning draws on data from previous layers. Eq. (6) describes the probability of feature f being chosen on layer l as follows.
Eq. (6) indicates that all features have an equal probability of being selected on the 1st layer. Revising the feature weights causes features with higher weights to be selected in more subsets. Multiple layers help in achieving higher performance, but the computational time also increases proportionally. The algorithm terminates when the accuracy of the top T features reaches 100%, or otherwise when the number of layers reaches L. This research has used a value of 20 for both T and L as a trade-off between running time and stability. The MLFSST algorithm can be found in [20]. Here, we provide its modified version for our use.
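The weight-driven subset construction described above can be illustrated with a small sketch. It only shows the sampling idea, namely that a feature is chosen with probability proportional to its current weight, so equal weights on the first layer give equal selection probabilities. The weight update of Eqs. (4) and (5) and the α and p terms are not reproduced. All function names are hypothetical, the weights are assumed to be strictly positive, and random sampling only approximates the equal-frequency property of the first layer.

```python
import random


def selection_probabilities(weights):
    """Normalize positive feature weights into selection probabilities."""
    total = sum(weights)
    return [w / total for w in weights]


def sample_subset(weights, subset_len, rng):
    """Draw subset_len distinct feature indices, favouring higher weights."""
    candidates = list(range(len(weights)))
    chosen = []
    for _ in range(subset_len):
        pick = rng.choices(candidates,
                           weights=[weights[i] for i in candidates],
                           k=1)[0]
        chosen.append(pick)
        candidates.remove(pick)  # no duplicate features within a subset
    return chosen


def build_layer(weights, subset_len, seed=0):
    """Build one layer of M subsets, where M is the number of features."""
    rng = random.Random(seed)
    return [sample_subset(weights, subset_len, rng) for _ in weights]
```

On later layers, passing the revised weights to `build_layer` makes features with higher weights appear in more subsets, which is the behaviour described in the text.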

Classifying COVID Data Using AIRRCNN
This research proposes AIRRCNN for improving classification accuracy in coronavirus detection. The proposed model uses fewer parameters in its computations than other Deep Learning Techniques (DLTs). The residual units in this model use the Inception-v4 network described in [36]. It is a form of DLT that combines the outputs of convolution operations with different sized kernels, and improves overall recognition accuracy. The architecture of AIRRCNN is shown in Fig. 5: it consists of several convolution layers, transition blocks, and IRRCNN blocks, with a softmax output layer.

Figure 5: AIRRCNN architecture
A significant feature of the architecture is the IRRCNN block with its inception units, recurrent convolutional layers (RCLs) and residual units. The RCLs are applied to the inputs passing from the previous layer through the inception sections. The output of the inception sections is added to the input of the IRRCNN block. The recurrent operation in the convolution section is performed by the inception unit with different sized kernels. In this recursive operation, the output of the current time step is added to the outputs and fed back into the layer. For k = 2, three RCLs are included in an AIRRCNN block, where the input and output dimensions do not change, and feature maps are accumulated with respect to the time steps. This provides stronger features, which give better classification accuracy. The RCL operations are performed in discrete time steps, and are expressed based on RCNN [23]. Assume that a sample x_l is the input to the lth layer of the AIRRCNN block, that the pixel located at (i, j) is the input sample on the kth feature map of the RCL, and that O^l_ijk(t) is the network output at time step t. The output can be expressed as Eqs. (7) and (8).
In Eq. (7), w^f_k and w^r_k are the weights of the kth feature map for the feed-forward convolutional layer and the RCL, respectively, and b_k is the bias.
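To make the recurrent operation concrete, the following toy sketch unrolls a recurrent convolutional layer on a 1-D signal. It is a simplified illustration and not the AIRRCNN implementation. It assumes one feature map, zero padding, a ReLU activation, and an output at each step that combines a feed-forward convolution of the input (kernel `w_f`) with a recurrent convolution of the previous output (kernel `w_r`) plus a bias. All names are hypothetical.

```python
def relu(value):
    """Rectified Linear Unit."""
    return max(0.0, value)


def conv1d_same(signal, kernel):
    """Zero-padded 1-D correlation with an odd-length kernel.

    The output has the same length as the input.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def recurrent_conv_layer(x, w_f, w_r, bias, steps):
    """Unroll an RCL for the given number of recurrent steps.

    state(0) = relu(conv(x, w_f) + bias)
    state(t) = relu(conv(x, w_f) + conv(state(t - 1), w_r) + bias)
    """
    feed_forward = conv1d_same(x, w_f)
    state = [relu(v + bias) for v in feed_forward]
    for _ in range(steps):
        recurrent = conv1d_same(state, w_r)
        state = [relu(f + r + bias) for f, r in zip(feed_forward, recurrent)]
    return state
```

The input and output lengths are identical at every step, which mirrors the statement that the input and output dimensions of the block do not change while features accumulate over the time steps.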
The function f is the standard Rectified Linear Unit (ReLU) activation. The proposed model's performance was also explored with the Exponential Linear Unit (ELU) activation function. The outputs y of the inception unit for the different kernel sizes and the average pooling layer are denoted y_1x1(x), y_3x3(x), and y^P_1x1(x). The final output of the IRCNN unit, defined as F(x_l, w_l), can be expressed as Eq. (9).
In Eq. (9), the operator denotes concatenation along the channel (feature map) axis. The IRCNN outputs are then added to the AIRRCNN inputs at the block level, and the AIRRCNN residual operation can be expressed as shown in Eq. (10).
In Eq. (10), x_{l+1} is the input to the next transition block, x_l is the input sample of the AIRRCNN block, w_l are the kernel weights of the lth AIRRCNN block, and F(x_l, w_l) is the output of the lth layer of the IRCNN unit. The number of feature maps and their dimensions are the same in the residual unit of the AIRRCNN block, as shown in Fig. 3. Batch normalization is then applied to the AIRRCNN outputs [37], which are fed as input to the next transition block. In the transition block, operations such as convolution, dropout, and pooling are performed, depending on the position of the block in the network. Inception units can be avoided in smaller implementations but cannot be avoided in large scale implementations [38]. Down-sampling in the transition blocks is performed by max-pooling with a 3 × 3 patch and a 2 × 2 stride. Non-overlapping max-pooling has a negative impact on model regularization. Hence, overlapping max-pooling is used for network regularization in this study [37]. Late pooling operations increase the non-linearity of the network features and produce high-dimensional feature maps, which are then passed through the convolution layers of the network.
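Two operations from this paragraph are easy to illustrate in isolation: the residual addition, in which the block input is added element-wise to the unit output F(x_l, w_l), and overlapping max-pooling with a 3 × 3 patch and a stride of 2. The sketch below is a plain-Python illustration on small lists, with no padding assumed for the pooling. The function names are hypothetical and the code is not the implementation used in this work.

```python
def residual_add(x, fx):
    """Element-wise residual connection: returns x + F(x).

    Both inputs must have the same length, reflecting the requirement
    that the feature map dimensions match in the residual unit.
    """
    if len(x) != len(fx):
        raise ValueError("input and unit output must have the same size")
    return [a + b for a, b in zip(x, fx)]


def max_pool2d(image, size=3, stride=2):
    """Max-pooling over size x size patches without padding.

    With size=3 and stride=2, neighbouring patches share one row or
    column, so the pooling regions overlap.
    """
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            patch = [image[r + dr][c + dc]
                     for dr in range(size) for dc in range(size)]
            pooled_row.append(max(patch))
        pooled.append(pooled_row)
    return pooled
```

On a 5 × 5 input, `max_pool2d` returns a 2 × 2 map, and the middle row and column of the input contribute to more than one pooling patch.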
The implementation of the model in this work used 1 × 1 and 3 × 3 convolution filters to reduce the number of network parameters. The 1 × 1 filter increases the non-linearity of the decision function without affecting the convolution layer. Moreover, as the size remains constant in the proposed AIRRCNN blocks, a linear projection of the same dimension is used. Further non-linearity is introduced by the ReLU and ELU activations. The transition block uses a dropout of 0.5 in each convolution layer in the network. Softmax was used at the final output layer. For an input sample x, a weight vector W, and K distinct linear functions, the softmax operation, which is a normalized exponential function, can be defined for the ith class as Eq. (11):
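The normalized exponential function can be sketched as follows. The example assumes one weight vector per class, so that each of the K linear functions is the dot product of x with that class's weights; this layout and the function names are illustrative assumptions. Subtracting the maximum score before exponentiation is a common numerical safeguard and does not change the result.

```python
import math


def softmax(scores):
    """Normalized exponential of a list of scores; the outputs sum to 1."""
    shift = max(scores)  # guards against overflow, result is unchanged
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_probabilities(x, weight_rows):
    """Class probabilities for sample x.

    weight_rows holds one weight vector per class (an assumed layout);
    each class score is the dot product of x with that vector.
    """
    scores = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in weight_rows]
    return softmax(scores)
```

The class with the largest linear score receives the largest probability, and the probabilities over the K classes always sum to one.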

Results of Experiments
In this section, we provide details of the application of the proposed techniques, MLFSST (feature selection) and AIRRCNN (classification), to the COVID_CT dataset [39,40], with the necessary figures and tables. The database is an internet based dataset with CTs of patients. It includes a wide range of corona samples, with MERS, ARDS, and SARS cases mixed in. Experiments were conducted on fifty images, twenty-five of them normal and the other twenty-five corona positive. The resolution of the images ranged between 700 and 3342 pixels. The techniques were evaluated using the performance measures of error, accuracy, precision, sensitivity, and F-measure. Fig. 6 depicts two sample images from the dataset used [41]. The formulas based on the confusion matrix are provided in Tab. 4 [42].
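The performance measures listed above follow from the four entries of a binary confusion matrix. The sketch below uses the standard textbook definitions, which are assumed here to correspond to the formulas of Tab. 4. The function name is hypothetical, and the counts in the usage note are made-up values for illustration only, not results from this study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard measures from confusion-matrix counts.

    tp, fp, fn, tn: true positives, false positives, false negatives
    and true negatives.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    denominator = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "error": 1.0 - accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f_measure": f_measure,
    }
```

For example, with the arbitrary counts tp = 8, fp = 2, fn = 1 and tn = 9, the accuracy is 17/20 = 0.85, the precision is 8/10 = 0.8, and the sensitivity is 8/9.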
The comparative performance of the classifiers for detecting corona from CT images is listed in Tab. 5, which also includes time consumption as a cost criterion. We now draw bar charts using the data from Tab. 5, and compare the performance metrics of the proposed AIRRCNN technique with those of the other techniques. From Fig. 7, we conclude that the proposed AIRRCNN classifier provides a higher Classification Accuracy of 99.1%, whereas the existing KNN algorithm provides 97%, the RBF algorithm 97.9%, the SVM (linear) algorithm 98.8%, the Neural Network algorithm 98.8% and the Naive Bayes algorithm 97.1%. From Fig. 8, we infer that the proposed AIRRCNN classifier provides a higher Accuracy of 98.8%, whereas the existing KNN algorithm provides 94%, the RBF algorithm 96%, the SVM (linear) algorithm 98.3%, the Neural Network algorithm 98.7% and the Naive Bayes algorithm 97.3%. From Fig. 9, we conclude that the proposed AIRRCNN classifier provides a higher F-Score of 0.988, whereas the existing KNN algorithm provides 0.939, the RBF algorithm 0.96, the SVM (linear) algorithm 0.98, the Neural Network algorithm 0.987 and the Naive Bayes algorithm 0.973.
From Fig. 10, it is concluded that the proposed AIRRCNN classifier provides a higher Precision of 99.2%, whereas the existing KNN algorithm provides 94%, the RBF algorithm 96.2%, the SVM (linear) algorithm 98.1%, the Neural Network algorithm 98.7% and the Naive Bayes algorithm 97.5%. From Fig. 11, it is evident that the proposed AIRRCNN classifier provides a higher Recall of 0.994, whereas the existing KNN algorithm provides 0.94, the RBF algorithm 0.96, the SVM (linear) algorithm 0.98, the Neural Network algorithm 0.987 and the Naive Bayes algorithm 0.973. From Fig. 12, it can be concluded that the proposed AIRRCNN classifier consumes less time, 0.10 s, whereas the existing KNN algorithm takes 0.9 s, the RBF algorithm 0.16 s, the SVM (linear) algorithm 0.19 s, the Neural Network algorithm 0.55 s and the Naive Bayes algorithm 0.14 s. The above comparative analysis shows that the proposed scheme performs better than all the other existing schemes.

Conclusions
Corona is a dangerous virus, capable of killing many people if not managed properly. With over two million fatalities, it has become a lethal global pandemic. Most countries are struggling to contain its spread effectively. Many of them do not have an adequate number of hospital beds, ambulances, doctors, healthcare workers, or recommended drugs to treat COVID-19 and its variants. Some countries do not even have enough diagnostic facilities and oxygen. Automating coronavirus identification from CT images is a very good option, which can help physicians to identify the virus in the early stages of treatment. This paper has proposed, implemented and validated an automatic, implementable DLT for COVID-19 identification. The proposed scheme has demonstrated its validity and efficiency in tracing corona.
The proposed AIRRCNN classifies coronavirus infected patients from CT images with higher precision of 99.2%, compared to other existing algorithms. Moreover, its results show better performances in comparison to other ML techniques. It can be concluded that the proposed AIRRCNN is an implementable and viable technique for detecting coronavirus from CT Images.

Conflicts of Interest:
The authors declare that they have no conflicts of interest to report regarding the present study.