Approach for Training Quantum Neural Network to Predict Severity of COVID-19 in Patients

Currently, COVID-19 is spreading all over the world and profoundly impacting people’s lives and economic activities. In this paper, a novel approach called the COVID-19 Quantum Neural Network (CQNN) for predicting the severity of COVID-19 in patients is proposed. It consists of two phases: In the first, the most distinct subset of features in a dataset is identified using a Quick Reduct Feature Selection (QRFS) method to improve its classification performance; and, in the second, machine learning is used to train the quantum neural network to classify the risk. It is found that patients’ serial blood counts (their numbers of lymphocytes from days 1 to 15 after admission to hospital) are associated with relapse rates and evaluations of COVID-19 infections. Accordingly, the severity of COVID-19 is classified in two categories, serious and non-serious. The experimental results indicate that the proposed CQNN’s prediction approach outperforms those of other classification algorithms and its high accuracy confirms its effectiveness.


Introduction
Towards the end of 2019, a new coronavirus disease (COVID-19) emerged in China and spread quickly across the globe, aided by modern means of transport. To date, it has claimed hundreds of thousands of lives in China and globally, and is the most severe viral pandemic humanity has had to face since the so-called Spanish flu of 1918. It is expected that advanced technologies will help to overcome it by detecting it in real time and expediting the discovery of a possible treatment using supercomputers and advanced machine-learning algorithms [1].
Genomic sequencing is the process by which a lab technician analyzes a person's blood sample and prepares it to sequence a human cell which contains 23 pairs of chromosomes. This structure contains the person's DNA which is coiled in a form called a double helix that can be unwound into a ladder shape made of 6 billion paired chemical elements called bases. To read these bases, a blood sample is inserted into a sequencing instrument in which a high-frequency sound wave breaks down its DNA, each fragment of which is sequentially copied hundreds of thousands of times, with clusters of identical ones created. Powerful computers combine the individual fragments to reveal the sequence of this DNA and then a medical team can use software to analyze and compare different sequences [2]. Using such technology, the DNA of the current coronavirus was discovered very quickly and provided information that could be used to conduct PCR tests.
Taiwan suffered an outbreak of severe acute respiratory syndrome (SARS) in 2003, recording 346 cases and 37 deaths. Informed by the lessons learned, Taiwan has used all its knowledge, specifically in technology, to stop the spread of COVID-19. It has integrated people's recent histories of travel to China from customs and immigration databases with cloud-based health records so that health professionals can be aware of patients' recent travels, determine whether they entered infected areas and see a warning if they had visited China, in particular Wuhan, in the previous 3 months [3]. Huang et al. [4] identified a recent pneumonia cluster in Wuhan, China, caused by COVID-19. They documented the epidemiological, clinical, laboratory and radiological characteristics, the treatments implemented and the clinical outcomes of the patients. Lu et al. [5] analyzed the genetic architecture of this human coronavirus, which can cause severe pneumonia, to shed light on its origin and receptor-binding properties. The outbreaks of diseases linked to COVID-19 highlight the hidden reservoirs of viruses in wild animals and their capability to periodically spread to human populations.
Zhu [6] reported pneumonia of unknown cause in a group of patients associated with a wholesale seafood market in Wuhan, China, with a newly discovered betacoronavirus identified by unbiased sequencing in samples from these patients. The virus was isolated using human airway epithelial cells and forms a clade within the sub-genus sarbecovirus of the orthocoronavirinae sub-family. Although different from both SARS-CoV and MERS-CoV, it is the seventh member of the family of coronaviruses to infect humans. Chan et al. [7] identified a family cluster of pneumonia associated with COVID-19 that indicated person-to-person transmission. They documented the microbiological, radiological, laboratory, clinical and epidemiological findings for 5 patients in this cluster who had undiagnosed pneumonia after returning to Shenzhen, Guangdong province, China, from a visit to Wuhan, and for an additional family member who had not traveled to Wuhan.
The Quantum Neural Network (QNN) [8] is a recently introduced, applicable and useful concept analogous to the artificial neural network (ANN). It integrates the quantum computational paradigm with the basics of the ANN and is reported to be superior to the conventional ANN. It is used for handling big data, function approximation, computer games, etc., and its algorithms are applied in automated control systems, associative memory devices, social networks, etc. Beer et al. [9] suggested a complete quantum analog of classical neurons from which they constructed quantum feedforward NNs capable of universal quantum computing. They defined the effective training of these networks using fidelity as a cost function, which provided both classical and efficient quantum implementations. Their method involved rapid optimization with decreased storage requirements, as the number of qubits needed scales with only the width of the network, thereby enabling deep-network optimization. They benchmarked their proposal on a quantum learning task and found remarkable generalization behavior and striking robustness to noisy training data. Gao et al. [10] presented a novel deep-learning method for determining a person's state of health. Firstly, a deep-learning approach called stacked denoising auto-encoders extracted features from the raw data while retaining the original information. These features were then fed into a QNN to classify the dataset, with the loss function of the QNN enhancing its classification performance. Experiments conducted on benchmark datasets showed that their proposed method was more robust and effective than traditional ones. They also built an integrated modular avionics degradation approach for changing the probability of occurrence of soft faults during the whole service life. This paper presents an approach based on a QNN for predicting the severity of COVID-19 in patients.
It uses the serial blood counts performed during their hospitalizations to record their lymphocytic counts from days 1 to 15 which relate to the relapse rate. This paper is organized as follows: in Section 2, the clinical characteristics of COVID-19 patients are discussed; in Section 3, training QNNs is described; in Section 4, Quick Reduct Feature Selection (QRFS) is explained; in Section 5, the proposed approach for predicting the severity of COVID-19 based on the QNN is described; in Section 6, the simulation results are presented; and, in Section 7, a conclusion is provided.

Clinical Characteristics of the COVID-19 Patient Dataset
Zhang et al. [11] obtained the datasets used in this study from the National Health Commission of the People's Republic of China. They included 13 patients previously diagnosed with non-serious COVID-19 on admission to the First Affiliated Hospital of Xi'an Jiaotong University between January 22 and February 2, 2020. Tab. 1 describes the main characteristics of the dataset's attributes.

Training Quantum Neural Networks
A QNN [8,9] is a computational model based on a set of artificial neuronal units, the behaviors of which are roughly comparable to those observed in the axons of neurons in biological brains. Each neuron is connected to many others, can increase the activation of its adjacent neurons, and operates individually using additional functions. There can be a threshold for each connection to a neuron; for example, a signal must exceed a limit before it can propagate to another neuron. These systems learn and train themselves rather than being explicitly programmed, and excel in areas where detecting solutions or features is difficult using conventional programming.
As shown in Fig. 1, $\sigma$ is the non-linear function, $h_i$ the value of neuron $i$ in each hidden layer, $\Sigma(h_i)$ the hidden layer and $y$ the final prediction value generated from the hybrid network.

Reducing Set of Attributes
A simple approach can be used to identify attribute values that are not necessary for the dataset. Reducing the number of attributes in a set according to the reduced information policy is a separate process that yields a reduct, that is, a subset from which no more attributes can be deleted without losing information in the dataset. As reducts are minimal subsets that do not contain any redundant features, a reduct should be capable of classifying objects without changing the form of knowledge representation [12,13]. Subset A of set D is called a reduct if, and only if, it has the following characteristic: DF of subset A = DF of set D. The core is necessary to represent the knowledge or rules and is the intersection of all reducts.
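The attribute-reduction idea can be illustrated with a minimal sketch of a greedy Quick Reduct search. It assumes the rough-set dependency degree (the fraction of objects whose class is fully determined by the chosen attributes) as the selection criterion and assumes discretized attribute values; the helper names and the four toy records are hypothetical and are not taken from the dataset in [11].

```python
def partition(rows, attrs):
    """Group row indices that share the same values on attrs."""
    blocks = {}
    for i, row in enumerate(rows):
        key = tuple(row[a] for a in attrs)
        blocks.setdefault(key, []).append(i)
    return list(blocks.values())


def dependency(rows, labels, attrs):
    """Fraction of rows lying in blocks that contain a single class."""
    if not rows:
        return 0.0
    consistent = 0
    for block in partition(rows, attrs):
        if len({labels[i] for i in block}) == 1:
            consistent += len(block)
    return consistent / len(rows)


def quick_reduct(rows, labels, all_attrs):
    """Greedily add the attribute that raises the dependency degree most,
    stopping once the subset matches the dependency of the full set."""
    full = dependency(rows, labels, all_attrs)
    reduct = []
    current = dependency(rows, labels, reduct)
    while current < full:
        candidates = [a for a in all_attrs if a not in reduct]
        scores = {a: dependency(rows, labels, reduct + [a]) for a in candidates}
        best = max(candidates, key=lambda a: scores[a])
        reduct.append(best)
        current = scores[best]
    return reduct


# Toy, made-up records (discretized); 1 = serious, 0 = non-serious.
rows = [
    {"LYMPH": "low", "CRP": "high", "AGE": "old"},
    {"LYMPH": "low", "CRP": "normal", "AGE": "young"},
    {"LYMPH": "normal", "CRP": "high", "AGE": "old"},
    {"LYMPH": "normal", "CRP": "normal", "AGE": "young"},
]
labels = [1, 1, 0, 0]
selected = quick_reduct(rows, labels, ["CRP", "AGE", "LYMPH"])
print(selected)
```

In this toy table only LYMPH separates the two classes, so the greedy search stops after selecting that single attribute. A greedy search of this kind is not guaranteed to return a minimal reduct on every dataset.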

Proposed CQNN Approach for Predicting Severity of COVID-19 in Patients
Since there are no reliable risk-stratification methods for identifying patients with serious COVID-19 infections, we aimed to build an effective means of identifying, at an early stage, cases at risk of progressing from mild to severe.
The proposed CQNN approach consists of two stages: in the first, the most distinctive subset of features is selected using the QRFS method to improve classification performance and, in the second, the QNN predicts the levels of severity of COVID-19 in patients. It does this by learning the classifications of patients with COVID-19 in a training sub-step, using the best selected features, and then producing forecasts for a test sample. For QRFS, 24 laboratory parameters, that is, WBC, NEUT, LYMPH, NEUT%, LYMPH%, PLT, HGB, RBC, ALT, AST, ALB, TBIL, DBIL, CRP, PCT, IL-6, LYMPH0, LYMPH3, LYMPH5, LYMPH7, LYMPH9, LYMPH11, LYMPH13 and LYMPH15, and 6 clinical ones, that is, oxygen uptake, age, febrile days, temperature before admission, concomitant symptoms and epidemiological history, are recommended.
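As an illustration of how the 30 candidate features could be arranged into a fixed-order input vector, the sketch below lists the 24 laboratory parameters named above and 6 clinical ones. The clinical identifiers, the `to_vector` helper and the placeholder record are hypothetical; how categorical items such as concomitant symptoms are encoded numerically is an assumption left open here.

```python
# Laboratory parameters as named in the text (24 in total).
LAB_FEATURES = [
    "WBC", "NEUT", "LYMPH", "NEUT%", "LYMPH%", "PLT", "HGB", "RBC",
    "ALT", "AST", "ALB", "TBIL", "DBIL", "CRP", "PCT", "IL-6",
    "LYMPH0", "LYMPH3", "LYMPH5", "LYMPH7",
    "LYMPH9", "LYMPH11", "LYMPH13", "LYMPH15",
]

# Clinical parameters (6 in total); these identifiers are illustrative.
CLINICAL_FEATURES = [
    "oxygen_uptake", "age", "febrile_days",
    "temperature_before_admission", "concomitant_symptoms",
    "epidemiological_history",
]

ALL_FEATURES = LAB_FEATURES + CLINICAL_FEATURES


def to_vector(record):
    """Return the record's values in a fixed feature order.

    Raises KeyError when a feature is missing so that gaps are not
    silently filled with defaults.
    """
    missing = [name for name in ALL_FEATURES if name not in record]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [float(record[name]) for name in ALL_FEATURES]


# Placeholder record with zeros; not patient data.
example = {name: 0.0 for name in ALL_FEATURES}
vector = to_vector(example)
print(len(vector))
```

Keeping one canonical ordering means the same position in the vector always refers to the same parameter, which matters once a feature-selection step starts dropping columns.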
One of the criteria for diagnosing a COVID-19 infection [14,15] is a routine laboratory parameter such as 'normal/decreased number of leukocytes' or 'decreased number of lymphocytes', as shown in Tab. 2. The analysis conducted of this dataset suggests that a lower LYMPH number is a potentially more reliable laboratory predictor of a SARS-CoV-2 infection than the recommended 'lymphocytic counts' and 'lymphopenia'. As long as the two clinical manifestations of 'fever and/or respiratory symptoms' and 'normal or reduced number of white blood cells or reduced number of lymphocytes at the onset of symptoms' are observed, individuals may be considered suspect cases according to the guidelines for the diagnosis and treatment of pneumonia caused by COVID-19 [16].
The training rule, which is the major aspect of the CQNN, modifies the weights to minimize the mean square error (MSE). The general CQNN mathematical equations, based on the parameters in Tab. 3, are as follows.
The hidden layer in the CQNN uses non-linear activation functions (y(t)). The complete approach described above is tested in terms of accuracy on the dataset given in [11] to classify COVID-19 patients. In the experimental environment, the data are separated into 23% for testing and 77% for training using our NN structures. Fig. 2 shows the structure of the QNN. It uses 30 neurons in the input layer, one for each input feature in the dataset's vector, 10 in the hidden layer and 2 in the output layer, one for each class of COVID-19 (serious and non-serious). After partitioning the data into two groups (training and testing), as shown in Fig. 2, the approach is built using the training portion of the 13-case dataset, with 0 and 1 representing non-serious and serious cases, respectively. Fig. 3 shows the flowchart of the proposed approach.
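The layer sizes and the data split described here can be sketched with a small classical stand-in. This is not the quantum network itself: it is an ordinary 30-10-2 feedforward network trained by gradient descent on the MSE, under the assumptions of sigmoid activations, a learning rate of 0.1, fixed random seeds and synthetic placeholder inputs. The names `TinyMLP` and `split_indices` are hypothetical.

```python
import math
import random


def split_indices(n, test_fraction=0.23, seed=0):
    """Shuffle indices and hold out roughly test_fraction for testing."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """Classical 30-10-2 network; a stand-in, not a quantum model."""

    def __init__(self, n_in=30, n_hidden=10, n_out=2, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                   for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        out = [sigmoid(sum(w * hj for w, hj in zip(row, h)) + b)
               for row, b in zip(self.w2, self.b2)]
        return h, out

    def predict(self, x):
        _, out = self.forward(x)
        return 0 if out[0] >= out[1] else 1

    def train_step(self, x, target, lr=0.1):
        """One gradient-descent step on the MSE of a single sample."""
        h, out = self.forward(x)
        n = len(out)
        # Gradients with respect to the pre-activations of each layer.
        d_out = [(2.0 / n) * (o - t) * o * (1.0 - o)
                 for o, t in zip(out, target)]
        d_hid = [hj * (1.0 - hj) * sum(d_out[k] * self.w2[k][j] for k in range(n))
                 for j, hj in enumerate(h)]
        for k, dk in enumerate(d_out):
            for j, hj in enumerate(h):
                self.w2[k][j] -= lr * dk * hj
            self.b2[k] -= lr * dk
        for j, dj in enumerate(d_hid):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * dj * xi
            self.b1[j] -= lr * dj
        return sum((o - t) ** 2 for o, t in zip(out, target)) / n


# Synthetic placeholders: 13 cases with 30 features each; not patient data.
rng = random.Random(1)
X = [[rng.random() for _ in range(30)] for _ in range(13)]
labels = [1 if row[0] > 0.5 else 0 for row in X]  # 0 = non-serious, 1 = serious

train_idx, test_idx = split_indices(len(X))
net = TinyMLP()
for epoch in range(100):
    for i in train_idx:
        target = [1.0, 0.0] if labels[i] == 0 else [0.0, 1.0]
        last_loss = net.train_step(X[i], target)

predictions = [net.predict(X[i]) for i in test_idx]
print(len(train_idx), len(test_idx), predictions)
```

With 13 cases, a 23% hold-out rounds to 3 test cases and 10 training cases. The two output nodes are read as a one-hot encoding of the non-serious and serious classes, and the larger output decides the predicted class.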

Results
In the COVID-19 dataset, of 12 non-acute patients, 91.7% showed abnormal or low white blood cell (WBC) counts, 6 had lymphopenia, 2 a decrease in the number of platelets, 4 an increase in CRP and 5 an increase in IL-6. Thirteen patients had normal levels of PCT and 3 an increase in ALT and AST upon admission. The analysis considered the effects of several parameters, such as blood culture, blood counts (HGB, WBC and PLT), NEUT% and LYMPH%, with the statistical correlations among them shown in Fig. 4. During the first 3 days, the patients' lymphocyte counts did not change significantly but fluctuated below the normal range before the disease became severe.
All patients' data were recorded within 2 days of the onset of the disease. Where their numbers of lymphocytes had not increased 2 weeks later, this suggested that they were very likely to be incubating severe COVID-19, because an early dynamic change in the number of lymphocytes can be an important indicator of the development of serious disease. This easily measurable parameter can help clinicians to identify patients at risk of severe COVID-19 very quickly. Lymphopenia is common in seriously ill patients (with MERS-CoV or SARS-CoV) because the invasion of viral particles destroys lymphocytes. We assume that SARS-CoV affects mainly lymphocytes, which can suppress the cellular immune function in the body and cause a cytokine storm that worsens the disease in some patients. Several papers have stated that lymphopenia may be a determining factor in the severity of COVID-19. We also performed sub-group analyses of the patients with non-serious COVID-19. In the proposed approach, the patients in group 2 took longer than those in group 1 (means of 15 and 5 days, respectively, P = 0.05) to normalize their body temperatures, as shown in Fig. 5. These results indicated that patients with fewer than the normal number of lymphocytes in the early stages of the disease are likely to have more severe lung lesions and, therefore, to recover more slowly than those with normal numbers of lymphocytes.
The results demonstrated that a low lymphocytic level was a strong predictor of the detection of COVID-19. The number of lymphocytes at the first diagnosis is very significant, with a gradual decrease in its percentage indicating the severity of the pathological condition.
Based on Tab. 4, the CQNN approach obtained better results for the training dataset than the logistic regression, NN and ID3 approaches. It also achieved an accuracy of 92.33% for the testing dataset, better than those of the other methods.
As shown in Tab. 5, the CQNN approach had (x) input nodes and its input samples could be defined as a vector.

Conclusion
In this paper, the CQNN method for predicting the severity of COVID-19 is presented. It identifies important features, including the serial blood counts performed during patients' stays in hospital, that is, their numbers of lymphocytes on days 1 to 15 after admission, which are correlated with the rate of relapse. This proposed non-linear approach was used to assess the extent of the COVID-19 infection in   accuracy at the prediction stage. In future, we intend to apply the proposed method to large datasets to test its effectiveness and then automatically upload the results to a cloud-based epidemiological and early-warning monitoring platform, which can process them for disease surveillance and medical monitoring in institutions at all levels of government. This will provide early warnings, situation analyses and support for decision-making regarding the state of the pandemic.
Funding Statement: The authors received no specific funding for this study.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.