Medical Diagnosis Using Machine Learning: A Statistical Review

Abstract: Decision making in medical diagnosis is a complicated process. A large number of overlapping structures and cases, together with distractions, tiredness, and the limitations of the human visual system, can lead to inappropriate diagnosis. Machine learning (ML) methods have been employed to assist clinicians in overcoming these limitations and in making informed and correct decisions in disease diagnosis. Academic papers on the use of machine learning for disease diagnosis are being published in increasing numbers. Hence, to determine how ML is used to improve diagnosis in varied medical disciplines, a systematic review is conducted in this study. To carry out the review, six different databases are selected. Inclusion and exclusion criteria are employed to limit the research. Further, the eligible articles are classified by publication year, authors, type of article, research objective, inputs and outputs, problem and research gaps, and findings and results. The selected articles are then analyzed to show the impact of ML methods in improving disease diagnosis. The findings of this study show the most used ML methods and the diseases most commonly studied by researchers. They also show the increase in the use of machine learning for disease diagnosis over the years. These results will help in focusing on neglected areas and in determining ways in which ML methods could be employed to achieve desirable results.

forecasting results, and offering an informative mechanism [1]. Appropriate and effective treatment usually involves a thorough diagnosis. The accompanying improvements in diagnostic testing and imaging have certainly improved the entire process of diagnosis. But the human method of scientific judgment leading to correct diagnosis remains key to high-quality medical services even in this era of rapid technical transition [2]. However, diagnostic errors that harm patients occur frequently. Generally, multiple factors give rise to diagnostic errors, usually including both perceptual and system-related causes. Common factors include misjudging the significance of observations, misinterpretation, errors originating from the use of heuristics, and errors in judgment, particularly when diagnostic hypotheses are developed and assessed [3][4][5]. Since treatment options are becoming more efficient and more expensive, the health and financial risk of misdiagnosing an easily curable illness is significantly greater. Thus, opportunities for improved patient care are lost [6].
These diagnostic errors could be minimized using techniques like fuzzy logic [7] or machine learning (ML), which could thus improve healthcare services. The kind of analytics a clinician can get using ML at the time of patient treatment can provide them with more knowledge and, thus, better care [8]. ML addresses the question of how to design systems that improve continuously with experience. It is known as one of the fastest-growing technical disciplines of today, standing at the junction of computing and analytics and at the heart of artificial intelligence (AI) and data science [9]. To date, the primary winners of the 21st-century boom in big data, ML, and data science have been markets that were able to obtain such data and employ the workers needed to transform their products. The algorithms built in and around these markets offer considerable potential for improving research in medical and clinical care, particularly given that clinicians are widely using electronic health records (EHR). Diagnosis and outcome estimation are two fields that gain from the use of ML techniques in the healthcare sector [10]. ML can not only handle varying raw data combinations and apply context weighting but also measure the predictive capacity of any possible combination of factors for determining diagnostic and prognostic components [11]. For example, as a 'second opinion' for clinicians, ML models based on clinical data can diagnose aphasia speech type [12] or urinary tract infection [13], or even predict breast cancer [14], among others. ML can process data sets far beyond the limits of human ability and convert that data efficiently into clinical knowledge that enables doctors to prepare and deliver treatment, eventually leading to improved results, lower medical costs, and enhanced patient satisfaction.
ML has the capability for, and is currently behind, the creation of guidelines for precision medicine, treatment counsel, and disease diagnosis [8]. These capabilities of ML can even be seen in the healthcare internet of things (H-IoT), where they are used to analyze and process the massive amount of healthcare data generated through sensors [15]. Therefore, extensive research on its usefulness in the treatment of specific diseases has been conducted. Hence, the main aim of this paper is to analyze, through a systematic analysis, the experiments in which ML approaches are used in different medical fields and diseases, to determine their pattern and usefulness in the diagnosis of disease. Tab. 1 represents the uniqueness of our paper using comparative analysis with other published review papers in the medical domain. To the best of our knowledge, this paper provides an in-depth analysis of the use of ML in disease diagnosis, covering all the major medical domains.

Comparison with Related Survey Articles
Yanase et al. [16] presented a survey from a computer-aided diagnostic (CAD) system perspective in medicine. The article covers the in-depth workflow of CAD systems and their history. The paper also represents applications in the medical domain from a data type perspective, including tabular, imaging, sound, and signal types of data. Caballe et al. [17] detailed the benefits and limitations of using different ML methods in disease diagnosis. The paper covers classification, regression, and clustering techniques. However, it does not include a summarization of literature and an in-depth analysis of reviewed articles. Jiang et al. [18] surveyed research articles in healthcare from an AI perspective. In addition to ML, the paper also covers natural language processing techniques applied in healthcare. The paper covers only three medical domains: cancer, neurology, and cardiology. Schaefer [19] presented an overview of the application of ML in rare diseases. It reviews articles in healthcare covering diagnosis, prognosis, and treatment. None of the articles summarizes the existing work. Also, they cover only a few medical domains and do not provide an in-depth analysis of reviewed articles.

The Process of Applying Machine Learning in Disease Diagnosis
Medical diagnosis is a complex task, largely considered an empirical task but poorly understood as a cognitive task [20]. Thus, complex as it may seem, diagnosis using a computer, i.e., using ML in our case, is divided into multiple steps. The first step of disease diagnosis is data acquisition. This data could be in varied forms, including but not limited to medical interview, clinical, demographic, imaging, speech, patient historical data, or even heart sound [21][22][23]. The next step is processing, in which the data is prepared, e.g., by handling missing values, reducing dimensionality, and dealing with noisy data [24,25]. Next, the target variable and the predictors are identified. This data is then fed to one of the models for training. Once the model is trained, it is then used for diagnosis.
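The steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not a method taken from any reviewed article: the two features (glucose, BMI), the six records, the mean imputation, and the nearest-centroid classifier are all hypothetical choices made to keep the example small and dependency-free.

```python
from statistics import mean, pstdev

def impute_missing(rows):
    """Processing step: replace None with the column mean (one simple option)."""
    n_cols = len(rows[0])
    col_means = [mean(r[j] for r in rows if r[j] is not None) for j in range(n_cols)]
    return [[col_means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]

def fit_scaler(rows):
    """Learn per-column mean and standard deviation for standardization."""
    n_cols = len(rows[0])
    means = [mean(r[j] for r in rows) for j in range(n_cols)]
    stds = [pstdev([r[j] for r in rows]) or 1.0 for j in range(n_cols)]  # avoid /0
    return means, stds

def scale(rows, scaler):
    """Apply a previously fitted scaler to rows of predictors."""
    means, stds = scaler
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def train_nearest_centroid(X, y):
    """Training step: store the mean predictor vector of each class."""
    centroids = {}
    for label in set(y):
        members = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*members)]
    return centroids

def diagnose(model, x):
    """Diagnosis step: return the class whose centroid is closest to x."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(model, key=lambda label: sq_dist(model[label], x))

# Data acquisition (hypothetical records): predictors are [glucose, BMI].
records = [[148.0, 33.6], [85.0, 26.6], [183.0, None],
           [89.0, 28.1], [137.0, 43.1], [None, 25.6]]
# Target variable (hypothetical labels).
labels = ["disease", "healthy", "disease", "healthy", "disease", "healthy"]

clean = impute_missing(records)
scaler = fit_scaler(clean)
model = train_nearest_centroid(scale(clean, scaler), labels)

new_patient = scale([[170.0, 38.0]], scaler)[0]  # reuse the training scaler
print(diagnose(model, new_patient))
```

Note that the new patient is scaled with the scaler fitted on the training data; fitting a separate scaler at diagnosis time would make the predictors incomparable with those the model was trained on.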

The Benefits of Using Machine Learning for Disease Diagnosis
Given the limitations posed by a large number of overlapping structures and cases, distractions, tiredness, and the human visual system, a 'second opinion' can come in handy [26]. This has encouraged the use of CAD systems for diagnostic processes. CAD is a concept that gives equal roles to physicians and to computers, i.e., it assists physicians in making the best clinical decisions [27]. Moreover, due to increasing complexity among patients, high rates of diagnostic error, and the availability of a large amount of data, EHR systems are being used to assist in clinical decision making [28].
With the availability of intelligent tools for data analysis, ML methods help in uncovering interesting relationships in the data [29]. As a second opinion, ML could corroborate clinicians' decisions or refute them [30]. Integration of ML-based tools that monitor continuously increasing volumes of data streams for patterns, assist clinicians in decision making, or automatically adjust the settings of bedside devices has improved patient treatment outcomes and substantially reduced the overall cost of treatment [31,32]. On the downside, ML promises to provide the best clinical assistance but so far has not proven useful, according to [33,34], probably due to opacity in ML algorithms and analytics. Moreover, data quality and the generalizability of ML models remain among the other problems [35,36].

Article Structure
The paper is organized as follows. Section 2 describes the methodology employed to carry out this study. It discusses the databases chosen and the eligibility criteria for the selection of papers. In Section 3, we present the analysis and synthesis of the eligible papers. The analysis is discussed in Section 4. Finally, we draw a conclusion. Fig. 1 presents the taxonomy of this article. Abbreviations and their corresponding full forms used in this article are presented in Tab. 2.

Research Methodology
A methodology in which the author finds relevant studies, selects and investigates those studies, analyzes the data, and summarizes the findings to reach precise conclusions is called a systematic review [37,38]. The use of evidence from dependable research to make healthcare decisions facilitates the use of best practices with fewer mistakes in clinical decision making. Hence, systematic reviews, as well as clinical practice, are considered the finest source of evidence [39]. The following section covers the literature search, study selection and eligible papers, and extraction and analysis of data.

Literature Search
To select relevant and eligible papers for systematic review, six databases were selected in this step. These databases were: IEEE, PubMed, Science Direct, SciPub, Springer Link, and Web of Science. The articles searched were from the year 2015 up to now. Phrases and keywords such as "disease diagnosis," "disease diagnosis using machine learning," "Chronic kidney diagnosis using machine learning," "Parkinson diagnosis using machine learning," etc. were used to find relevant articles. The articles were filtered based on relevancy and publication date. From our eligible papers selected, the frequency and number of articles published by publishers are shown in Tab. 3. Accordingly, with 22.73% Elsevier had the highest number of publications. BMC, Hindawi, IEEE, Public Library of Science, and Springer stood second with 6.82% of publications each. Nature was ranked third with 4.55% of publications. In comparison, the rest of the publishers ranked fourth with 2.27% of publications each.

Study Selection and Eligible Papers
Inclusion and exclusion criteria were used to select appropriate and relevant articles. The research concentrates only on disease diagnosis using ML. It excludes papers using fuzzy logic or image processing. Accordingly, articles were screened for selection based on their title and abstract. Only journal and conference papers were considered. Books, book chapters, theses, reports, review articles, and letters to editors were thus excluded from our research. Language, time, and article quality were considered for eligible papers. Thus, we selected only papers written in the English language and published from the year 2015 up to now. Our research was focused on including all kinds of medical disciplines. However, diseases related to animals and plants were excluded from it. According to our inclusion criteria, articles using methods and techniques that improved the accuracy of disease diagnosis were included.
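The screening criteria above can be expressed as a simple predicate. The following is a minimal sketch, assuming a hypothetical `Article` record whose fields and example entries are invented for illustration; the year range 2015 to 2020 follows the period stated in the Discussion. It does not model the manual title and abstract screening or the judgment of article quality.

```python
from dataclasses import dataclass

@dataclass
class Article:
    # All fields are hypothetical metadata used only for this illustration.
    title: str
    year: int
    language: str
    doc_type: str   # e.g. "journal", "conference", "book chapter", "review"
    method: str     # e.g. "ml", "fuzzy logic", "image processing"
    subject: str    # "human", "animal", or "plant"

ALLOWED_TYPES = {"journal", "conference"}
EXCLUDED_METHODS = {"fuzzy logic", "image processing"}

def is_eligible(a: Article, first_year: int = 2015, last_year: int = 2020) -> bool:
    """Return True if the article passes every inclusion and exclusion rule."""
    return (
        first_year <= a.year <= last_year      # publication period
        and a.language == "English"            # language criterion
        and a.doc_type in ALLOWED_TYPES        # journals and conferences only
        and a.method not in EXCLUDED_METHODS   # no fuzzy logic / image processing
        and a.subject == "human"               # no animal or plant diseases
    )

# Hypothetical candidate records, not taken from the reviewed literature.
candidates = [
    Article("SVM-based kidney disease diagnosis", 2018, "English", "journal", "ml", "human"),
    Article("Fuzzy expert system for diabetes", 2019, "English", "journal", "fuzzy logic", "human"),
    Article("CNN for plant leaf disease", 2019, "English", "conference", "ml", "plant"),
    Article("Random forest for sepsis diagnosis", 2013, "English", "journal", "ml", "human"),
    Article("Review of ML in cardiology", 2020, "English", "review", "ml", "human"),
    Article("XGBoost for Parkinson diagnosis", 2020, "English", "conference", "ml", "human"),
]

selected = [a for a in candidates if is_eligible(a)]
for a in selected:
    print(a.year, a.title)
```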

Extraction and Analyzation of Data
The included articles were examined to extract and analyze the data with respect to our research objectives. To meet these objectives, we analyzed the articles by frequency of publication over the past years, by type of academic paper, by database provider, and by the ML model employed.

Results
The following section represents the findings and results of the analysis and synthesis of the included articles. This result, which is the outcome of a systematic study of the papers, shows the efficiency of applying ML in disease diagnosis. In the following section, the impact of ML and its use in different medical disciplines is studied.

The Frequency of Published Articles over the Past Years
Our research includes 44 academic papers that met our inclusion criteria. These 44 papers include journal articles as well as conference papers. The frequency of published articles is shown in Fig. 2. The articles included are taken from the year 2015 up till now. The graph indicates that since 2015 there has been a significant rise in published articles. This shows that research on disease diagnosis using ML has been increasing. In fact, almost 41% of the included articles were published in the year 2019. Hence, it is evident that researchers are showing interest in applying ML techniques in disease diagnosis.

Distribution of Academic Papers by Journal and Conference Type
In total, 44 articles are included in this systematic review (i.e., 36 from journals and 8 from conferences). Fig. 3 represents the distribution of papers by publication year and type. As seen in this figure, more journal articles than conference papers were published. No conference papers from 2015 or 2016 appear in our chart, which suggests that overall fewer articles were published in conferences than in journals. However, during 2017 there is a considerable increase in articles published in conferences.
Eligible articles have been categorized by journals and conferences. The distribution of papers by journals is represented in Tab. 4 and by conferences in Tab. 5. Of our reviewed articles, 81.82% are from journals and 18.18% are from conferences. 'Computer Methods and Programs in Biomedicine,' 'Computers in Biology and Medicine,' 'IEEE Access,' and 'PLoS ONE' published 3 articles each, the highest count, accounting for around 6.82% each. On average, 2.27% of articles were published by each journal.

Distribution of Papers by Database Providers
Searching and selection of papers were carried out through 6 different databases. These databases and their contribution can be seen in Tab. 6. With 40.91%, PubMed was ranked first, accounting for 18 papers. IEEE was ranked second with 25.00%. Science Direct was ranked third with 22.73%. Springer Link and Web of Science were ranked fourth with 4.55% each, and SciPub was ranked fifth with 2.27%.
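The percentages and ranks reported here follow from simple counting, as the sketch below illustrates. Only the PubMed count of 18 is stated in the text; the remaining counts are our assumption, inferred from the reported percentages and the total of 44 papers. The helper name `shares_and_ranks` is hypothetical, and ties share a rank, as in the ranking above.

```python
def shares_and_ranks(counts):
    """Map each name to (percentage of total, dense rank); ties share a rank."""
    total = sum(counts.values())
    distinct = sorted(set(counts.values()), reverse=True)
    rank_of = {c: i + 1 for i, c in enumerate(distinct)}
    return {name: (round(100 * c / total, 2), rank_of[c])
            for name, c in counts.items()}

# Counts other than PubMed's are inferred from the reported percentages.
database_counts = {
    "PubMed": 18,
    "IEEE": 11,
    "Science Direct": 10,
    "Springer Link": 2,
    "Web of Science": 2,
    "SciPub": 1,
}

table = shares_and_ranks(database_counts)
for name, (share, rank) in table.items():
    print(f"{name}: {share:.2f}% (rank {rank})")
```

The same helper could be applied to the publisher, ML method, or medical discipline counts, provided the underlying counts are available.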

The Distribution of Machine Learning Methods Applied in Published Articles
The objective of this study is to carry out a systematic review of the use of ML methods in disease diagnosis. Also, the distribution of various ML methods for diagnosis is analyzed. From these eligible articles, we can observe that some of these articles incorporated ML to improve the disease diagnosis. Hence, we categorized selected papers into 12 different ML methods, as can be seen in Tab. 7. Among the researchers, support vector machine (SVM) was ranked first with 22.73%. This suggests the efficiency of the SVM method in improving the diagnostic process of diseases. The convolutional neural network (CNN) method was ranked second with 15.91%. Other methods, which included proprietary algorithms using ML methods and combinations of various ML methods, ranked third with 13.64%. With 11.36%, random forest (RF) was ranked fourth. Artificial neural network (ANN), deep ANN, and eXtreme gradient boosting (XGBoost) were ranked fifth with 6.82% each. The classification and regression trees (CART) method was ranked sixth with 4.55%. However, Bayesian classifier (BC), decision tree (DT), and gradient boosting (GB) were ranked last at 2.27% each. This shows that these three ML methods were the least used for improving the disease diagnosis process in our sample. We have summarized the distribution of ML methods by year in Fig. 4. Accordingly, we observe that the use of hybrid methods has been increasing over the years (1 in 2015, 2 in 2017, and 4 in 2019) to improve the accuracy of the ML models. From the year 2016 onward, SVM has been used every year, with a maximum of 3 articles each in 2017 and 2019. This shows its popularity among researchers over the years. Furthermore, with one article each in 2018, 2019, and 2020, XGBoost has been used consistently to improve diagnosis. Also, we observe an increase in the use of the CNN method from 2017.

Distribution of ML Methods Applied in Published Articles Based on Clinical Aspects
In the context of disease diagnosis using ML, we would like to know which diseases were considered most. Moreover, determining which medical disciplines researchers were more interested in is one of the objectives of this research. For this reason, the eligible articles in this research were classified by diseases and the implementation of ML methods. To better understand the distribution of ML for disease diagnosis, we analyzed the articles based on medical disciplines. Fig. 5 represents the pie chart for the frequency of medical disciplines. Based on the diseases, 18 medical disciplines were identified. From Fig. 5, it is observed that cardiology and endocrinology each accounted for 13.64% of studies, probably due to the large number of diseases associated with them. Infectious disease, oncology, and pulmonology were ranked second with 9.09% each. With 6.82% each, dermatology and nephrology were ranked third. Neurology, rheumatology, and urology were ranked fourth with 4.55% each. Ranked last were critical care, gastroenterology, hepatology, ophthalmology, pediatrics, periodontology, vascular surgery, and virology with 2.27% each.

Discussion
We conducted this study to review the impact of ML in disease diagnosis. To our knowledge, few articles have been published that systematically analyze academic articles using ML for disease diagnosis. Hence, the results and analysis of this study can be considered to assess the impact of ML in the medical domain and its efficiency in improving disease diagnosis. This study considered articles from the year 2015 to 2020. We identified 44 articles applying ML methods to improve disease diagnosis over this period. One of the objectives of this study was to determine which ML methods were used most by researchers for diagnosis, as the answer to this question indicates the efficiency of the methods. Hence, the articles were classified accordingly. One of the ways in which articles were classified was based on the number of articles published each year. According to this classification, we observed that the number of publications using ML for disease diagnosis has been rising over the years. We find that 4.55% of articles were published in 2015, whereas in 2019, 40.91% of articles were published. This article was written in mid-2020. Thus, we were able to retrieve only a few articles from this year. This increase in the use of ML methods is due to their efficiency in improving the accuracy and sensitivity of models to give correct results. We identified 12 different ML methods that were applied in our eligible papers. Although we say that these 12 methods are the most used ML methods for disease diagnosis, we limit our findings to medical diagnosis and do not generalize them. From our analysis, as presented in Fig. 6, we find that researchers prefer SVM, CNN, and RF over other ML methods. However, there is also an increase in the use of hybrid/other methods. This is mainly because combining various methods augments the efficiency of the model.
Our study also examined the articles from a medical discipline point of view, i.e., we classified the eligible articles according to medical disciplines. This classification helped us understand which medical disciplines were chosen most often. From this study, it was evident that cardiology and endocrinology had the highest number of publications. This is probably because many diseases come under these two disciplines and because large amounts of data are easily available to carry out the research. Moreover, going only by the diseases explored, we find that a variety of ML methods has been applied to a variety of diseases. This shows the effectiveness of ML in improving the accuracy of disease diagnosis. Thus, ML could be applied in any medical discipline to good effect.
The findings of this investigation show which diseases and medical disciplines are mostly targeted by researchers and which get neglected. We also find the efficiency of ML methods in disease diagnosis. Therefore, this study could assist researchers in carrying out further work in the medical domain.

Conclusion
The main goal of this systematic study was to review the articles using ML for disease diagnosis and, thus, the competence of ML in improving the diagnosis of disease. To this end, we retrieved articles from the year 2015 to 2020. We identified six databases: IEEE, PubMed, Science Direct, SciPub, Springer Link, and Web of Science. Further, we classified the articles based on publisher and database. Through this study, we found which databases and publishers are publishing the greatest number of articles relating to ML in disease diagnosis. We also investigated the most used ML methods and their impact on disease diagnosis. We find that all the studies have shown improvement in their results. We find that using ML not only reduces the overall cost of treatment and assists clinicians as a 'second opinion,' but also helps in the early detection of diseases having complex structures and patterns. We also identified the 12 most used ML methods in disease diagnosis and their effectiveness in improving the results. We also investigated the medical disciplines using ML to a large extent. Different ML methods were analyzed to understand their effectiveness in improving disease diagnosis.
However, this study has certain limitations. The first limitation is that this systematic review covered the years 2015 to 2020, i.e., a fixed duration. Also, it has to be noted that this study was carried out up to mid-2020. Still, through our results, we find that there is growing acceptance and adoption of ML in disease diagnosis over the years. The second limitation of our study is that we entirely excluded articles using fuzzy logic or image processing. In the future, we can include these techniques to get a generalized view of the impact of each of them in disease diagnosis. The third limitation of our study is that our investigation focused solely on the diagnosis of diseases. We did not include articles relating to prognosis or treatment path. In the future, researchers can investigate such articles to study the impact of ML in prognosis as well as in treatment path.
This study could provide basic knowledge for future studies. We excluded articles written in other languages and articles other than journal and conference papers. Thus, in the future we can consider these neglected resources for investigation, as studies of them could be valuable. Moreover, future work could identify relationships among multiple diseases and diagnose them simultaneously to benefit patients suffering from multiple diseases; investigate more parameters when building ML models; select models appropriately to decrease implementation time (e.g., CNN works better for image data); standardize data for unbiased results; and use deep learning and ensemble models for better results.