TY  - EJOU
AU  - Abdulkareem, Amjed Basil
AU  - Sani, Nor Samsiah
AU  - Sahran, Shahnorbanun
AU  - Alyessari, Zaid Abdi Alkareem
AU  - Adam, Afzan
AU  - Rahman, Abdul Hadi Abd
AU  - Abdulkarem, Basil
TI  - Predicting COVID-19 Based on Environmental Factors With Machine Learning
T2  - Intelligent Automation & Soft Computing
PY  - 2021
VL  - 28
IS  - 2
SN  - 2326-005X
AB  - The coronavirus disease 2019 (COVID-19) has infected more than 50 million people in more than 100 countries, resulting in a major global impact. Many studies on the potential roles of environmental factors in the transmission of COVID-19 have been published. However, the impact of environmental factors on COVID-19 remains controversial. Machine learning techniques have been used effectively in combating the COVID-19 epidemic, but research applying machine learning to the role of weather conditions in the spread of COVID-19 is generally lacking. Therefore, in this study, three machine learning models (Convolutional Neural Network (CNN), ADtree Classifier and BayesNet) are developed based on the confirmed cases and weather variables such as temperature, humidity, wind and precipitation. This study aims to identify the best classification model to classify COVID-19 by using significant weather features chosen by the Principal Component Analysis (PCA) feature selection method. The DS4C COVID-19 dataset is used to train and validate each machine learning model. Several data pre-processing tasks, such as data cleaning and feature selection, have been conducted on the raw dataset to ensure the quality of the training data. The performance of these machine learning algorithms is further improved using the feature set selected by PCA. Each classifier is then optimized using different tuning parameters before the outputs of the three classifiers are compared against each other.
The experimental results show that the optimized CNN classifier with seven weather variables selected by PCA achieved the highest performance among all the techniques. The results also show that the weather variables are more relevant than the other variables in predicting the confirmed cases. Thus, temperature, humidity, wind and precipitation are important features for predicting COVID-19 confirmed cases.
KW  - Machine learning
KW  - deep learning
KW  - classification
KW  - COVID-19
KW  - CNN
KW  - Naive Bayes
KW  - ADtree
DO  - 10.32604/iasc.2021.015413