Open Access



Predicting COVID-19 Based on Environmental Factors With Machine Learning

Amjed Basil Abdulkareem1, Nor Samsiah Sani1,*, Shahnorbanun Sahran1, Zaid Abdi Alkareem Alyessari1, Afzan Adam1, Abdul Hadi Abd Rahman1, Abdulkarem Basil Abdulkarem2

1 Center For Artificial Intelligence Technology, Faculty of Information Science and Technology, The National University of Malaysia (UKM), Selangor, Malaysia
2 Al-Maarif University College, Ramadi, Iraq

* Corresponding Author: Nor Samsiah Sani.

(This article belongs to the Special Issue: Computational Intelligence for Internet of Medical Things and Big Data Analytics)

Intelligent Automation & Soft Computing 2021, 28(2), 305-320.


The coronavirus disease 2019 (COVID-19) has infected more than 50 million people in more than 100 countries, with a major global impact. Many studies on the potential role of environmental factors in the transmission of COVID-19 have been published, but the impact of these factors remains controversial. Machine learning techniques have been used effectively in combating the COVID-19 epidemic; however, research applying machine learning to the role of weather conditions in the spread of COVID-19 is generally lacking. Therefore, in this study, three machine learning models (Convolutional Neural Network (CNN), ADTree classifier and BayesNet) are developed based on confirmed cases and weather variables such as temperature, humidity, wind and precipitation. This study aims to identify the best model for classifying COVID-19 cases using significant weather features chosen by the Principal Component Analysis (PCA) feature selection method. The DS4C COVID-19 dataset is used to train and validate each machine learning model. Several data pre-processing tasks, such as data cleaning and feature selection, were conducted on the raw dataset to ensure the quality of the training data. The performance of the machine learning algorithms is then refined using the feature set selected by PCA. Each classifier is tuned over different parameter settings before the outputs of the three classifiers are compared against each other. The experimental results show that the optimized CNN classifier with seven weather variables selected by PCA achieved the highest performance among all the techniques. The results also show that the weather variables are more relevant in predicting confirmed cases than the other variables. Thus, the results indicate that temperature, humidity, wind and precipitation are important features for predicting COVID-19 confirmed cases.
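
The abstract does not say how PCA is used to choose features, so the following is only a minimal sketch of one common interpretation: standardize the weather variables, find the first principal component, and rank the variables by the magnitude of their loadings on it. The function names, the power-iteration approach and the synthetic readings are all assumptions made for illustration; none of them come from the paper or from the DS4C dataset.

```python
import math
import random


def standardize(rows):
    """Center each column and scale it to unit sample variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) or 1.0
        for j in range(d)
    ]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of rows that are already centered."""
    n, d = len(rows), len(rows[0])
    return [
        [sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_component(cov, iterations=500, seed=0):
    """Approximate the leading eigenvector of a symmetric matrix by power iteration."""
    d = len(cov)
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]  # random start vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def rank_features(names, rows):
    """Order feature names by absolute loading on the first principal component."""
    loadings = first_component(covariance(standardize(rows)))
    return sorted(zip(names, loadings), key=lambda p: abs(p[1]), reverse=True)


# Synthetic readings for illustration only (NOT taken from DS4C):
# humidity is constructed to track temperature; wind and precipitation are independent.
rng = random.Random(42)
names = ["temperature", "humidity", "wind", "precipitation"]
rows = []
for _ in range(200):
    temperature = rng.gauss(15.0, 8.0)
    humidity = 80.0 - 1.5 * temperature + rng.gauss(0.0, 3.0)
    wind = rng.gauss(3.0, 1.0)
    precipitation = rng.gauss(2.0, 1.0)
    rows.append([temperature, humidity, wind, precipitation])

ranking = rank_features(names, rows)
for name, loading in ranking:
    print(f"{name:>13}: {loading:+.3f}")
```

In a sketch like this, the top-ranked variables would then be passed to a classifier. Ranking by a single component is a simplification: a fuller treatment would weigh several components by their explained variance.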


Cite This Article

A. Basil Abdulkareem, N. Samsiah Sani, S. Sahran, Z. Abdi Alkareem Alyessari, A. Adam et al., "Predicting COVID-19 based on environmental factors with machine learning," Intelligent Automation & Soft Computing, vol. 28, no. 2, pp. 305–320, 2021.


This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.