Open Access
ARTICLE
A Two-Step Algorithm to Estimate Variable Importance for Multi-State Data: An Application to COVID-19
Behnaz Alafchi 1, Leili Tapak 1,*, Hassan Doosti 2, Christophe Chesneau 3, Ghodratollah Roshanaei 1
1 Department of Biostatistics, School of Public Health, Modeling of Noncommunicable Diseases Research Center, Hamadan University of Medical Sciences, Hamadan, Iran
2 School of Mathematical and Physical Sciences, Macquarie University, Sydney, Australia
3 Department of Mathematics, LMNO, University of Caen-Normandie, Caen, France
* Corresponding Author: Leili Tapak. Email:
(This article belongs to the Special Issue: New Trends in Statistical Computing and Data Science)
Computer Modeling in Engineering & Sciences 2023, 135(3), 2047-2064. https://doi.org/10.32604/cmes.2022.022647
Received 18 March 2022; Accepted 07 July 2022; Issue published 23 November 2022
Abstract
Survival data with a multi-state structure are frequently observed in follow-up studies. An analytic approach based
on a multi-state model (MSM) should be used in longitudinal health studies in which a patient experiences a
sequence of clinical progression events. One main objective in the MSM framework is variable selection, where
attempts are made to identify the risk factors associated with the transition hazard rates or probabilities of disease
progression. The usual variable selection methods, including stepwise and penalized methods, do not provide
information about the importance of variables. In this context, we present a two-step algorithm to evaluate the
importance of variables for multi-state data. Three widely used machine learning approaches (random forest, gradient
boosting, and neural network) are considered to estimate variable importance, in order to identify the factors
affecting disease progression and rank them by importance.
The performance of the proposed methods is evaluated by simulation, and the methods are applied to a COVID-19 data set.
The results show that the proposed two-step method performs promisingly in estimating variable
importance.
Cite This Article
Alafchi, B., Tapak, L., Doosti, H., Chesneau, C., Roshanaei, G. (2023). A Two-Step Algorithm to Estimate Variable Importance for Multi-State Data: An Application to COVID-19.
CMES-Computer Modeling in Engineering & Sciences, 135(3), 2047–2064.