SutteARIMA: A Novel Method for Forecasting the Infant Mortality Rate in Indonesia

Abstract: This study focuses on a novel forecasting method (SutteARIMA) and its application in predicting the infant mortality rate in Indonesia. It compares SutteARIMA with three widely used forecasting methods: ARIMA, Neural Network Time Series (NNAR), and Holt-Winters. The data, obtained from the World Bank website, consist of the annual infant mortality rate (per 1000 live births) from 1991 to 2019. To determine the most suitable method for predicting the infant mortality rate, the forecasting results of the four methods were compared using the mean absolute percentage error (MAPE) and the mean squared error (MSE). The results showed that the forecast errors of the SutteARIMA method (MAPE: 0.83%; MSE: 0.046) in predicting the infant mortality rate in Indonesia were smaller than those of the other three methods: ARIMA(0, 2, 2) with a MAPE of 1.21% and an MSE of 0.146; NNAR with a MAPE of 7.95% and an MSE of 3.90; and Holt-Winters with a MAPE of 1.03% and an MSE of 0.083.


Introduction
In this era of globalization and continuous industrial development, people want to obtain information as fast as possible. Statistics, a field concerned with acquiring information across many scientific disciplines, has advanced accordingly, and this advancement usually requires different methods for solving different problems. Statistics has long been used to deal with everyday problems in fields such as health, economics, the social sciences, and the atmospheric sciences. In addition, the development of data mining and big data analysis also requires an understanding of statistics. This is in line with Sivarajah, Kamal, Irani, and Weerakkody [1], whose classification is depicted in Fig. 1. Fig. 1 shows that several types of big data analytical methods, especially descriptive, inquisitive, and predictive analytics, require statistical analysis to obtain information. Furthermore, Grover & Mehra [2] stated that data mining is the application of statistics, in the form of exploratory analysis and modeling, to extract patterns and trends from large data sets.

Figure 1: Classification of big data analytical method types [1]

Statistics are usually used by data analysts to consider events that may recur. The likelihood of future events is strongly influenced by the frequency and regularity of events that have occurred in the past. This is in line with Edwards [3], who states that predictive analysis is data analysis that aims to make predictions about future events based on historical data and analysis techniques. In this sense, statistics connects past events with events that may recur in the future. The statistical method that is often used to obtain future information is known as forecasting or predictive analysis. Predictive analysis is often employed in economics, finance, health, and other fields [4,5].
In the health sector, forecasting is often used to evaluate the implementation, success, and failure of a health program or health service. Forecasting is also often used for planning and decision making about future activities. For example, Ranapurwala [6] applied predictive modeling to a public health problem, namely agricultural vehicle accidents, and concluded that forecasts can assist health policy makers (government, doctors, and health practitioners) in making decisions to improve public health; El Safty [7] and El Safty et al. [8] modeled coronavirus-related topics using a topological method. Forecasting methods can also be applied to other topics in the health sector, such as birth and death rates. The incidence of births and deaths in an area is commonly used as an indicator of the success of its health services and health development programs.
The infant mortality rate is one of the health problems in Indonesia that needs to be highlighted, because the infant mortality rate is one of the indicators commonly used in assessing public health. It is not surprising that many health programs in Indonesia focus on reducing infant mortality. In 2008, the infant mortality rate in Indonesia was still quite high, around 31/1000; in other words, 31 babies died for every 1,000 live births. This rate was higher than in Malaysia and Singapore, at 16.39/1000 and 2.3/1000 live births, respectively.
According to WHO data, in 2019 as many as 7000 newborns died every day globally, and 185 of these daily deaths occurred in Indonesia, which had an infant mortality rate of 24 per 1000 live births; 75% of neonatal deaths occurred in the first week, and 40% within the first 24 h [9]. One of the Sustainable Development Goals (SDGs) targets in the health sector of the Republic of Indonesia is to end preventable deaths of newborns and children under five by 2030, with all countries trying to reduce the neonatal mortality rate to at least as low as 12 per 1000 live births and the under-five mortality rate to at least as low as 25 per 1000. Given the importance of the infant mortality rate and of this target, a suitable statistical method is needed to provide information about the future so that infant mortality can be minimized. Forecasting is a statistical method suited to this problem. Because the infant mortality rate in Indonesia decreases from year to year, the data are assumed to follow a trend pattern. Forecasting methods that are suitable for data with a trend pattern include ARIMA [10], Neural Networks [11], Holt-Winters [11], and SutteARIMA [12]. SutteARIMA is used in this study because it is a new forecasting method that has shown a good level of accuracy on some forecasting data [12].

ARIMA Method
The Autoregressive Integrated Moving Average (ARIMA) model was introduced by George Box and Gwilym Jenkins in 1976, and their names are closely associated with the ARIMA process as applied to time series analysis, often called the Box-Jenkins ARIMA method. In general, the ARIMA model is written as ARIMA(p, d, q), where p is the order of the autoregressive (AR) process, d is the order of differencing, and q is the order of the moving average (MA) process.

AR Process
The autoregressive model is a form of regression that connects the observed values at a certain time with the values of previous observations at certain intervals [13].
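In general, the autoregressive process of order p, AR(p), is written as [14]:

$$Z_t = \phi_1 Z_{t-1} + \phi_2 Z_{t-2} + \cdots + \phi_p Z_{t-p} + a_t$$

This equation can be simplified into

$$\phi_p(B) Z_t = a_t, \qquad \phi_p(B) = 1 - \phi_1 B - \cdots - \phi_p B^p,$$

where $Z_t$ is the (mean-adjusted) observation at time t, $a_t$ is a white-noise term, and B is the backshift operator ($B Z_t = Z_{t-1}$). Because $\sum_{j=1}^{p} |\phi_j| < \infty$, this process is always invertible; for it to be stationary, the roots of $\phi_p(B) = 0$ must lie outside the unit circle.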
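The stationarity condition for an AR(p) process can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `ar_is_stationary` and the example coefficients are hypothetical and are not taken from the study.

```python
import numpy as np

def ar_is_stationary(phi):
    """Check whether all roots of 1 - phi_1*B - ... - phi_p*B^p lie outside the unit circle."""
    # np.roots expects coefficients ordered from the highest power of B down to the constant.
    coeffs = [-c for c in reversed(phi)] + [1.0]
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0))

# Hypothetical coefficient sets, for illustration only.
print(ar_is_stationary([0.5]))       # AR(1): root of 1 - 0.5B is B = 2
print(ar_is_stationary([1.2]))       # AR(1): root of 1 - 1.2B is B = 1/1.2
print(ar_is_stationary([0.5, 0.3]))  # AR(2)
```

The same check applied to the moving average polynomial would test invertibility rather than stationarity.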

MA Process
The moving average process describes phenomena in which an event produces an immediate effect that lasts only for a short period of time. The general moving average process of order q, MA(q), is as follows [13]:

$$Z_t = a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q} = \theta_q(B) a_t$$

with

$$\theta_q(B) = 1 - \theta_1 B - \cdots - \theta_q B^q,$$

where $a_t$ is a white-noise term and B is the backshift operator. Because $1 + \theta_1^2 + \cdots + \theta_q^2 < \infty$, a finite moving average process is always stationary. The process is invertible if the roots of $\theta_q(B) = 0$ lie outside the unit circle.

ARMA Process
The autoregressive moving average (ARMA) process is written as [13]:

$$\phi_p(B) Z_t = \theta_q(B) a_t$$

where

$$\phi_p(B) = 1 - \phi_1 B - \cdots - \phi_p B^p \quad \text{and} \quad \theta_q(B) = 1 - \theta_1 B - \cdots - \theta_q B^q.$$

For the process to be invertible, the roots of $\theta_q(B) = 0$ must lie outside the unit circle, and for it to be stationary, the roots of $\phi_p(B) = 0$ must lie outside the unit circle. It is also assumed that $\phi_p(B) = 0$ and $\theta_q(B) = 0$ have no roots in common. This process is referred to as the ARMA(p, q) model or process, where p and q denote the orders of the autoregressive and moving average polynomials, respectively.

ARIMA Process
The ARIMA process builds on the ARMA process, in which stationary and invertible processes can be represented either in moving average form or in autoregressive form. The AR, MA, and ARMA processes require the data to be stationary in both mean and variance. A time series is stationary in the mean if its level is relatively constant over time, and stationary in variance if the magnitude of its fluctuations remains constant over time. Nonstationarity in the mean is handled by differencing, and nonstationarity in variance by a power transformation with parameter λ. In ARIMA modeling, the variance is stabilized first and the mean second. Applying these steps to an ARMA process leads to the ARIMA process.
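The ARIMA process adds a differencing step to the ARMA process so that data that are nonstationary in the mean become stationary. With differencing of order d, the general model of the ARIMA(0, d, 0) process is:

$$(1 - B)^d Z_t = a_t$$

and from this equation, the general model of the ARIMA(p, d, q) process can be formed:

$$\phi_p(B)(1 - B)^d Z_t = \theta_q(B) a_t$$

with the stationary AR operator $\phi_p(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$ and the invertible MA operator $\theta_q(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$, where B is the backshift operator and $a_t$ is a white-noise term.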
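The effect of the differencing operator $(1 - B)^d$ can be illustrated on a synthetic series. The sketch below assumes NumPy; the helper name `difference` and the quadratic-trend series are hypothetical and are not the World Bank data.

```python
import numpy as np

def difference(series, d=1):
    """Apply the differencing operator (1 - B)^d to a series."""
    return np.diff(np.asarray(series, dtype=float), n=d)

# Hypothetical series with a smooth, decelerating downward trend.
t = np.arange(8)
z = 50.0 - 3.0 * t + 0.1 * t**2

first = difference(z, d=1)   # still trending: the mean is not constant
second = difference(z, d=2)  # constant: the quadratic trend has been removed
print(first)
print(second)
```

For a series like this one, a single difference leaves a linear drift, while a second difference yields a constant level, which is the situation in which d = 2 would be chosen.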

Stages in Forecasting with ARIMA(p, d, q)
Forecasting time series data with the ARIMA(p, d, q) method proceeds in stages. The stages are as follows [14].

1) Model Identification
Model identification examines the autocorrelation and stationarity of the data to determine whether a transformation or differencing is needed. This stage yields a provisional model, which is then tested to see whether it is appropriate for the data.

2) Model Assessment and Testing
After the model identification process has been carried out, the next step is to assess and test the model. This stage consists of two parts, namely parameter assessment and model diagnostic examination.

a) Parameter Assessment
After obtaining one or more provisional models, the next step is to find estimates for the parameters in that model.

b) Model Diagnostic
Diagnostic checking is done to check whether the estimated model is adequate for the existing data. Diagnostic checking is based on residual analysis. The basic assumption of the ARIMA model is that the residuals are independent random variables that are normally distributed with zero mean and constant variance.

(1) Independence Test
The independence test is performed using the Box-Pierce Q statistic, which is calculated as [14]:

$$Q = n \sum_{k=1}^{m} \rho_k^2$$

where:
n = number of observations
$\rho_k$ = residual autocorrelation at lag k, k = 1, 2, ..., m

If $Q < \chi^2_{m-p-q}$, the model is considered adequate; conversely, if $Q > \chi^2_{m-p-q}$, it is considered inadequate.

(2) Normality Test
Residual analysis is used to examine whether the residuals of the model are white noise. White noise is the basic assumption of the ARIMA model: the residuals are independent random variables that are normally distributed with zero mean and constant variance.
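The Box-Pierce statistic can be computed directly from the residual autocorrelations. The following is a minimal sketch, assuming NumPy; the function names, the simulated residuals, and the choice of m = 12 lags for a model with p = 0 and q = 2 are illustrative assumptions rather than the settings used in the study.

```python
import numpy as np

def sample_autocorr(residuals, max_lag):
    """Sample autocorrelations of the residuals at lags 1..max_lag."""
    e = np.asarray(residuals, dtype=float)
    e = e - e.mean()
    denom = np.sum(e**2)
    return np.array([np.sum(e[k:] * e[:-k]) / denom for k in range(1, max_lag + 1)])

def box_pierce_q(residuals, max_lag):
    """Box-Pierce statistic: Q = n * sum of squared autocorrelations up to max_lag."""
    n = len(residuals)
    rho = sample_autocorr(residuals, max_lag)
    return float(n * np.sum(rho**2))

# Simulated white-noise residuals (hypothetical, for illustration only).
rng = np.random.default_rng(0)
resid = rng.normal(size=200)

q_stat = box_pierce_q(resid, max_lag=12)
critical = 18.307  # chi-square 5% critical value with m - p - q = 12 - 0 - 2 = 10 degrees of freedom
print(q_stat, "adequate" if q_stat < critical else "inadequate")
```

The critical value is passed in as a constant here to keep the sketch dependency-free; in practice it would be looked up for the chosen significance level and degrees of freedom.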

Holt-Winters Method
Holt-Winters is a method for modeling and predicting the behavior of a time series, and it is one of the most widely used time series forecasting methods. Although it is decades old, it is still used in a variety of applications, including monitoring tasks such as anomaly detection and capacity planning. The Holt-Winters model uses three aspects of the time series: the level (a typical or average value), the trend, and the seasonality. Because it smooths these three components, Holt-Winters is also known as triple exponential smoothing. It uses three smoothing parameters, α, β, and γ, each of which takes a value between 0 and 1.
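Annual data have no seasonal period, so the level and trend components carry the forecast. The sketch below shows only the trend-only special case (Holt's linear method, with parameters α and β and no seasonal parameter γ); it is an illustration under that assumption, not the implementation used in the study, and the input series and parameter values are hypothetical.

```python
def holt_linear_forecast(y, alpha, beta, horizon):
    """Holt's linear (level + trend) exponential smoothing; returns `horizon` forecasts."""
    level = y[0]
    trend = y[1] - y[0]  # simple initial trend estimate
    for value in y[1:]:
        previous_level = level
        # Level update: blend the new observation with the previous one-step forecast.
        level = alpha * value + (1 - alpha) * (level + trend)
        # Trend update: blend the latest change in level with the previous trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]

# Hypothetical, perfectly linear declining series.
series = [40.0 - 1.5 * t for t in range(10)]
print(holt_linear_forecast(series, alpha=0.8, beta=0.2, horizon=3))
```

On an exactly linear series the method reproduces the line, so the forecasts continue the same slope; on real data the smoothing parameters control how quickly the level and trend adapt.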

Neural Network
An artificial neural network (ANN) is a system that processes information with characteristics and performance close to those of a biological neural network. Artificial neural networks are a generalization of mathematical models of biological neural networks under several assumptions. Neural networks are useful for estimation and regression analysis, including forecasting and modeling; for classification, including pattern recognition and sequence recognition; for decision making in sorting and processing data, including filtering, grouping, and compression; and for programming robots that move independently without human assistance.

According to Wuryandari et al. [16], an artificial neural network model is determined by:
(a) The pattern of relationships between neurons (network architecture)
(b) The method for determining and changing the connection weights, called the training method or network learning process
(c) The activation function

Artificial neural networks are also known as brain metaphors, computational neuroscience, and parallel distributed processing. Neural networks are used for complex non-linear forecasting. One neural network model suited to time series is the neural network autoregressive model (NNAR), in which lagged values of the time series are used as inputs to the network, just as lagged values are used in a linear autoregressive model. The NNAR model is generally denoted by NNAR(p, k), where p is the number of input lags and k is the number of neurons in the hidden layer; NNAR(p, P, k) is the general notation for seasonal data. For example, NNAR(4, 3) is a neural network that uses the four most recent observations (y t-1 , y t-2 , . . ., y t-4 ) as inputs to predict the forecast value (y t ), with three neurons in the hidden layer.

The NNAR model is a feed-forward neural network that combines linear combinations of its inputs with a nonlinear activation function.
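The structure of an NNAR(p, k) forward pass can be sketched as follows. This is an illustration of a common single-hidden-layer formulation with a sigmoid activation, assumed here rather than taken from the study; the function names are hypothetical, the weights are random and untrained, and the input series is synthetic.

```python
import numpy as np

def lag_matrix(y, p):
    """Build inputs (y_{t-1}, ..., y_{t-p}) and targets y_t for an autoregression of order p."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([y[p - i - 1 : len(y) - i - 1] for i in range(p)])
    return X, y[p:]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nnar_forward(X, W_hidden, b_hidden, w_out, b_out):
    """One hidden layer: linear combination, sigmoid activation, then a linear output."""
    hidden = sigmoid(X @ W_hidden + b_hidden)  # shape (n_samples, k)
    return hidden @ w_out + b_out              # shape (n_samples,)

# NNAR(4, 3): four lagged inputs, three hidden neurons (weights are random placeholders).
p, k = 4, 3
rng = np.random.default_rng(1)
y = np.linspace(30.0, 20.0, 12)  # synthetic declining series
X, target = lag_matrix(y, p)
pred = nnar_forward(X, rng.normal(size=(p, k)), np.zeros(k), rng.normal(size=k), 0.0)
print(X.shape, target.shape, pred.shape)
```

In a fitted model the weights would be estimated by minimizing the squared error between `pred` and `target`, and the inputs would normally be scaled before training.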

SutteARIMA Method
SutteARIMA is a short-term forecasting method developed by Ahmar et al. in 2019 [17]. It is a hybrid of the α-Sutte Indicator and ARIMA.

Dataset
In this paper, we use the annual time series "Mortality rate, infant (per 1,000 live births)" for Indonesia, obtained from the World Bank database. The data are available at: https://data.worldbank.org/indicator/SP.DYN.IMRT.IN?locations=ID. The World Bank website contains annual time series at various levels of aggregation from 1960 to 2019. For the analysis, the data are divided into two parts: training data (1960-2012) and testing data (2013-2019). The training data are used to obtain the forecasting models, and the testing data are used to assess the accuracy of the models obtained from the training data. The data were analyzed using four forecasting methods: ARIMA, Neural Network Time Series, Holt-Winters, and SutteARIMA. The analysis was carried out in R version 3.6.3 with the SutteForecastR package, together with Microsoft Excel 2010.
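The chronological split described above can be sketched as follows. The helper name `split_by_year` is hypothetical, and the values are placeholders standing in for the World Bank series, which is not reproduced here.

```python
def split_by_year(years, values, last_train_year):
    """Split a yearly series into training (<= last_train_year) and testing (> last_train_year) parts."""
    pairs = list(zip(years, values))
    train = [(y, v) for y, v in pairs if y <= last_train_year]
    test = [(y, v) for y, v in pairs if y > last_train_year]
    return train, test

years = list(range(1960, 2020))
values = [float(i) for i in range(len(years))]  # placeholder values, not World Bank data

train, test = split_by_year(years, values, last_train_year=2012)
print(len(train), len(test))  # 53 training years (1960-2012), 7 testing years (2013-2019)
```

Keeping the split chronological, rather than random, ensures that the models are evaluated only on years that come after the data they were fitted on.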

Forecast Accuracy
For the testing data, two forecast accuracy indicators are used to assess the goodness of fit and the accuracy of the forecasting results. The indicators are as follows [18].

- Mean Absolute Percentage Error (MAPE)

$$\text{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right| \times 100\%$$

- Mean Square Error (MSE)

$$\text{MSE} = \frac{1}{n} \sum_{t=1}^{n} (A_t - F_t)^2$$

where:
$A_t$ = actual value at time t
$F_t$ = forecast value at time t
n = number of observations in the testing data
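Both accuracy measures are straightforward to compute from paired actual and forecast values. The sketch below uses hypothetical numbers chosen for easy arithmetic; they are not results from the study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

def mse(actual, forecast):
    """Mean squared error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

# Hypothetical values: absolute percentage errors are 5% and 4%, squared errors are 1 and 1.
actual = [20.0, 25.0]
forecast = [21.0, 24.0]
print(mape(actual, forecast))  # approximately 4.5
print(mse(actual, forecast))   # 1.0
```

MAPE is scale-free and expressed in percent, whereas MSE is in squared units of the series and penalizes large errors more heavily.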

Results and Discussion
The infant mortality rate data show a downward trend, decreasing every year (see Fig. 2).

Model Specification
To obtain the forecasting models and forecasting results for the ARIMA, Neural Network Time Series, Holt-Winters, and SutteARIMA methods, we use the alpha.sutte function in the SutteForecastR package in R; the output from R is presented as follows.

Estimating the Forecasting Model
After the forecasting models are obtained in the model specification step, the forecasting results for the testing data are shown in Tabs. 1-4. The forecasts are compared with the testing data to obtain the absolute percentage error (APE) and the squared error (SE).

Model's Forecasting Performance Comparison
A comparison of the forecasting results of the four methods on the testing data is presented in Fig. 3. Fig. 3 shows that SutteARIMA has the highest level of accuracy in predicting infant mortality rates, followed by Holt-Winters, ARIMA(0, 2, 2), and NNAR(1, 1). The forecasts of each method for the testing data are plotted in Fig. 4. Fig. 4 shows that the forecasts of the ARIMA(0, 2, 2), Holt-Winters, and SutteARIMA methods track one another closely, consistent with their MSE and MAPE results. These three methods stay close to the testing data, unlike NNAR(1, 1), whose forecasts deviate from it. Based on these results, SutteARIMA is used to forecast the next 5 years (Tab. 5).

The forecasts in Tab. 5 show that the infant mortality rate decreases from year to year. This result is in line with Mishra et al. [10], Kurniasih et al. [19], and Hussein [20], who reported that the infant mortality rate has decreased from year to year.
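The ranking of the four methods can be reproduced from the error measures reported in the abstract. The sketch below simply sorts those reported values; the dictionary layout and variable names are illustrative.

```python
# MAPE (%) and MSE on the testing data, as reported in the abstract.
reported = {
    "SutteARIMA": {"mape": 0.83, "mse": 0.046},
    "ARIMA(0, 2, 2)": {"mape": 1.21, "mse": 0.146},
    "NNAR(1, 1)": {"mape": 7.95, "mse": 3.90},
    "Holt-Winters": {"mape": 1.03, "mse": 0.083},
}

# Smaller error means higher accuracy, so sort in ascending order.
ranking_by_mape = sorted(reported, key=lambda m: reported[m]["mape"])
ranking_by_mse = sorted(reported, key=lambda m: reported[m]["mse"])

print(ranking_by_mape)
print(ranking_by_mse)
```

Both measures give the same order here, which is why the comparison does not depend on which of the two indicators is used.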

Conclusion and Impact
The purpose of this research is to model the infant mortality rate data and to find the best model for predicting it. To achieve this goal, four models (ARIMA, Holt-Winters, Neural Network Time Series, and SutteARIMA) are used to predict the infant mortality rate. To determine which model is most suitable and precise, the MAPE and MSE values of each forecasting method are calculated and compared. Based on the findings of this study, the most suitable model, with the smallest forecast errors on the infant mortality data, is SutteARIMA, followed by Holt-Winters, ARIMA, and NNAR. Both the data trend and the forecast results indicate that the infant mortality rate is decreasing from year to year. The SutteARIMA method gives an estimated infant mortality rate of 19.7557 for 2020 and 17.9185 for 2024, a decline from 2019. These findings have the potential to help promote policies that address and minimize infant mortality in the coming years, and they can serve as a basis for implementing appropriate strategies so that Indonesia's SDGs targets can be achieved. Although the infant mortality rate can be predicted with a satisfactory level of accuracy, the forecasts may still prove imprecise because of human behavior and the decisions taken by policy makers.