To overcome the deficiencies of traditional mathematical statistics methods, this paper proposes an adaptive Lasso grey model algorithm for regional FDI (foreign direct investment) prediction and analyzes its validity. Firstly, the characteristics of the FDI data in six provinces of Central China are summarized, and the constituent variables of the mixture model, namely the Lasso problem and the grey model, are defined. Next, based on the influencing factors of regional FDI statistics (mean values of regional FDI and median values of regional FDI), an adaptive Lasso grey model algorithm for regional FDI is established. Then, an application test in Central China is taken as a case study to illustrate the feasibility of the algorithm in regional FDI prediction. RMSE (root mean square error) and MAE (mean absolute error) are selected to demonstrate the convergence and validity of the algorithm. Finally, we train the proposed algorithm on the regional FDI statistical data of the six provinces in Central China from 2006 to 2018, and then use it to predict the regional FDI statistical data from 2019 to 2023 and show its changing tendency. The extension of the adaptive Lasso grey model algorithm and its procedure to other regional economic fields is also discussed.
Keywords: Adaptive Lasso grey model algorithm; regional FDI statistics; mean value of regional FDI; median value of regional FDI
Introduction
Economic development varies from country to country, and the influencing factors of the regional FDI are also varied [1–3]. The regional FDI statistics can accurately and effectively describe the relationship among basic situation, influencing factors and investment trend of regional FDI.
How to effectively predict regional FDI statistics to improve the regional economy is a complicated problem. In the past decade, many traditional statistical methods [4–7] have been proposed to solve this problem. Being empirical or semi-empirical, these models can provide neither specific assumptions nor sufficient statistical data.
The Lasso method can effectively overcome the above deficiency of the traditional statistical methods. In this method, proper variables with a significant impact can be selected to reduce the complexity of data [8] and display the influence of all variables on the estimated parameters [9]. However, the Lasso method has some defects in precision. The adaptive Lasso method [10] assigns different weights to different coefficients to improve the accuracy of the estimated parameters.
In recent years, many studies have shown that the grey theory is a valid method that can correctly predict the properties in some fields [11–13] by mining some available information and extracting valuable key information. The regional FDI system is a typical grey system suitable for the grey model with both the evident hierarchy complexity and the constant change, and its index characteristic data is uncertain and incomplete [14–21]. Therefore, it is feasible to combine the adaptive Lasso method and the grey model, i.e., to establish an adaptive Lasso grey algorithm to predict regional FDI statistics.
Adaptive Lasso Grey Model Algorithms Predicting Regional FDI Statistics
Many methods [22,23] have been proposed to solve the Lasso problem. However, these methods are suited to large data sets rather than small-sample problems. Therefore, the adaptive Lasso model [24] and the grey model [25–29] are needed to precisely calculate the predicted value. Based on characteristics [30,31] and the regional FDI statistics variables, the main algorithm in this paper is described as follows.
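According to the data of regional FDI, let $x_1, x_2, \ldots, x_p$ represent the factors, and let $n$ denote the sample number. The sample matrix can be described as

$$X=\begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix},$$

where $x_j=[x_{1j}, x_{2j}, \ldots, x_{nj}]^T$. The FDI statistics are represented as $y=[y_1, y_2, \ldots, y_n]^T$, and the coefficient vector as $\beta=[\beta_1, \beta_2, \ldots, \beta_p]^T$.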
Adaptive Lasso Grey Model Algorithm Predicting Regional FDI Statistics:
Step 1: Investigate the possible factors of regional FDI and obtain their specific data.
Step 2: Set the value range of the variables $\{x_1, x_2, \ldots, x_p\}$ of the sample matrix and the upper limit $T$ on the number of learning iterations.
Step 3: Specify the required statistics (the regional FDI data and the mean and median values of the influencing factors) and get the statistical matrix $X$.
Step 4: Initialize $\beta$ by solving the least-squares estimation problem $y = X\beta$ for $\beta$.
Step 5: Compute the weight vector:
$$\hat{\omega}_j=\frac{1}{\beta_j}, \quad j=1,2,\ldots,p.$$
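Step 6: For the adaptive Lasso model $\min_{\beta}\left\{\|y-X\beta\|^2+\lambda\sum_{j=1}^{p}\hat{\omega}_j|\beta_j|\right\}$, set $x_j^*=\frac{x_j}{\hat{\omega}_j}$ $(j=1,2,\ldots,p)$, and establish the substituted model:

$$\min_{\beta}\left\{\left\|y-\sum_{j=1}^{p}x_j^*(\hat{\omega}_j\beta_j)\right\|^2+\lambda\sum_{j=1}^{p}\hat{\omega}_j|\beta_j|\right\}.$$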
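Step 7: Set $\beta^*=\hat{\omega}\beta$ (componentwise, $\beta_j^*=\hat{\omega}_j\beta_j$), $f(\beta^*)=\left\|y-\sum_{j=1}^{p}x_j^*\beta_j^*\right\|^2$ and $g(\beta^*)=\sum_{j=1}^{p}|\beta_j^*|=\|\beta^*\|_1$, where $\beta_k^*$ is the $k$th iterate of $\beta^*$, then compute $\beta_{k+1}^*$: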
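Step 8: Let $F(\beta^*)=\frac{L}{2}\sum_{j=1}^{p}(\beta_j^*-z_j)^2+\lambda\sum_{j=1}^{p}|\beta_j^*|$. For $\beta_j^*$ (the $j$th component of $\beta^*$), obtain the optimal value from $\frac{\partial F(\beta^*)}{\partial \beta_j^*}=0$, then compute:

$$\beta_j^*=\operatorname{sgn}(z_j)\cdot\max\left(|z_j|-\frac{\lambda}{L},\,0\right).$$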
Step 9: If $\left|\left(f(\beta_{k+1}^*)+\lambda g(\beta_{k+1}^*)\right)-\left(f(\beta_k^*)+\lambda g(\beta_k^*)\right)\right|<10^{-4}$ or the learning number reaches $T$, end the algorithm; otherwise, jump to Step 4.
Step 10: Establish the adaptive Lasso model according to $\hat{\beta}_j^*=\frac{\beta_j^*}{\hat{\omega}_j}$, $j=1,2,\ldots,p$.
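Step 11: Select a factor $x_j$ whose coefficient $\hat{\beta}_j^*$ is non-zero, get $x_j=[x_{1j}, x_{2j}, \ldots, x_{nj}]^T$, set $\varphi_i^{(0)}=x_{ij}$, $i=1,2,\ldots,n$, and compute the accumulated sequence $\varphi^{(1)}=(\varphi_1^{(1)}, \varphi_2^{(1)}, \ldots, \varphi_n^{(1)})$, where

$$\varphi_r^{(1)}=\sum_{i=1}^{r}\varphi_i^{(0)}=\varphi_{r-1}^{(1)}+\varphi_r^{(0)}, \quad r=1,2,\ldots,n.$$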
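Step 12: Substitute $\varphi^{(1)}$ into the grey model $\frac{d\varphi^{(1)}}{dt}+a\varphi^{(1)}=b$, with $\varphi=B\phi$, where

$$\varphi=\begin{bmatrix}\varphi_2^{(0)}\\ \varphi_3^{(0)}\\ \vdots\\ \varphi_n^{(0)}\end{bmatrix},\quad B=\begin{bmatrix}-\frac{1}{2}\left[\varphi_2^{(1)}+\varphi_1^{(1)}\right] & 1\\ -\frac{1}{2}\left[\varphi_3^{(1)}+\varphi_2^{(1)}\right] & 1\\ \vdots & \vdots\\ -\frac{1}{2}\left[\varphi_n^{(1)}+\varphi_{n-1}^{(1)}\right] & 1\end{bmatrix},\quad \phi=\begin{bmatrix}a\\ b\end{bmatrix}.$$

Compute $\phi$ through

$$\begin{bmatrix}a\\ b\end{bmatrix}=\begin{bmatrix}\hat{a}\\ \hat{b}\end{bmatrix}=(B^TB)^{-1}B^T\varphi.$$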
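Step 13: Solve $\frac{d\varphi^{(1)}}{dt}+a\varphi^{(1)}=b$ to get $\varphi_t^{(1)}=\left(\varphi_1^{(1)}-\frac{b}{a}\right)e^{-a(t-1)}+\frac{b}{a}$. Set $r=t-1$, compute $\varphi_{r+1}^{(1)}=\left(\varphi_1^{(1)}-\frac{b}{a}\right)e^{-ar}+\frac{b}{a}$, and restore $\varphi_{r+1}^{(1)}$ to $\varphi_{r+1}^{(0)}$ by inverting the accumulation of Step 11:

$$\varphi_{r+1}^{(0)}=\varphi_{r+1}^{(1)}-\varphi_r^{(1)}.$$

Step 14: If $r=1,2,\ldots,n-1$, $\varphi_{r+1}^{(0)}$ is the fitted value; otherwise ($r\ge n$), it is the predicted value.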
Step 15: For all factors $x_j$ with non-zero $\hat{\beta}_j^*$, repeat Steps 11–14 to predict the values of $x_j$ $(j=1,2,\ldots,p)$ in the years to be predicted.
Step 16: Establish the adaptive Lasso grey model (Step 10) and compute the regional FDI statistics $y$ for future years.
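The adaptive Lasso part of the algorithm (Steps 4–10) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the quantity $z$ is taken to be the usual proximal-gradient point $z=\beta_k^*-\nabla f(\beta_k^*)/L$, with $L$ the Lipschitz constant of $\nabla f$; the weights use the absolute value of the least-squares estimate plus a small constant `eps` so that they stay positive and finite; only the proximal step is repeated inside the loop. The function names `soft_threshold` and `adaptive_lasso` and the parameter `lam` (for $\lambda$) are hypothetical.

```python
import numpy as np


def soft_threshold(z, t):
    """Step 8: componentwise sgn(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def adaptive_lasso(X, y, lam, T=5000, tol=1e-4, eps=1e-8):
    """Sketch of Steps 4-10; returns coefficients on the original scale."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Step 4: least-squares initial estimate of beta.
    beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Step 5: weights. The absolute value and eps are assumptions of this
    # sketch; they keep the weights positive and finite.
    w = 1.0 / (np.abs(beta_ls) + eps)

    # Step 6: rescaled columns x_j* = x_j / w_j.
    X_star = X / w

    # Assumed constant L: Lipschitz constant of the gradient of
    # f(b) = ||y - X* b||^2, i.e. twice the squared spectral norm of X*.
    L = 2.0 * np.linalg.norm(X_star, 2) ** 2

    def objective(b):
        return np.sum((y - X_star @ b) ** 2) + lam * np.sum(np.abs(b))

    b = np.zeros(X.shape[1])
    obj = objective(b)
    for _ in range(T):
        grad = -2.0 * X_star.T @ (y - X_star @ b)
        z = b - grad / L                 # assumed definition of z
        b = soft_threshold(z, lam / L)   # Step 8
        new_obj = objective(b)
        if abs(new_obj - obj) < tol:     # Step 9 stopping rule
            break
        obj = new_obj

    # Step 10: map beta* back to the original scale, beta_j = beta_j* / w_j.
    return b / w
```

A call such as `adaptive_lasso(X, y, lam=1.0)` returns a length-$p$ vector in which factors with little explanatory power tend to receive a coefficient of exactly zero; the value of `lam` here is only a placeholder and would normally be tuned.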
A Case Study
Among the many forms of regional FDI statistics, this paper only considers the mean and median values to illustrate the feasibility and effectiveness of the proposed algorithm. Taking six provinces of Central China as the case for study, their overall regional FDI capacity is judged through numerical analysis of regional FDI [32], which provides a reference for formulating related policies.
In this case study, we select the data of regional FDI from 2006 to 2018, namely the annual regional GDP (x1), the average wages (x2), the total investment value in fixed assets (x3), the highway mileage (x4), the total import and export trade value (x5), the ratio of the industrial added value increment (x6), the expenditure of government personnel (x7), the total freight (x8), the total retail sales of the consumer goods (x9), the number of patents (x10), the proportion of the fiscal expenditure in GDP (x11), the number of the designated size industries (x12), the number of students in higher education (x13) and the amount of FDI inflows in the previous five years (x14). The mean and median values of the above 14 factors are taken as the input data of the algorithm. To verify the feasibility and effectiveness of the algorithm, 80% of the samples are randomly selected as training samples, and the remaining 20% as testing samples. The natural logarithm of the time series data is taken to eliminate the influence of differing characteristics of the data. Note that, due to limited space, we only show the mean and median values, in Tabs. 1 and 2, respectively. For specific data, please refer to the China Statistical Yearbook and the Provincial Statistical Yearbooks in China.
Table 1: Mean values and factors of regional FDI in Central China

| No. | y | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 | x11 | x12 | x13 | x14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 13.73 | 10.05 | 10.84 | 10.06 | 2.99 | 15.16 | 4.68 | 4.27 | 12.14 | 9.18 | 10.34 | 3.11 | 9.43 | 4.75 | 14.99 |
| 2 | 12.60 | 9.23 | 10.05 | 8.66 | 2.81 | 14.33 | 4.47 | 3.26 | 11.64 | 8.17 | 8.45 | 2.76 | 9.22 | 4.48 | 13.67 |
| 3 | 13.79 | 10.23 | 11.02 | 10.07 | 3.04 | 15.27 | 4.67 | 4.41 | 12.30 | 9.39 | 10.49 | 3.07 | 9.44 | 4.80 | 15.26 |
| 4 | 13.53 | 9.92 | 10.67 | 9.75 | 2.95 | 15.04 | 4.55 | 4.10 | 12.12 | 8.96 | 9.95 | 3.02 | 9.34 | 4.68 | 14.64 |
| 5 | 12.49 | 9.03 | 9.90 | 8.40 | 2.78 | 14.05 | 4.47 | 3.02 | 11.31 | 7.96 | 8.28 | 2.70 | 8.99 | 4.41 | 13.38 |
| 6 | 13.65 | 9.99 | 10.76 | 9.91 | 2.97 | 15.15 | 4.43 | 4.16 | 12.21 | 9.08 | 10.03 | 3.02 | 9.39 | 4.71 | 14.81 |
| 7 | 12.81 | 9.52 | 10.32 | 9.17 | 2.88 | 14.47 | 4.50 | 3.61 | 11.86 | 8.50 | 9.26 | 2.89 | 9.35 | 4.58 | 14.08 |
| 8 | 13.19 | 9.72 | 10.48 | 9.33 | 2.90 | 14.78 | 4.51 | 3.84 | 11.99 | 8.68 | 9.49 | 2.94 | 9.16 | 4.61 | 14.25 |
| 9 | 13.38 | 9.83 | 10.60 | 9.55 | 2.93 | 14.92 | 4.51 | 3.97 | 12.11 | 8.83 | 9.82 | 3.00 | 9.25 | 4.65 | 14.46 |
| 10 | 12.59 | 9.32 | 10.18 | 8.97 | 2.83 | 14.07 | 4.56 | 3.44 | 11.71 | 8.32 | 8.80 | 2.90 | 9.27 | 4.54 | 13.89 |
| 11 | 13.78 | 10.12 | 10.92 | 10.12 | 3.02 | 15.12 | 4.54 | 4.34 | 12.19 | 9.29 | 10.41 | 3.09 | 9.45 | 4.78 | 15.15 |
| 12 | 12.03 | 8.84 | 9.72 | 8.13 | 2.76 | 13.72 | 4.54 | 2.81 | 11.19 | 7.78 | 7.99 | 2.67 | 8.87 | 4.34 | 13.16 |
| 13 | 13.91 | 10.32 | 11.11 | 10.16 | 3.04 | 15.40 | 4.58 | 4.51 | 12.38 | 9.44 | 10.79 | 3.08 | 9.47 | 4.83 | 15.35 |
Table 2: Median values and factors of regional FDI in Central China

| No. | y | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 | x11 | x12 | x13 | x14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 13.86 | 10.13 | 10.87 | 10.12 | 3.05 | 15.31 | 4.55 | 4.25 | 12.08 | 9.24 | 10.50 | 3.10 | 9.63 | 4.75 | 15.09 |
| 2 | 12.78 | 9.22 | 10.08 | 8.65 | 2.81 | 14.33 | 4.46 | 3.30 | 11.71 | 8.18 | 8.55 | 2.79 | 9.37 | 4.47 | 13.72 |
| 3 | 14.07 | 10.32 | 11.04 | 10.32 | 3.09 | 15.39 | 4.65 | 4.41 | 12.24 | 9.46 | 10.64 | 3.07 | 9.63 | 4.79 | 15.35 |
| 4 | 13.61 | 9.98 | 10.69 | 9.82 | 2.99 | 15.13 | 4.54 | 4.11 | 12.04 | 9.01 | 10.18 | 3.04 | 9.55 | 4.68 | 14.79 |
| 5 | 12.62 | 9.03 | 9.91 | 8.39 | 2.78 | 14.03 | 4.47 | 3.07 | 11.42 | 7.97 | 8.39 | 2.70 | 9.05 | 4.43 | 13.39 |
| 6 | 13.74 | 10.07 | 10.78 | 9.98 | 3.01 | 15.28 | 4.53 | 4.16 | 12.11 | 9.13 | 10.22 | 3.02 | 9.60 | 4.71 | 14.94 |
| 7 | 13.13 | 9.55 | 10.29 | 9.24 | 2.87 | 14.55 | 4.49 | 3.64 | 11.82 | 8.53 | 9.61 | 2.92 | 9.61 | 4.60 | 14.20 |
| 8 | 13.32 | 9.76 | 10.46 | 9.37 | 2.88 | 14.98 | 4.51 | 3.85 | 11.92 | 8.73 | 9.77 | 2.97 | 9.35 | 4.63 | 14.39 |
| 9 | 13.47 | 9.88 | 10.59 | 9.60 | 2.94 | 15.02 | 4.50 | 3.99 | 12.02 | 8.87 | 10.08 | 3.03 | 9.44 | 4.66 | 14.60 |
| 10 | 12.89 | 9.35 | 10.18 | 8.98 | 2.83 | 14.15 | 4.55 | 3.44 | 11.69 | 8.33 | 9.04 | 2.92 | 9.52 | 4.55 | 13.99 |
| 11 | 13.96 | 10.22 | 10.94 | 10.19 | 3.08 | 15.24 | 4.54 | 4.35 | 12.13 | 9.36 | 10.54 | 3.07 | 9.64 | 4.77 | 15.23 |
| 12 | 12.27 | 8.83 | 9.71 | 8.13 | 2.77 | 13.71 | 4.55 | 2.83 | 11.29 | 7.80 | 8.09 | 2.64 | 8.86 | 4.38 | 13.11 |
| 13 | 14.17 | 10.41 | 11.12 | 10.42 | 3.10 | 15.47 | 4.59 | 4.49 | 12.30 | 9.53 | 10.85 | 3.06 | 9.66 | 4.81 | 15.47 |
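The preprocessing described for the case study (natural logarithm of the series, then a random 80%/20% split into training and testing samples) might look as follows in Python. This is only a sketch: the function name `log_transform_and_split`, the seed, and the rounding rule for the training size are assumptions, and the input is expected to be raw, strictly positive data with one row per sample.

```python
import numpy as np


def log_transform_and_split(raw, train_frac=0.8, seed=0):
    """Take natural logs, then randomly split the rows into train/test sets."""
    data = np.log(np.asarray(raw, dtype=float))  # raw values must be > 0
    rng = np.random.default_rng(seed)            # fixed seed for repeatability
    idx = rng.permutation(len(data))             # random order of the samples
    n_train = int(round(train_frac * len(data)))
    return data[idx[:n_train]], data[idx[n_train:]]
```

With 13 samples, this rounding rule yields 10 training samples and 3 testing samples.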
By the above algorithm of the adaptive Lasso, the estimated coefficients of the specific data for regional FDI in Central China are computed and outlined in Tab. 3.
Table 3: Mean/median adaptive Lasso estimation coefficients of regional FDI in Central China

| | β1 | β2 | β3 | β4 | β5 | β6 | β7 | β8 | β9 | β10 | β11 | β12 | β13 | β14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Based on mean values | 0.0000 | 0.0798 | 0.4013 | 0.0000 | 0.7393 | 0.0000 | 0.0000 | 0.3314 | 0.0000 | 0.0000 | 0.4883 | 0.4537 | 0.0000 | 0.0000 |
| Based on median values | 0.0000 | 0.1474 | 0.0000 | 1.2842 | 0.1035 | −0.0877 | 0.3749 | −0.1131 | 0.0000 | 0.0000 | −0.6243 | 0.0000 | 1.5757 | 0.0000 |
It can be seen from the second line in Tab. 3 that x1, x4, x6, x7, x9, x10, x13 and x14 are eliminated from the model for the mean value of regional FDI because their coefficients are all 0. Similarly, the third line shows that x1, x3, x9, x10, x12 and x14 are eliminated from the model for the median value of regional FDI. Also, the factors retained for the mean and their impact intensities differ from those retained for the median. The algorithm can thus eliminate variables, which is an advantage when many indicators are involved.
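The selection of factors can be checked directly from the coefficients in Tab. 3. The short Python sketch below lists the factors with non-zero coefficients; the helper name `retained` is hypothetical, and the coefficient values are copied from Tab. 3.

```python
# Coefficients beta_1..beta_14 copied from Tab. 3.
mean_coef = [0.0000, 0.0798, 0.4013, 0.0000, 0.7393, 0.0000, 0.0000,
             0.3314, 0.0000, 0.0000, 0.4883, 0.4537, 0.0000, 0.0000]
median_coef = [0.0000, 0.1474, 0.0000, 1.2842, 0.1035, -0.0877, 0.3749,
               -0.1131, 0.0000, 0.0000, -0.6243, 0.0000, 1.5757, 0.0000]


def retained(coefs):
    """Return the names of the factors whose coefficient is non-zero."""
    return [f"x{j}" for j, b in enumerate(coefs, start=1) if b != 0.0]


print(retained(mean_coef))    # ['x2', 'x3', 'x5', 'x8', 'x11', 'x12']
print(retained(median_coef))  # ['x2', 'x4', 'x5', 'x6', 'x7', 'x8', 'x11', 'x13']
```

Six factors are retained for the mean value and eight for the median value.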
In order to verify the effectiveness and rationality of the adaptive Lasso grey model algorithm, RMSE (the root mean square error) [22] and MAE (the mean absolute error) are selected to evaluate it. Let $h(x_i)$ $(i=1,2,\ldots,n)$ be the computed results and $y_i$ $(i=1,2,\ldots,n)$ be the actual values. RMSE and MAE can then be represented as follows:
$$\mathrm{RMSE}=\sqrt{\frac{\sum_{i=1}^{n}\left(h(x_i)-y_i\right)^2}{n}},$$

$$\mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|h(x_i)-y_i\right|.$$
RMSE and MAE can be computed by this algorithm, and the results are shown in Tab. 4. It can be found that RMSE and MAE are relatively small, indicating that the selected variables can well reflect the factors related to the regional FDI statistics.
Table 4: Error analysis

| FDI statistics | RMSE | MAE |
|---|---|---|
| Mean | 0.0949 | 0.0830 |
| Median | 0.0893 | 0.0626 |
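The two error measures defined above are straightforward to compute. The following Python sketch uses hypothetical function names `rmse` and `mae`, and the numbers in the usage lines are made up for illustration only; they are not the case-study data.

```python
import numpy as np


def rmse(pred, actual):
    """Root mean square error between computed and actual values."""
    diff = np.asarray(pred, dtype=float) - np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def mae(pred, actual):
    """Mean absolute error between computed and actual values."""
    diff = np.asarray(pred, dtype=float) - np.asarray(actual, dtype=float)
    return float(np.mean(np.abs(diff)))


# Illustrative values only: the errors are 0, 0 and -2.
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))  # sqrt(4/3), about 1.1547
print(mae([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))   # 2/3, about 0.6667
```

Because RMSE squares the errors before averaging, it is never smaller than MAE and reacts more strongly to a few large deviations.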
Based on the coefficients in Tab. 3, we select six primary factors affecting the mean value of FDI and eight main factors affecting the median value of FDI, and use the remaining part of the algorithm to predict the factors affecting regional FDI statistics from 2019 to 2023. The predicted values and the prediction accuracy are shown in Tabs. 5 and 6. The predicted and actual values of the affecting variables are obtained through Python and plotted in Figs. 1 and 2.
Table 5: Accuracy of related factors for mean values of regional FDI in GM (1, 1)

| Year | x1 | x3 | x5 | x8 | x11 | x12 |
|---|---|---|---|---|---|---|
| 2019 | 11.28 | 10.60 | 15.62 | 12.53 | 3.19 | 9.54 |
| 2020 | 11.40 | 10.78 | 15.75 | 12.61 | 3.22 | 9.57 |
| 2021 | 11.51 | 10.96 | 15.88 | 12.70 | 3.26 | 9.61 |
| 2022 | 11.63 | 11.15 | 16.02 | 12.79 | 3.30 | 9.64 |
| 2023 | 11.75 | 11.34 | 16.15 | 12.87 | 3.33 | 9.68 |
| Accuracy | Great | Great | Great | Qualified | Great | Qualified |
Table 6: Accuracy of related factors for median values of regional FDI in GM (1, 1)

| Year | x2 | x4 | x5 | x6 | x7 | x8 | x11 | x13 |
|---|---|---|---|---|---|---|---|---|
| 2019 | 11.30 | 3.16 | 15.78 | 4.61 | 4.79 | 12.40 | 3.17 | 4.88 |
| 2020 | 11.42 | 3.20 | 15.92 | 4.62 | 4.94 | 12.48 | 3.20 | 4.91 |
| 2021 | 11.54 | 3.23 | 16.06 | 4.63 | 5.10 | 12.55 | 3.23 | 4.95 |
| 2022 | 11.65 | 3.27 | 16.20 | 4.64 | 5.26 | 12.62 | 3.26 | 4.98 |
| 2023 | 11.77 | 3.30 | 16.34 | 4.66 | 5.43 | 12.69 | 3.30 | 5.02 |
| Accuracy | Great | Great | Great | Barely qualified | Great | Great | Qualified | Great |
Figure 1: Predicted and actual factor values of mean regional FDI
Figure 2: Predicted and actual factor values of median regional FDI
It can be seen from Figs. 1 and 2 that the predicted factor values of regional FDI statistics are close to the actual factor values, which indicates that the predictions are valid. Moreover, Tabs. 5 and 6 show that the selected explanatory variables are predicted with acceptable accuracy, and that different regional FDI statistics have different affecting factors. Considering the computed results and the error analysis, the prediction accuracy of this algorithm is generally satisfactory, and the grey GM (1, 1) model combined with the adaptive Lasso model performs well in short-term single-factor prediction.
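The single-factor GM (1, 1) forecast used for each retained factor (Steps 11–14 of the algorithm) can be sketched in Python as below. This is an illustrative sketch rather than the authors' code: the function name `gm11` and the argument `horizon` are hypothetical, the restoration step is taken to be the difference of consecutive accumulated values, and the sketch assumes the fitted development coefficient `a` is non-zero.

```python
import numpy as np


def gm11(x0, horizon):
    """Fit GM(1,1) to the series x0; return (fitted values, forecasts)."""
    x0 = np.asarray(x0, dtype=float)
    n = len(x0)

    # Step 11: accumulated generating sequence phi^(1).
    x1 = np.cumsum(x0)

    # Step 12: least-squares estimate of (a, b) from phi = B [a, b]^T.
    B = np.column_stack([-0.5 * (x1[1:] + x1[:-1]), np.ones(n - 1)])
    (a, b), *_ = np.linalg.lstsq(B, x0[1:], rcond=None)

    # Step 13: time response of the accumulated series for r = 0, 1, ...
    r = np.arange(n + horizon)
    x1_hat = (x0[0] - b / a) * np.exp(-a * r) + b / a

    # Step 14: restore the original series by differencing.
    x0_hat = np.empty_like(x1_hat)
    x0_hat[0] = x0[0]
    x0_hat[1:] = np.diff(x1_hat)

    return x0_hat[:n], x0_hat[n:]
```

For example, `gm11(series, 5)` applied to the 13 annual values of one factor would return 13 fitted values and 5 forecasts, matching the 2019–2023 horizon of the case study.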
Using the adaptive Lasso grey model algorithm, the statistical data of regional FDI in six provinces of Central China from 2006 to 2023 were predicted. The comparison between the predicted and actual values of regional FDI is shown in Fig. 3.
Figure 3: Predicted and actual values of regional FDI statistics in Central China
Fig. 3 shows that the predicted values from 2006 to 2018 are very close to the actual values, which demonstrates that the adaptive Lasso grey model algorithm is valid for regional FDI statistics. It should be noted that the FDI value could not be reliably forecast when the main factors of FDI statistics changed rapidly; the reasons for this are the focus of our future work.
Conclusions
By optimizing some traditional mathematical statistical methods, this paper proposes an adaptive Lasso grey model to predict regional FDI statistics. Based upon the characteristics of the FDI data of six provinces of Central China, a test was designed to verify the effect of this adaptive Lasso grey model. Meanwhile, the feasibility and validity of the main algorithm for regional FDI statistics are demonstrated. This study also shows that the adaptive Lasso grey model, with its algorithm and procedure, can be extended to the study of regional GDP and income.
Acknowledgement: The authors would like to thank Changsha University of Science and Technology for its equipment support, as well as the support of the Fund Project.
Funding Statement: This work was supported in part by the National Key R&D Program of China (No. 2019YFE0122600), author H. H, https://service.most.gov.cn/; in part by the Project of Centre for Innovation Research in Social Governance of Changsha University of Science and Technology (No. 2017ZXB07), author J. H, https://www.csust.edu.cn/mksxy/yjjd/shzlcxyjzx.htm; in part by the Public Relations Project of Philosophy and Social Science Research Project of the Ministry of Education (No. 17JZD022), author J. L, http://www.moe.gov.cn/; in part by the Key Scientific Research Projects of Hunan Provincial Department of Education (No. 19A015), author J. L, http://jyt.hunan.gov.cn/; and in part by the Hunan 13th five-year Education Planning Project (No. XJK19CGD011), author J. H, http://ghkt.hntky.com/.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
References
[1] K. Kirkham, “The formation of the Eurasian economic union: How successful is the Russian regional hegemony,” vol. 7, no. 2, pp. 111–128, 2016.
[2] T. Ait-Izem, M. Harkat, M. Djeghaba and F. Kratz, “On the application of interval PCA to process monitoring: A robust strategy for sensor FDI with new efficient control statistics,” vol. 63, pp. 29–46, 2018.
[3] J. R. Afonso, E. C. Araújo and B. G. Fajardo, “The role of fiscal and monetary policies in the Brazilian economy: Understanding recent institutional reforms and economic changes,” vol. 62, pp. 41–55, 2016.
[4] X. H. Zhao, L. Niu, C. Clerici, R. Russo, M. Byrd et al., “Data analysis of MS-based clinical lipidomics studies with crossover design: A tutorial mini-review of statistical methods,” vol. 13, pp. 5–17, 2019.
[5] T. Kishi, Y. Matsuda, K. Sakuma, M. Okuya and N. Iwata, “Factors associated with discontinuation in the drug and placebo groups of trials of second-generation antipsychotics for acute schizophrenia: A meta-regression analysis: Discontinuation in antipsychotic trials,” vol. 130, pp. 240–246, 2020.
[6] Y. Y. Zou, G. L. Fan and R. Q. Zhang, “Quantile regression and variable selection for partially linear single-index models with missing censoring indicators,” vol. 204, pp. 80–95, 2020.
[7] Z. C. Chen, Y. Q. Bao, H. Li and B. F. Spencer, “LQD-RKHS-based distribution-to-distribution regression methodology for restoring the probability distributions of missing SHM data,” vol. 121, pp. 655–674, 2019.
[8] A. Katrutsa and V. Strijov, “Comprehensive study of feature selection methods to solve multicollinearity problem according to evaluation criteria,” vol. 76, pp. 1–11, 2017.
[9] C. Chang and R. S. Tsay, “Estimation of covariance matrix via the sparse Cholesky factor with lasso,” vol. 140, no. 12, pp. 3858–3873, 2010.
[10] E. Lindström and J. Höök, “Unbiased adaptive lasso parameter estimation for diffusion processes,” vol. 51, no. 15, pp. 257–262, 2018.
[11] B. Zeng, H. M. Duan, Y. Bai and W. Meng, “Forecasting the output of shale gas in China using an unbiased grey model and weakening buffer operator,” vol. 151, pp. 238–249, 2018.
[12] X. P. Xiao, H. M. Duan and J. H. Wen, “A novel car-following inertia gray model and its application in forecasting short-term traffic flow,” vol. 87, pp. 546–570, 2020.
[13] R. Dash, “An improved shuffled frog leaping algorithm based evolutionary framework for currency exchange rate prediction,” vol. 486, no. 16, pp. 782–796, 2017.
[14] Q. Liu, X. Y. Xiang, J. H. Qin, Y. Tan, J. S. Tan et al., “Coverless steganography based on image retrieval of DenseNet features and DWT sequence mapping,” vol. 192, pp. 105375–105389, 2020.
[15] Y. J. Luo, J. H. Qin, X. Y. Xiang and Y. Tan, “Coverless image steganography based on multi-object recognition,” 2021. https://doi.org/10.1109/TCSVT.2020.3033945.
[16] J. H. Qin, J. Wang, Y. Tan, H. J. Huang, X. Y. Xiang et al., “Coverless image steganography based on generative adversarial network,” vol. 8, no. 1394, pp. 1–11, 2020.
[17] W. T. Ma, J. H. Qin, X. Y. Xiang, Y. Tan and Z. B. He, “Searchable encrypted image retrieval based on multi-feature adaptive late-fusion,” vol. 8, no. 1019, pp. 1–15, 2020.
[18] Z. D. Wang, J. H. Qin, X. Y. Xiang and Y. Tan, “A privacy-preserving and traitor tracking content-based image retrieval scheme in cloud computing,” 2021. https://doi.org/10.1007/s00530-020-00734-w.
[19] J. H. Qin, W. Y. Pan, Y. Tan, X. Y. Xiang and G. M. Hou, “A biological image classification method based on improved CNN,” vol. 58, pp. 1–8, 2020.
[20] Z. Zhou, J. H. Qin, X. Y. Xiang, Y. Tan, Q. Liu et al., “News text topic clustering optimized method based on IF-IDF algorithm on spark,” vol. 62, no. 1, pp. 217–231, 2020.
[21] T. Xu, M. Zhao, X. Yao and K. He, “An adjust duty cycle method for optimized congestion avoidance and reducing delay for wsns,” vol. 65, no. 2, pp. 1605–1624, 2020.
[22] R. Tibshirani, “Regression shrinkage and selection via the lasso,” vol. 15, no. 1, pp. 267–288, 1996.
[23] B. Efron, T. Hastie, I. Johnstone and R. Tibshirani. New York: Springer, 1999.
[24] H. Zou, “The adaptive lasso and its oracle properties,” vol. 101, no. 476, pp. 1418–1429, 2006.
[25] W. P. Wang and J. L. Deng, “Study on chaotic characteristics of GM (1,1) model in grey system,” vol. 6, no. 2, pp. 13–16, 1997.
[26] J. Z. Chen, “GM (1,1) model and curve AeTx fitting,” vol. 8, no. 4, pp. 67–71, 1988.
[27] P. R. Ji, X. Y. Hu and D. Q. Xiong, “Analysis and evaluation of grey prediction model,” vol. 17, no. 2, pp. 42–44, 1999.
[28] Y. Mu, “Direct modeling method of unbiased grey GM (1,1) model,” vol. 25, no. 9, pp. 1094–1107, 2003.
[29] Z. X. Wang, Y. G. Dang and S. F. Liu, “Analysis of chaotic characteristics of unbiased GM (1,1) model,” vol. 11, pp. 153–158, 2007.
[30] Y. T. Chen, L. W. Liu, V. Phonevilay, K. Gu, R. L. Xia et al., “Image super-resolution reconstruction based on feature map attention mechanism,” 2021. https://doi.org/10.1007/s10489-020-02116-1.
[31] Y. T. Chen, L. W. Liu, J. J. Tao, X. Chen, R. L. Xia et al., “The image annotation algorithm using convolutional features from intermediate layer of deep learning,” vol. 80, no. 3, pp. 4237–4261, 2020.
[32] X. W. Liang, Y. L. Luo and D. Y. Peng, “Comprehensive evaluation of industrial carrying capacity in Central China,” vol. 2020, no. 7, pp. 91–96, 2020.