Improved Prediction and Understanding of Glass-Forming Ability Based on Random Forest Algorithm

Abstract: Bulk metallic glass (MG) has a wide range of applications as a structural, functional and biomedical material because of its unique properties. However, the glass-forming ability (GFA) remains difficult to predict even with the available theoretical criteria, and this problem greatly limits the industrial application of bulk MG. In this work, the proposed model uses random forest classification, a machine learning method, to predict the GFA of binary metallic alloys. Compared with the previous SVM models built on all feature combinations, the new model is built on a new combination of features and obtains better prediction results. It also shows the degree to which each feature parameter influences GFA. Finally, a normalized indicator for evaluating the performance of machine learning models on binary alloys is put forward for the first time. The results show that the application of machine learning to MGs is valuable.


Introduction
Bulk metallic glasses (MGs) have attracted much attention for their unique mechanical and functional properties since they were first discovered more than 50 years ago [1][2][3]. Up to now, one of the biggest problems blocking the development and application of MGs is still glass-forming ability (GFA) [4]. The GFA of an alloy is usually defined by its critical cooling rate: above this rate, the liquid undergoes a glass transition into the glassy state without crystal formation. In theory, even monatomic metals can be vitrified from the liquid at a sufficiently high cooling rate [5]. However, for industrial applications, most alloys cannot form MGs at cooling rates of 10^1 to 10^6 K/s, which severely limits their application [4]. To design and develop bulk MGs with good GFA (the critical cooling rate should be lower than 10^2 K/s for bulk MGs), many efforts have been made, and various empirical criteria [6][7][8][9][10] and parameters [11][12][13] for predicting the GFA of metal alloy systems have been proposed. At present, three basic empirical rules have been formed: multicomponent alloys containing three or more elements, significant atomic size difference, and large negative heat of mixing among the major components [14][15][16][17][18]. However, such empirical rules cannot systematically and quantitatively explain the GFA of alloys. At the same time, some parameters related to the glass transition have been proposed to measure the GFA of alloy materials, such as the reduced glass transition temperature Trg [11], the semi-empirical glass-forming tendency formula Kgl [12], the bulk Fe-Nd-P alloy system parameter Tx/(Tg+Tl) [13], etc. However, these parameters have not been integrated into a complete theory that describes how they work together. Recently, some studies have linked GFA with geometric packing [9,14,15], mixing enthalpy [16], correlation radius [17], etc., hoping to improve the prediction of GFA.
Although the above empirical criteria and parameters have helped the development of MGs, the relevant research generally analyzes the influence of only a limited number of variables (data) on GFA, or focuses only on the influence of certain alloy compositions on GFA. It is therefore difficult to give standard and truly effective guidance for developing new MGs by predicting alloy composition, so many MGs are developed more or less through repeated experiments. Recent work [18][19][20] not only shows that the number of possible crystalline phases is an efficient indicator of the GFA of an alloy, but also reveals the possibility of establishing a model that considers different parameters at the same time, and further shows that an ideal model for predicting the formation of MGs must analyze all available variables.
To establish a model for predicting GFA is to find the correlation between GFA and other related parameters. Given the large amount of experimental data on MGs, finding the relationship between the relevant parameters and GFA from the data is challenging from the perspective of physics and materials science. From the perspective of computer science, however, the problem becomes one of modeling and processing large data sets. After decades of development, machine learning has become a promising data processing method [21]. Machine learning can predict unknown data after being trained on existing knowledge. Material design involving machine learning has been successfully applied in various research fields [22][23][24][25][26][27]. In particular, Norquist's work [23] reveals valuable but long-neglected information hidden in failed experiments. Recently, some progress has been made in GFA prediction for binary alloys; for example, Sun et al. [24] used machine learning to predict GFA. Their research shows that machine learning can fundamentally speed up research on GFA prediction, greatly helps in exploring the formation law of MGs, and provides a new way to address the challenging and elusive GFA problem.
In this work, we construct a new machine learning model based on existing binary alloy data and use the random forest classification method to predict the GFA of binary alloys. The model shows that it is feasible to select alloys with good GFA from all possible alloy compositions. In the experiment, we found a new combination of input descriptors that gives the machine learning model higher prediction performance. Through the random forest algorithm, we also obtain the importance to GFA of each input descriptor (also known as a "feature" in machine learning). A standardized evaluation indicator for the model and the corresponding experimental results are given. We hope that our work will be helpful to related work on the GFA of binary alloys.

Machine Learning Algorithm and Its Application in Binary Alloy
Binary alloys are relatively simple and important alloy materials, with extensive theoretical and experimental data in MGs and very high application value [25]. Progress in binary alloy materials can therefore bring broad practical benefits. However, it is difficult to obtain good prediction and synthesis results both theoretically and experimentally. Because of this difficulty, Sun et al. [24] trained an SVM (support vector machine, a machine learning algorithm) on an existing binary alloy data set to obtain a machine learning model and predict binary alloys with unknown GFA. Their work shows that machine learning is useful for binary alloys.
Machine learning has recently become a popular computing technology. It is dedicated to improving the performance of a system through computation and experience. In computer science, experience usually exists in the form of data, so the main research content of machine learning is the algorithms that generate a model from data on a computer. Unknown data can then be fed into the trained model to be predicted and classified. The experimental flowchart for binary alloy learning is shown in Fig. 1. The ellipse on the left represents known data, which contain the vector of input descriptor (feature) values and the ability to form MG (expressed by 1 or 0). In this paper, the feature selection follows the "All" features mentioned in research [24] (aw1 and aw2 are the atomic weights, ΔH is the mixing enthalpy, r1 and r2 are the atomic radii, Tliq1 and Tliq2 are the liquidus temperatures, Tfic = Tliq1 × C1 + Tliq2 × C2, and ΔTliq = (Tfic − Tliq)/Tfic). In addition, C1 and C2 are added to represent the content of each element. The box in the middle represents the machine learning model, which is obtained by setting the algorithm parameters and training with the known data on the left. The dotted line represents the reuse of the trained model: unknown data (ellipse at the top right of Fig. 1) are input to the trained model, which outputs the prediction result (ellipse at the bottom right of Fig. 1). The output is 0 or 1, indicating whether the corresponding unknown data can form MG. Note that the unknown data at the top right contain only the feature values and no 0 or 1 label. In this way, the GFA of binary alloys can be predicted by machine learning, which can effectively improve the success rate and efficiency of metallic glass experiments, so machine learning has very broad application prospects for MG.
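As a minimal sketch of the two derived descriptors defined above, the following Python function evaluates Tfic and ΔTliq for one composition. The function name, the argument names, and the numerical values are illustrative assumptions and are not data from the paper; C1 is assumed to be given as a fraction between 0 and 1, with C2 = 1 − C1.

```python
def derived_features(c1, tliq1, tliq2, tliq):
    """Return (Tfic, dTliq) for a binary alloy.

    c1    : fraction of the first element (assumed to be in [0, 1])
    tliq1 : liquidus temperature of element 1
    tliq2 : liquidus temperature of element 2
    tliq  : liquidus temperature of the alloy at this composition
    """
    c2 = 1.0 - c1                        # binary alloy: the two fractions sum to 1
    tfic = tliq1 * c1 + tliq2 * c2       # Tfic = Tliq1 x C1 + Tliq2 x C2
    dtliq = (tfic - tliq) / tfic         # dTliq = (Tfic - Tliq) / Tfic
    return tfic, dtliq


# Made-up temperatures, chosen only to keep the arithmetic easy to follow.
tfic, dtliq = derived_features(c1=0.5, tliq1=1000.0, tliq2=2000.0, tliq=1200.0)
print(tfic, dtliq)
```

A positive ΔTliq in this sketch means that the alloy's liquidus temperature lies below the composition-weighted average of the two elements.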

Random Forest Algorithm and Environmental Information
Sun et al. [24] innovatively integrated GFA and machine learning, using SVM (a machine learning algorithm) and data sets of known binary alloy GFA to advance the GFA prediction problem. However, the research and experiments in [24] leave the following points open for further discussion. First, some machine learning algorithms can reveal the difference in importance between features, but SVM cannot (because of how the algorithm itself is modeled). Second, different feature combinations give different results in experiments, so some feature combinations may achieve better prediction results. Third, there is still room for improvement in the indicator used to evaluate machine learning models on binary alloy data. These three points motivate this research. To further reveal the importance differences between features, select more appropriate features, and discard unnecessary ones, the random forest algorithm is adopted. Random forest (RF) is an ensemble method from statistical learning theory. It uses bootstrap resampling to draw multiple samples from the original samples, builds a decision tree model for each bootstrap sample, and then combines the predictions of the decision trees into the final prediction by majority voting [26][27][28][29][30]. A large number of theoretical and empirical studies have shown that RF has high prediction accuracy, tolerates outliers and noise well, and is not prone to overfitting. RF is a naturally nonlinear modeling tool and one of the most popular algorithms in current data mining and materials science research. In addition, because the random forest model is built from decision tree models, the algorithm library exposes a "feature importance" attribute, which can explain the importance of features. Therefore, we choose the RF algorithm to train on and predict the data, expecting it to reveal the most important features in the GFA problem of binary alloys.
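The bootstrap-and-vote idea described above can be sketched in a few lines of plain Python. This is a deliberately simplified illustration and not the implementation used in this work: each "tree" is a single-split decision stump, no random feature subsets are drawn, and the toy data set is synthetic. All function names are hypothetical.

```python
import random
from collections import Counter


def fit_stump(samples):
    """Fit a one-split tree: pick the feature, threshold and polarity with the fewest errors."""
    best = None
    for f in range(len(samples[0][0])):
        for x, _ in samples:
            for above in (0, 1):  # label assigned when the feature value exceeds the threshold
                errors = sum((above if xi[f] > x[f] else 1 - above) != yi
                             for xi, yi in samples)
                if best is None or errors < best[0]:
                    best = (errors, f, x[f], above)
    _, f, thr, above = best
    return lambda point: above if point[f] > thr else 1 - above


def fit_forest(samples, n_trees, rng):
    """Train one stump per bootstrap sample (drawn with replacement from the data)."""
    forest = []
    for _ in range(n_trees):
        boot = [rng.choice(samples) for _ in samples]
        forest.append(fit_stump(boot))
    return forest


def predict(forest, point):
    """Combine the individual predictions by majority vote."""
    votes = Counter(tree(point) for tree in forest)
    return votes.most_common(1)[0][0]


# Synthetic toy data: the label depends only on the first feature.
samples = [((i / 20, (7 * i % 20) / 20), int(i > 10)) for i in range(21)]
forest = fit_forest(samples, n_trees=25, rng=random.Random(0))
high = predict(forest, (0.9, 0.1))
low = predict(forest, (0.1, 0.9))
print(high, low)
```

Using an odd number of trees avoids ties in the two-class vote. A full random forest would grow deeper trees and sample a random subset of features at each split.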
This experiment uses the random forest implementation from the Python machine learning library scikit-learn (sklearn), with Python 3.7 and sklearn 0.21.3. Three main parameters of the random forest can be adjusted to obtain different models: the number of trees in the forest, "n_estimators"; the criterion for branching, "criterion"; and whether to adopt the out-of-bag (OOB) validation strategy, "oob_score". After the random forest is trained, the target data set and the test set are used for prediction.
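A minimal sketch of how these three parameters are passed to scikit-learn's `RandomForestClassifier` is given below. The parameter values are the ones this paper reports as best in the grid search; the data are synthetic random numbers standing in for the alloy features, and `random_state` is added only to make the sketch reproducible. Neither is taken from the original experiment.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data: 200 samples with 10 features and 0/1 labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

clf = RandomForestClassifier(
    n_estimators=150,    # number of trees in the forest
    criterion="gini",    # impurity measure used to choose each split
    oob_score=True,      # score the forest on the samples left out of each bootstrap
    random_state=0,      # assumption: fixed seed, for reproducibility only
)
clf.fit(X, y)

print(clf.oob_score_)      # out-of-bag accuracy estimate
pred = clf.predict(X[:5])  # each prediction is 0 or 1
print(pred)
```

The out-of-bag score gives an internal accuracy estimate without holding out part of the training data, which matters when the training set is small.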

Data
Research on the GFA of alloys has been going on for decades, and many MGs have been synthesized experimentally. To establish a proper random forest database, information on both good and bad glass formers is needed, so that the machine can learn the difference and separate them. Unfortunately, the bad ones are not always reported. From published papers [31], we collected binary alloys with a known compositional range that does or does not form MG. For each binary alloy, we collected 91 data points with composition ranging from 5% to 95% (in increments of 1%). These data are used as the training data set.
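The composition grid described above can be sketched as follows. The grid itself (5% to 95% in 1% steps, 91 points) follows the text. The labelling rule and the glass-forming range of 30% to 60% are hypothetical and are included only to show how 0/1 labels might be attached to each point.

```python
def composition_grid(start=5, stop=95, step=1):
    """Return (C1, C2) pairs in percent, with C1 + C2 = 100."""
    return [(c1, 100 - c1) for c1 in range(start, stop + 1, step)]


def label_by_range(grid, glass_range):
    """Label a point 1 if C1 lies inside the given glass-forming range, else 0."""
    low, high = glass_range
    return [(c1, c2, int(low <= c1 <= high)) for c1, c2 in grid]


grid = composition_grid()
rows = label_by_range(grid, glass_range=(30, 60))  # hypothetical range, not real alloy data

print(len(grid))                            # 91
print(sum(label for _, _, label in rows))   # 31
```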
In a general machine learning process, a subset is separated from these data and used as an independent test data set. However, the training data set here is not large enough for such a process. Any reduction of the training data set is likely to change the performance of the model to a large extent; therefore, we must find another way to evaluate the model. In this work, we used a test data set consisting of two groups of data. The "Target" group is a group of 339 binary alloys that are reported to form MG by the melt-spinning technique. These data were all collected by Laws [9]. The other group is named "All" because it consists of all possible binary pairs (1131 pairs with atomic numbers less than 82) with available input data. In this way, the prediction efficiency of a model is evaluated by its ability to separate the data of the groups "Target" and "All".

Model Performance Indicator
Eold is an indicator proposed in research [24] to evaluate the performance of a machine learning model on binary alloy data sets. Group Target represents the data known to have good GFA, and group All represents all possible binary alloy data. After group All and group Target are tested, PAll and PTar are obtained, i.e., the ratio of "yes" predictions in each group; they are defined as in Eq. (1). For an ideal model, the higher the ratio of "yes" in group Target and the lower the ratio of "yes" in group All, the better. This matches actual practice, because binary alloys with good GFA account for only a very small proportion of all binary alloys. The numerator is squared to keep the Eold value from becoming too large.
According to the conclusion of study [24], the best model should satisfy the following two requirements: (1) PTar > 0.3; (2) the largest Eold.
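Eq. (1) is not reproduced in this text. The sketch below therefore assumes the form Eold = PTar^2 / PAll, inferred from the description of a squared numerator, the preference for a high PTar and a low PAll, and the stated value range of [0, +∞). The function names and the three candidate models are hypothetical, and the numbers are invented for illustration only.

```python
def e_old(p_tar, p_all):
    """Assumed form of the old indicator: PTar squared divided by PAll."""
    if p_all <= 0:
        raise ValueError("PAll must be positive for the ratio to be defined")
    return p_tar ** 2 / p_all


def pick_best(results, min_p_tar=0.3):
    """Apply the two requirements: keep models with PTar > 0.3, then take the largest Eold."""
    eligible = {name: e_old(p_tar, p_all)
                for name, (p_tar, p_all) in results.items()
                if p_tar > min_p_tar}
    return max(eligible, key=eligible.get)


# Invented (PTar, PAll) pairs for three hypothetical models.
results = {
    "A": (0.5, 0.05),
    "B": (0.2, 0.005),  # large ratio, but fails the PTar > 0.3 requirement
    "C": (0.4, 0.04),
}
best = pick_best(results)
print(best)
```

Model B illustrates why the PTar threshold is needed under this assumed form: a very small PAll inflates the ratio even when few known glass formers are recovered.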

Feature Importance
To explore the impact of each feature on GFA, we add one more feature, the composition (C1, the proportion of the first element), to the 9 input features (All, mentioned in Part 2). This gives 10 features (All + C1) for the experiment, on which the machine learning model is trained. As shown in Fig. 2, the proportion of importance of each feature is obtained from the feature importance attribute of the classifier trained by the random forest algorithm in the sklearn library. The combined importance of the four features C1, ΔTliq, Tfic, and ΔH accounts for more than 70%, and C1 is the most important feature, accounting for 23.3%.
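The following sketch shows how such importance proportions can be read from a trained scikit-learn random forest through its `feature_importances_` attribute. The data are synthetic, and the labels are constructed to depend only on the first column so that the ranking is predictable. The output therefore says nothing about real alloys and does not reproduce Fig. 2. The ASCII feature names (`dH`, `dTliq`) are stand-ins for ΔH and ΔTliq.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

feature_names = ["C1", "aw1", "aw2", "dH", "r1", "r2",
                 "Tliq1", "Tliq2", "Tfic", "dTliq"]

# Synthetic stand-in data; the label is driven by the first column by construction.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, len(feature_names)))
y = (X[:, 0] > 0).astype(int)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# feature_importances_ holds one non-negative value per feature; the values sum to 1.
importances = dict(zip(feature_names, clf.feature_importances_))
for name, value in sorted(importances.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.3f}")
```

Because the values sum to 1, they can be shown directly as shares of a pie chart.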

Improvement Compared with the Previous Research
From the pie chart of feature importance in Fig. 2, it can be seen that the composition feature (i.e., C1) not only cannot be ignored in binary alloy GFA but is also one of the most important features in judging it. However, study [24] reports that "poor performance is obtained if we add C1 and C2 into the data sets" and explains that "our interpretation is that although the content of each element is, of course, very important in designing MGs it might not directly correlate with GFA". What is the cause of this contradiction?
In study [24], the All features combination (mentioned in Part 2) and the All + C1 + C2 features combination were tested directly, but the All + C1 features combination was not. The experimental results in reference [24] show that the All + C1 + C2 features combination is even worse than the All features combination. This result is not consistent with our empirical intuition. However, in the binary alloy data set, C1 is the proportion of the first element and C2 is the proportion of the second element, and because a binary alloy contains only two elements, C1 + C2 = 100%. From the perspective of machine learning, the C1 and C2 features are therefore strongly correlated: one is fully determined by the other. For a machine learning task, such features are redundant and can reduce accuracy. This is a possible reason why the model produced by the All + C1 + C2 features combination is worse than that of the All features combination. Fig. 3 shows the results of training and testing different features combinations of binary alloy data with the RF algorithm. The three features combinations All, All + C1, and All + C1 + C2 perform differently. All + C1 performs best (because it contains C1, which Section 3.1 shows to be a very important feature), All + C1 + C2 performs worst (because it contains both C1 and C2, and redundant features seriously affect the performance of machine learning), and All lies between the two (it contains no redundant features, but it excludes the most important feature, C1). According to this experiment, it is not appropriate to use only the All features combination and ignore the influence of the C1 feature.
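The redundancy between C1 and C2 can be checked numerically. The sketch below computes the Pearson correlation coefficient between the two features over the 5% to 95% composition grid used for the training data. The helper function is written out by hand for illustration and is not part of the original experiment.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


c1 = list(range(5, 96))        # C1 in percent, 5% to 95%
c2 = [100 - c for c in c1]     # binary alloy: C2 = 100% - C1

r = pearson(c1, c2)
print(r)  # perfectly anti-correlated, so r is -1 up to rounding
```

A coefficient of −1 means that C2 carries no information beyond C1, which is consistent with treating it as a redundant feature.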

Experimental Results under New Features Combination
According to the pie chart of feature importance obtained in Section 3.1, we selected feature combinations with high importance for the experiment. With the combination All + C1 − Tliq1 − Tliq2, the experimental results are better than those of all previous feature combinations and models. The experimental environment adopts the same parameter settings as mentioned in Section 3.1. Through grid search, Eold is found to reach its maximum value of 5.6 (the average of ten experiments) with the parameters "n_estimators = 150, oob_score = True, criterion = 'gini'". Fig. 4 compares the results of this experiment with those of study [24]. The red bar, experiment number 15, is the result of our experiment, which uses the All + C1 − Tliq1 − Tliq2 features combination; experiment numbers 1-14 are the results of study [24]. The Y-axis is the Eold indicator of each machine learning model after training and testing. These eight features (C1, aw1, aw2, ΔH, r1, r2, Tfic, ΔTliq) achieve the best performance, with an Eold of 5.6. This shows that a model built with a feature selection such as All + C1 − Tliq1 − Tliq2 can predict the GFA of binary alloys more accurately.

New Indicator Enew
During the experiment, we found that the old model performance indicator Eold cannot fully reflect the performance of the model. Eold is sufficient if two or more machine learning models are only to be compared with each other. However, we also want to understand the gap between our machine learning model and the ideal model more comprehensively and intuitively. Because Eold only compares the quality of models horizontally, we set up a new, standardized indicator, Enew, which can both make a horizontal comparison and describe the distance from the ideal model. As with the original indicator, the best model should meet two requirements: (1) PTar > 0.3; (2) the smallest Enew. The best random forest model obtains an Enew of 0.275 with the All + C1 − Tliq1 − Tliq2 features combination and 0.2931 with the All + C1 features combination. It can be seen that although the model achieves a certain prediction effect under Enew, machine learning still has a lot of room for improvement in the prediction of binary alloy GFA.
Comparison: To further show and compare the advantages and disadvantages of the Enew and Eold indicators, we drew Fig. 5(a) and Fig. 5(b). Under Eold, the goal is the highest value on the vertical axis, that is, the yellow part of Fig. 5(a), and it can be seen that Eold is not standardized. Fig. 5(b) shows the behavior of Enew for different PTar and PAll. Under Enew, the goal is the lowest value on the vertical axis, that is, the purple part of Fig. 5(b), where Enew tends to 0 and the machine learning model is closest to ideal. Enew achieves the goal of standardization and keeps the value between 0 and 1. There are other differences between Enew and Eold, which are listed in Tab. 1.
Tab. 1: Differences between the Enew and Eold indicators
Enew: Standardized; the value range is (0, 1), so the gap to the ideal model is easy to see. Eold: The value range is [0, +∞), so it can only compare the performance of two models horizontally and cannot describe how far a model is from the ideal one.
Enew: For a wrong situation such as PAll = 1 and PTar = 1, it does not give the false impression of an excellent indicator value. Eold: During learning and classification, PAll easily tends to be infinitesimal and Eold to its maximum, so the resulting Eold value is easy to misjudge.
Enew: Inherited from the Eold indicator, it pursues the highest PTar and the smallest PAll. Eold: In a disguised way, PAll is encouraged to approach zero.

Conclusion
In this paper, a new model using the random forest classification algorithm based on existing binary alloy data is proposed and applied to predict the GFA of binary alloys. In the experiment, the 8-feature combination All + C1 − Tliq1 − Tliq2 [24] gives the highest Eold value, 5.6. The comparison among different features further shows that C1 (composition) is the most important feature for binary alloy GFA, and that the total importance of C1, ΔTliq, Tfic, and ΔH is more than 70%. Finally, we propose a standardized model performance evaluation indicator, Enew. The optimal Enew value of 0.275 is obtained by experiment, and the advantages of this indicator are demonstrated; it can be used to describe objectively how good a binary alloy machine learning model is. We believe that the application of machine learning to MGs can greatly promote the development of this subject in the future.