TY  - EJOU
AU  - Yu, Ziyan
AU  - Zhang, Cong
AU  - Xiong, Naixue
AU  - Chen, Fang
TI  - A New Random Forest Applied to Heavy Metal Risk Assessment
T2  - Computer Systems Science and Engineering
PY  - 2022
VL  - 40
IS  - 1
SN  - 
AB  - As soil heavy metal pollution increases year by year, the risk assessment of soil heavy metal pollution is gradually gaining attention. Soil heavy metal datasets are usually imbalanced: most of the samples are safe samples that are not contaminated with heavy metals. Random Forest (RF) has strong generalization ability and is not prone to overfitting. In this paper, we improve the Bagging algorithm and the simple voting method of RF. A W-RF algorithm based on adaptive Bagging and weighted voting is proposed to improve the classification performance of RF on imbalanced datasets. Adaptive Bagging enables the trees in RF to learn information from the positive samples, and the weighted voting method gives trees with superior performance higher voting weights. Experiments were conducted using G-mean, recall and F1-score to set the weights, and the results obtained were better than those of RF. Risk assessment experiments were conducted using W-RF on a heavy metal dataset from agricultural fields around Wuhan. The experimental results show that the RW-RF algorithm, which uses recall to calculate the classifier weights, has the best classification performance. Finally, we optimize the hyperparameters of the RW-RF algorithm with a Bayesian optimization algorithm, using G-mean as the objective function to obtain the optimal hyperparameter combination within the number of iterations.
KW  - Random forest; imbalanced data; Bayesian optimization; risk assessment
DO  - 10.32604/csse.2022.018301