Computer Modeling in Engineering & Sciences
Machine Learning Enhanced Boundary Element Method: Prediction of Gaussian Quadrature Points
1College of Architecture and Civil Engineering, Xinyang Normal University, Xinyang, 464000, China
2College of Intelligent Construction, Wuchang University of Technology, Wuhan, 430223, China
3School of Architectural Engineering, Huanghuai University, Zhumadian, 463000, China
*Corresponding Author: Leilei Chen. Email: firstname.lastname@example.org
Received: 30 July 2021; Accepted: 20 October 2021
Abstract: This paper applies a machine learning technique to find a general and efficient numerical integration scheme for boundary element methods. A model based on the neural network multi-classification algorithm is constructed to find the minimum number of Gaussian quadrature points satisfying the given accuracy. The constructed model is trained by using a large amount of data calculated in the traditional boundary element method and the optimal network architecture is selected. The two-dimensional potential problem of a circular structure is tested and analyzed based on the determined model, and the accuracy of the model is about 90%. Finally, by incorporating the predicted Gaussian quadrature points into the boundary element analysis, we find that the numerical solution and the analytical solution are in good agreement, which verifies the robustness of the proposed method.
Keywords: machine learning; boundary element method; Gaussian quadrature points; classification problems
1 Introduction
The methods for solving partial differential equations (PDEs) are usually classified as analytical and numerical methods. Encouraged by earlier studies, some innovative analytical methods have been proposed. Subject to a set of constraint conditions, the generalized auxiliary equation method has been employed to obtain many new exact solutions of hyperbolic trigonometric, trigonometric, exponential, and rational type. The generalized Kudryashov method has proven to be reliable, efficient, and realistic, and is well suited for extracting accurate solitary wave solutions of the Boussinesq model. However, only a few PDEs can be solved analytically. Therefore, numerical methods are mostly used in practical applications. The boundary element method (BEM) is a numerical method for solving PDEs that can be reformulated as boundary integral equations (BIEs). In contrast to volumetric meshing methods like the finite element method (FEM), the BEM only requires meshing the boundaries of the domain to form the governing equations. After the unknowns on the boundary are obtained, the quantities inside the domain can be evaluated straightforwardly using explicit functions in a post-processing step. The mesh reduction property of the BEM not only decreases the dimension of the problem but, more importantly, greatly facilitates geometric model preparation and alleviates the meshing burden. Thus, the BEM is popular in fracture mechanics and shape optimization, where the geometry and meshes need to be updated repeatedly. Moreover, for unbounded-domain problems, which are commonly encountered in acoustics and magnetics, the BEM satisfies the boundary conditions at infinity automatically, so no domain truncation is needed. In recent years, the BEM has attracted more attention with the development of isogeometric analysis (IGA) [3,4].
IGA intends to bridge Computer-Aided Design (CAD) and Computer-Aided Engineering (CAE) by employing the basis functions used to construct CAD models to discretize the PDEs in numerical simulations. Because both the BEM and CAD are based on boundary representations [5–7], they are naturally compatible with each other. IGA in the context of the boundary element method (IGABEM) has been successfully applied to a wide range of areas, including potential problems, linear elasticity [9,10], fracture mechanics [11–14], structural optimization [15–18], acoustics [19–25], and heat conduction [26,27].
Despite the aforementioned salient features, the BEM is not without shortcomings. Firstly, the BEM is not well suited for non-linear problems because it lacks a fundamental solution, which is essential for transforming PDEs into BIEs. Hence, the BEM is not as generic as the FEM, but it still plays an important role in concept design. Secondly, the coefficient matrix of the BEM is non-symmetric and fully populated, so the computational time increases rapidly with the number of degrees of freedom. This process can be accelerated by algorithms such as the Fast Multipole Method (FMM), Adaptive Cross Approximation (ACA), and the fast Fourier transform. Thirdly, the integrands in the BEM contain the fundamental solutions with singularities, which cannot be integrated exactly with Gauss–Legendre quadrature. The accuracy of Gaussian integration improves as the number of Gaussian quadrature points increases, but so does the computational cost. Hence, it is of great significance to develop a robust, accurate, and efficient numerical integration scheme for the BEM.
With the rapid development of digital technology and computers, machine learning has become a popular topic in computer science. As a subset of machine learning, the artificial neural network (ANN) algorithm, based on the synaptic neuron model, has achieved enormous success in recent years for its ability to quickly find the mapping relationships hidden in complex data [29–31]. Machine learning techniques have also been incorporated into computational mechanics. For example, machine-learning-enhanced FEM has been applied to numerical quadrature schemes, the estimation of stress distributions, the construction of smart elements, data-driven computing paradigms, and structural optimization. However, the machine-learning-enhanced BEM has been scarcely studied. To fill this research gap, the present paper uses machine learning techniques to enhance the performance of the BEM.
In this paper, a classification algorithm based on the ANN is used to accelerate the Gaussian quadrature of the BEM. The minimum number of Gaussian quadrature points under the premise of a given accuracy is predicted by using the ANN. The remainder of this paper is organized as follows. Section 2 outlines the classification algorithm based on the ANN. Section 3 introduces the related theories of BEM for two-dimensional potential problems. Section 4 details the model construction process. Section 5 provides some numerical examples to verify the proposed algorithm, followed by the conclusion in Section 6.
2 Classification Algorithm Based on Artificial Neural Network
2.1 Multi-Classification Problem
The softmax regression algorithm is mainly used for multi-classification problems. It is an empirical loss minimization algorithm that is based on the softmax model and uses the multi-class cross-entropy as its objective function. Given a set of training data

$S = \{(x^{(i)}, y^{(i)})\}_{i=1}^{n},$  (1)
where $x^{(i)}$ represents the features of a piece of training data and $y^{(i)}$ is the corresponding label. For a k-class classification problem, the label $y^{(i)}$ is a k-dimensional one-hot vector: the element at the position of the true category is 1, and all the other elements are 0. Here, we define a parameter matrix $W$:

$W = [w_1, w_2, \ldots, w_k].$  (2)
Each $w_j$ is a column vector, and the output of the model is

$h_W(x) = \mathrm{softmax}(W^{\mathrm{T}} x).$  (3)
Once a feature vector $x^{(i)}$ is fed in, the corresponding output $h_W(x^{(i)})$ is obtained through the softmax function. Here, $h_W(x^{(i)})$ is a k-dimensional vector. The j-th component of $h_W(x^{(i)})$ is $h_W(x^{(i)})_j$, which represents the probability that the model assigns $x^{(i)}$ to the j-th category, i.e.,

$h_W(x^{(i)})_j = \dfrac{\exp(w_j^{\mathrm{T}} x^{(i)})}{\sum_{l=1}^{k} \exp(w_l^{\mathrm{T}} x^{(i)})},$  (4)
where the values of $h_W(x^{(i)})_j$ fall in the interval [0, 1], and $\sum_{j=1}^{k} h_W(x^{(i)})_j = 1$.
By comparing $h_W(x^{(i)})$ and $y^{(i)}$, the multi-class cross-entropy loss function is defined as the objective function, which measures the distance between the two probability distributions: the smaller the cross-entropy, the closer the two distributions. A specific expression of the cross-entropy loss function is as follows:

$L(W) = -\dfrac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{k} y_j^{(i)} \ln h_W(x^{(i)})_j.$  (5)
The cross-entropy loss function in Eq. (5) can serve as the basis of the optimization algorithm, but it cannot measure the quality of a single classification result. The measurement index of the classification algorithm used in this paper is the Accuracy. For the multi-classification problem, the Accuracy of the model $h_W$ on the data $S$ in Eq. (1) is defined as

$\mathrm{Accuracy}(h_W, S) = \dfrac{1}{n} \sum_{i=1}^{n} \mathrm{cast}\big(\arg\max_j h_W(x^{(i)})_j = \arg\max_j y_j^{(i)}\big),$  (6)
where cast is defined as a logical function that maps logical true to the integer 1 and logical false to the integer 0. In Eq. (6), if the model $h_W$ predicts the i-th data point accurately, the corresponding cast term equals 1; otherwise it equals 0. Therefore, the Accuracy can also be described as the proportion of predicted values that match the true values in the data $S$.
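As a concrete illustration, the softmax output, the cross-entropy loss of Eq. (5), and the Accuracy of Eq. (6) can be sketched in a few lines of Python. This is a minimal sketch; the function names are ours, not taken from the paper's code.

```python
import numpy as np

def softmax(z):
    # Shift by the row maximum for numerical stability before exponentiating.
    e = np.exp(z - np.max(z, axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, labels_onehot):
    # Multi-class cross-entropy averaged over the batch, as in Eq. (5).
    return -np.mean(np.sum(labels_onehot * np.log(probs + 1e-12), axis=1))

def accuracy(probs, labels_onehot):
    # Fraction of samples whose arg-max prediction matches the true class (Eq. (6)).
    return float(np.mean(np.argmax(probs, axis=1) == np.argmax(labels_onehot, axis=1)))
```

The small constant added inside the logarithm guards against log(0) for hard (one-hot) probability vectors.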
2.2 Optimization Algorithm
Many problems in machine learning can be transformed into optimization problems, which are solved through iteration. The most basic optimization algorithm is the gradient descent algorithm. Its core idea is that each iteration moves in the direction opposite to the gradient of the objective function, finally reaching a global or local optimum. In this paper, the stochastic gradient descent method is used. In contrast to the batch gradient descent algorithm, the stochastic gradient descent algorithm selects only one sample from all the training data to estimate the gradient and update the weights, which greatly reduces the time complexity of the algorithm.
In the softmax regression algorithm for the multi-classification problem, the optimal parameters of the prediction model can be obtained by minimizing the objective function in Eq. (5). Since the objective function L is convex, the gradient of the objective function with respect to $w_j$ is calculated as

$\nabla_{w_j} L(W) = \dfrac{1}{n} \sum_{i=1}^{n} \big( h_W(x^{(i)})_j - y_j^{(i)} \big)\, x^{(i)}.$  (7)
Then, the parameters can be adjusted continuously according to the gradient in Eq. (7) until the loss function converges.
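A single-sample update of this kind can be sketched as follows. This is a minimal illustration: it uses the per-sample gradient of the softmax cross-entropy (the single-sample form of the gradient in Eq. (7)), and the names are ours.

```python
import numpy as np

def sgd_step(W, x, y_onehot, lr=0.1):
    """One stochastic gradient descent step for softmax regression.

    W        : (d, k) parameter matrix, updated in place
    x        : (d,) feature vector of a single sample
    y_onehot : (k,) one-hot label
    """
    z = x @ W
    p = np.exp(z - z.max())
    p /= p.sum()                          # softmax probabilities
    W -= lr * np.outer(x, p - y_onehot)   # per-sample cross-entropy gradient
    return W
```

Repeating this step over randomly chosen samples is exactly the stochastic gradient descent procedure described above.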
2.3 Neural Network Algorithm
Fig. 1 shows a neural network with R layers, containing n inputs and r outputs. Let the l-th layer contain $m_l$ neurons for $l = 1, 2, \ldots, R$; then $m_1 = n$ and $m_R = r$.
Each layer of the neural network is represented by two sets of parameters. The two sets of parameters for the l-th layer are:
• An $m_{l-1} \times m_l$-dimensional weighting matrix $W^{(l)}$, where each element $w_{ij}^{(l)}$ represents the weight between the i-th neuron in the (l − 1)-th layer and the j-th neuron in the l-th layer;
• An $m_l$-dimensional bias vector $b^{(l)}$, where each component $b_i^{(l)}$ represents the bias value of the i-th neuron in the l-th layer.
Then, $a_j^{(l)}$ is used to represent the output of the j-th neuron in the l-th layer. Let the activation function of the l-th layer be $\sigma_l$; then the output of the i-th neuron of the l-th layer is

$a_i^{(l)} = \sigma_l\Big( \sum_{j=1}^{m_{l-1}} w_{ji}^{(l)}\, a_j^{(l-1)} + b_i^{(l)} \Big),$  (8)

where the activation function $\sigma_l$ for the hidden layers is the Rectified Linear Unit (ReLU).
Finally, the output layer is transformed by the softmax function to obtain the probability values, which are the final output of the classification model. Combined with the real labels, the cross-entropy is calculated during forward propagation. Meanwhile, the optimization algorithm uses back propagation to calculate the partial derivatives of the cross-entropy loss function with respect to the weights and biases, and then uses the stochastic gradient descent algorithm to update the parameters.
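The forward propagation described above can be sketched as follows, a minimal illustration with ReLU on the hidden layers and softmax on the output layer, as in the text; the names are ours.

```python
import numpy as np

def forward(x, weights, biases):
    """Forward propagation through an R-layer network: every hidden layer
    applies ReLU, and the output layer applies softmax so the result is a
    vector of class probabilities. weights[l] has shape (m_l, m_{l+1})."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = np.maximum(a @ W + b, 0.0)   # ReLU hidden layer
    z = a @ weights[-1] + biases[-1]     # output-layer pre-activation
    e = np.exp(z - z.max())              # numerically stable softmax
    return e / e.sum()
```

During training, the probabilities returned here feed the cross-entropy loss, whose gradients are then back-propagated through the same layers.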
3 Boundary Element Methods
Under the given boundary conditions, the two-dimensional potential problem governed by the Laplace equation can be formulated as:

$\nabla^2 u(x) = 0 \ \text{in } \Omega, \qquad u = \bar{u} \ \text{on } \Gamma_u, \qquad q = \partial u / \partial n = \bar{q} \ \text{on } \Gamma_q,$  (9)

where $\nabla^2$ is the Laplacian operator, $\Omega$ is a closed domain surrounded by the boundary $\Gamma = \Gamma_u \cup \Gamma_q$, and $\bar{u}$ and $\bar{q}$ represent the given potential and flux values on the boundaries $\Gamma_u$ and $\Gamma_q$, respectively.
According to the above partial differential equation, the boundary integral equation of the two-dimensional potential problem is:

$C(x)\, u(x) + \int_{\Gamma} q^*(x, y)\, u(y)\, \mathrm{d}\Gamma(y) = \int_{\Gamma} u^*(x, y)\, q(y)\, \mathrm{d}\Gamma(y),$  (10)
in which u*(x, y) and q*(x, y) are the fundamental solutions,

$u^*(x, y) = -\dfrac{1}{2\pi} \ln r,$  (11)

$q^*(x, y) = \dfrac{\partial u^*(x, y)}{\partial n(y)} = -\dfrac{1}{2\pi r} \dfrac{\partial r}{\partial n(y)},$  (12)

where x is the collocation point, y is the field point on the boundary, r represents the distance from the collocation point to the field point, and n(y) is the outward normal direction of the boundary at point y. C(x) is the jump term, which is equal to 1/2 when the boundary is smooth at point x.
After discretizing the BIE with M constant boundary elements, we obtain the following discretized formulation of Eq. (10):

$C(x_i)\, u_i + \sum_{j=1}^{M} u_j \int_{-1}^{1} q^*(x_i, y(\xi))\, J_j(\xi)\, \mathrm{d}\xi = \sum_{j=1}^{M} q_j \int_{-1}^{1} u^*(x_i, y(\xi))\, J_j(\xi)\, \mathrm{d}\xi,$  (13)

where $\xi$ is the local coordinate, $x_i$ is the collocation point, and $u_j$ and $q_j$ denote the potential value and flux value of the j-th element on the boundary, respectively. To perform the Gaussian quadrature, we map all the variables from the physical space to the local coordinate space $[-1, 1]$, and $J_j(\xi)$ is the determinant of the Jacobian of this transformation.
Eq. (13) can be rewritten as

$H u = G q,$  (14)

where H and G represent the coefficient matrices. Only the non-diagonal elements need to be evaluated by Gaussian quadrature. For a straight constant element, a non-diagonal element computed with the Gaussian quadrature takes the form

$G_{ij} = \dfrac{L_j}{2} \sum_{k=1}^{N} u^*(x_i, y_k)\, w_k = -\dfrac{L_j}{4\pi} \sum_{k=1}^{N} \ln(RA_k)\, w_k,$  (15)

$H_{ij} = \dfrac{L_j}{2} \sum_{k=1}^{N} q^*(x_i, y_k)\, w_k = -\dfrac{L_j\, DIST}{4\pi} \sum_{k=1}^{N} \dfrac{w_k}{RA_k^2},$  (16)

where $L_j$ is the length of the j-th element and:
• N represents the number of Gaussian quadrature points required;
• RA represents the distance from the collocation point to the Gaussian quadrature points;
• wk represents the integral coefficient corresponding to different Gaussian quadrature points;
• DIST represents the vertical distance from the collocation point i to the element j;
• (X1, Y1) and (X2, Y2) represent the coordinates of the two endpoints of the element, respectively.
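The Gaussian quadrature of such a non-diagonal entry of the matrix G can be sketched as follows. This is a minimal illustration assuming a straight constant element and the 2D Laplace fundamental solution $u^* = -\ln r / (2\pi)$; the function name is ours.

```python
import numpy as np

def g_entry(src, p1, p2, n_gauss):
    """Approximate an off-diagonal entry of G: the integral of the 2D
    Laplace fundamental solution u* = -ln(r)/(2*pi) over a straight
    constant element with endpoints p1, p2, using an n_gauss-point
    Gauss-Legendre rule. The Jacobian of the map to [-1, 1] is L/2."""
    xi, w = np.polynomial.legendre.leggauss(n_gauss)
    jac = np.linalg.norm(p2 - p1) / 2.0
    total = 0.0
    for k in range(n_gauss):
        # Field point on the element at local coordinate xi[k].
        y = 0.5 * (1.0 - xi[k]) * p1 + 0.5 * (1.0 + xi[k]) * p2
        r = np.linalg.norm(y - src)      # distance RA from the collocation point
        total += w[k] * (-np.log(r) / (2.0 * np.pi)) * jac
    return total
```

When the collocation point lies away from the element the integrand is smooth, so the value converges rapidly as the number of Gauss points grows; it is exactly this convergence behavior that the classifier in the following sections learns to predict.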
4 Model Determination
In this section, we mainly consider how to construct a model to predict the number of Gaussian quadrature points. The specific determination process can be divided into two phases, as shown in Fig. 2.
In the data preparation phase, multiple independent elements are randomly generated to construct the feature parameters. For a given precision, the minimum number of Gaussian quadrature points satisfying that precision is calculated from the element information and used as the label of the training data. Then, the feature parameters and labels are combined as the training data for the learning algorithm.
In the training phase, the data constructed in the preparation phase are introduced into the model for training, and the optimal model is determined by adjusting the network structure. Then, the model is verified by the prediction results on the training data. Finally, a comparison is made between the machine learning prediction and the traditional method in terms of the time to calculate the number of Gaussian quadrature points satisfying the same accuracy.
4.1 Data Preparation Phase
The data preparation phase provides training data for machine learning. Here, the parameter definition and analysis are performed on a single independent constant element in the BEM. Firstly, the element length is defined as L1. The right and left endpoints of the element are defined as 1 and 2, respectively. The green point denotes the midpoint of the element. Given a yellow point as the collocation point (source point), we set the distance from the source point to the midpoint of the element as L2. The parameters defined above are shown in Fig. 3.
The features and labels for training data are defined as follows:
• Features: L1 and L2, each restricted to a prescribed range;
• Labels: The minimum number of Gaussian quadrature points satisfying a certain precision from 2 to 10.
To obtain the training data, the following steps are required:
1. Randomly generate L1 and L2 within the feature ranges using the BEM program and set the precision E1.
2. Compute the element information.
3. Use the element information generated in Step 2 to calculate the non-diagonal elements of the coefficient matrix obtained by Gaussian integration using 15 Gaussian quadrature points.
4. Use the element information generated in Step 2 to calculate the non-diagonal elements of the coefficient matrix obtained by Gaussian integration using 2–10 Gaussian quadrature points.
5. Assemble the values calculated in Steps 3 and 4 to compute the relative error. When the relative error is less than the given precision E1, the corresponding number of Gaussian quadrature points is the required label.
To find the label satisfying the precision, the formulas of the relative errors are given as follows:

$\mathrm{Error}_G = \dfrac{\left| G_{n,g} - G_{n,\max} \right|}{\left| G_{n,\max} \right|},$  (17)

$\mathrm{Error}_H = \dfrac{\left| H_{n,g} - H_{n,\max} \right|}{\left| H_{n,\max} \right|},$  (18)

where $\mathrm{Error}_G$ represents the relative error between the coefficient matrix G and the exact solution, and $\mathrm{Error}_H$ has the same meaning. Here, max is 15, which is usually large enough to make the value of the coefficient matrix close to the exact solution. $G_{n,g}$ denotes the coefficient matrix G obtained by using g Gaussian quadrature points for the n-th data point, and $G_{n,\max}$ denotes the coefficient matrix G obtained with max = 15, which is close to the exact solution. $H_{n,g}$ and $H_{n,\max}$ have the same meanings.
When $\mathrm{Error}_G < E_1$ and $\mathrm{Error}_H < E_1$, g is the desired label value. To better show the relationship between the relative error and the number of Gaussian quadrature points, the index n in Eqs. (17) and (18) is used to distinguish four randomly generated pieces of data in Fig. 4.
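The labeling procedure of Steps 3–5 can be sketched as follows. This is a minimal one-dimensional illustration in which a generic integrand stands in for the BEM kernels, and the 15-point rule serves as the reference value, as in the text; the function name and tolerance handling are ours.

```python
import numpy as np

def min_gauss_points(integrand, tol, n_ref=15):
    """Return the smallest number of Gauss points in 2..10 whose quadrature
    of `integrand` over [-1, 1] matches an n_ref-point reference rule to
    within the relative tolerance `tol` (the role played by E1), or None
    if no count in that range suffices."""
    def quad(n):
        xi, w = np.polynomial.legendre.leggauss(n)
        return float(np.dot(w, integrand(xi)))
    ref = quad(n_ref)                      # stand-in for the exact value
    for n in range(2, 11):
        if abs(quad(n) - ref) <= tol * abs(ref):
            return n                       # this n becomes the label
    return None
```

Tightening the tolerance forces more quadrature points, which is exactly the accuracy-versus-cost trade-off the classifier is trained to resolve.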
In Fig. 4, the horizontal axis represents the number of Gaussian quadrature points, and the vertical axes of the left and right figures represent $\mathrm{Error}_G$ and $\mathrm{Error}_H$, respectively. Among them, n = 1 indicates the curve of the relative error of the first randomly generated data point as a function of the number of Gaussian quadrature points, and the other curves have analogous meanings. It can be observed from the left and right figures that the four curves are in good agreement when the number of Gaussian quadrature points (Gpnum) is greater than 9, which demonstrates the convergence of the data. This also means that the precision obtained with nine Gaussian quadrature points almost reaches that of 15 Gaussian quadrature points. However, the higher the required precision, the more Gaussian quadrature points are needed. To strike a balance between calculation precision and computational time, it is necessary to find the minimum number of Gaussian quadrature points satisfying E1. In the left figure, five Gaussian quadrature points are required to meet the given precision, whereas in the right figure six are needed.
To compensate for the randomness of insufficient data, the root mean square error is defined as follows:

$\mathrm{RMSE}_{G,g} = \sqrt{\dfrac{1}{n} \sum_{m=1}^{n} \left( G_{m,g} - G_{m,\max} \right)^2},$  (19)

$\mathrm{RMSE}_{H,g} = \sqrt{\dfrac{1}{n} \sum_{m=1}^{n} \left( H_{m,g} - H_{m,\max} \right)^2},$  (20)

where $\mathrm{RMSE}_{G,g}$ represents the root mean square error of the coefficient matrix G obtained using g Gaussian quadrature points over n randomly generated pieces of data. The number n indicates the number of accumulated terms, and max is again taken as 15. The meaning of $\mathrm{RMSE}_{H,g}$ is analogous.
Four cases, n = 20,000, 40,000, 60,000, and 80,000, are taken to obtain the root mean square errors shown in Fig. 5.
In Fig. 5, the horizontal axis represents the number of Gaussian quadrature points, and the vertical axes of the left and right figures represent the root mean square errors calculated when 2 to 10 Gaussian quadrature points are used. The curve for n = 20,000 denotes the root mean square error obtained with different numbers of Gaussian quadrature points from 2 to 10 over 20,000 data points, and the other lines have analogous meanings. It can be observed that the four lines of the left and right figures overlap and that their overall trend decreases, indicating that the root mean square error converges. At the same time, it can also be observed that only six Gaussian quadrature points are needed to achieve the required precision E1, which is consistent with the conclusion obtained from Fig. 4. To reduce the computational effort of the classification problem, the label range is finally changed from 2–10 to 2–6, with all values greater than 6 mapped to 6. Then, 10,000 training data points are constructed using the above steps for later model training.
4.2 Model Training Phase
The 10,000 training data points generated in the previous step were fed into the neural network model for training. The input layer receives the features, and the output layer outputs the corresponding number of Gaussian quadrature points that meets the given precision.
The structure of the neural network is adjusted to obtain a relatively accurate model. In the cross-validation process using a splitting function, we import 10,000 data points, of which 8,000 are used as training data and the remaining 2,000 for testing. Each network structure is tested 10 times, and the accuracies of the 10 runs are averaged to obtain the means presented in Table 1 (two hidden layers are considered, and only the number of units in each hidden layer is changed).
As can be observed from Table 1, the network structure with two hidden layers, 300 units in the first hidden layer, and 100 units in the second hidden layer reaches an accuracy of approximately 90%. Therefore, this network structure is selected for the later applications. The first hidden layer is the second layer in the network structure, and the selected network structure is shown in Fig. 6. To maintain consistency between the dimension of the output vector and the original label, which is transformed by a one-hot operation, we set up the classification problem with five categories, corresponding to labels 2 to 6.
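Training such a network can be sketched in plain numpy as follows. This is a stand-in for the actual training code, which the paper does not list: the two hidden ReLU layers of 300 and 100 units follow the selected architecture, while the learning rate, epoch count, and He-style initialization are our assumptions.

```python
import numpy as np

def train_mlp(X, Y, sizes=(2, 300, 100, 5), lr=0.02, epochs=30, seed=0):
    """Train a (features -> 300 -> 100 -> classes) ReLU network with a
    softmax output by single-sample stochastic gradient descent.
    X: (n, d) features; Y: (n,) integer labels in 0..k-1."""
    rng = np.random.default_rng(seed)
    Ws = [rng.normal(0.0, np.sqrt(2.0 / a), (a, b)) for a, b in zip(sizes, sizes[1:])]
    bs = [np.zeros(b) for b in sizes[1:]]
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            acts = [X[i]]
            for W, b in zip(Ws[:-1], bs[:-1]):          # forward: ReLU hidden layers
                acts.append(np.maximum(acts[-1] @ W + b, 0.0))
            z = acts[-1] @ Ws[-1] + bs[-1]
            p = np.exp(z - z.max()); p /= p.sum()       # softmax output
            delta = p.copy(); delta[Y[i]] -= 1.0        # dL/dz for cross-entropy
            for l in range(len(Ws) - 1, -1, -1):        # back propagation
                gW, gb = np.outer(acts[l], delta), delta
                if l > 0:
                    delta = (Ws[l] @ delta) * (acts[l] > 0)
                Ws[l] -= lr * gW; bs[l] -= lr * gb
    return Ws, bs

def predict(Ws, bs, X):
    # Batched forward pass; the predicted label is the arg-max class.
    A = X
    for W, b in zip(Ws[:-1], bs[:-1]):
        A = np.maximum(A @ W + b, 0.0)
    return np.argmax(A @ Ws[-1] + bs[-1], axis=1)
```

On a synthetic task with two features and five classes, a few epochs of this loop are enough to fit the training labels; in the paper's setting, the two features are L1 and L2 and the five classes are the labels 2 to 6.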
To observe the prediction effect of the model on the training data, 30 data points with fixed L2 were randomly generated, and Fig. 7 was obtained after eight predictions.
In Fig. 7, the horizontal axis represents the variation range of L1, and the vertical axis represents the number of Gaussian quadrature points corresponding to the change of L1. The thick blue solid line, marked with stars, represents the real number of Gaussian quadrature points (the real label), and the other eight lines correspond to the results of the eight predictions. As can be observed from Fig. 7, the number of Gaussian quadrature points increases with L1. The predictions of the eight lines are inconsistent with the real values only in the L1 range of 0.025 to 0.035, which accounts for a small fraction of the overall interval. Therefore, the discrepancy has little influence on the prediction results and supports the effectiveness of the prediction model.
Similarly, 30 data points with fixed L1 are generated and predicted eight times to obtain Fig. 8.
As shown in Fig. 8, the number of Gaussian quadrature points decreases with the increase of L2. Similar to Fig. 7, although the other eight lines do not coincide with the curves of real values, the overall trend is stable, which demonstrates that the model has good generalization ability. A small number of prediction errors occur in the range of L2 from 0.2 to 0.3 in Fig. 8, which reflects the inevitable randomness of the prediction.
4.3 The Time Required to Define the Optimal Model
Fig. 9 shows the CPU time required to determine the optimal model, which consists of the data preparation time in Section 4.1 and the model training and debugging time in Section 4.2. Define the total time as T, the data preparation time as T1, and the model training and debugging time as T2; then T = T1 + T2. It can be seen that the time required to determine the optimal model increases linearly with the number of training data points. In addition, the data preparation time is significantly greater than the training and debugging time of the model.
5 Application of the Two-Dimensional Potential Problem
To verify the accuracy of the model, the two-dimensional potential problem of a circular structure discretized by elements of equal and unequal length is taken as an example. The specific application process is presented in Fig. 10.
As shown in Fig. 10, the element information is stored in the input data. The lengths L1 and L2 are calculated from this element information, and it is checked whether L1 and L2 fall within the feature ranges of the training data; if not, the radius of the circle is adjusted. The calculated L1 and L2 are fed into the optimal model as test data for prediction, and the predicted numbers of Gaussian quadrature points then replace those used in the traditional calculation to evaluate the Gaussian quadrature and solve the subsequent equations. Finally, the numerical and analytical solutions on the boundary are compared.
5.1 Discrete Circle Structure with Equal Length
As mentioned above, to control the feature parameters L1 and L2 within the feature range of the training data, the radius of the circle is set to 0.25, and the discrete circle structure with equal length is shown in Fig. 11.
In Fig. 11, the circle is discretized by 100 elements with equal length. Correspondingly, there are 100 collocation points (source points), located in the element center. Here, only the non-diagonal elements of the coefficient matrix are calculated; thus, a total of 9900 test data are obtained. The discrete element information is imported into the determined model as test data, and the comparison distribution between the predicted value and the real label is obtained, as presented in Table 2.
In Table 2, the bold numbers on the diagonal represent the cases in which the real label and the predicted value are equal. There are 400 inaccurately predicted data points, as follows:
1. The true value is 2, and the predicted value is 3 (200 copies).
2. The true value is 4, and the predicted value is 5 (200 copies).
Therefore, the prediction accuracy of the model is (9900 − 400)/9900 ≈ 96%. Although a small amount of data is not predicted correctly, the predicted value is only slightly larger than the real label. Compared with using 15 Gaussian quadrature points, the computation time of the Gaussian integration can be reduced while still meeting the given precision. The effectiveness of the model prediction is thus verified.
It is known that the potential function satisfying the Laplace equation is
It is assumed that the boundary condition is prescribed, which can be obtained by substituting the discrete collocation point coordinates into Eq. (21). The analytical solution of the flux at the collocation points is further calculated using Eq. (22):
where x1 and y1 denote the coordinates of the collocation point, and r denotes the radius of the circle.
It is worth noting that the numerical solution q1 of the boundary flux can be calculated by importing the predicted Gaussian quadrature points into the BEM code for computing. Finally, the numerical solution and the analytical solution of the flux are compared to obtain the relative error as presented in Table 3.
Here, 10 collocation points are selected on the boundary. As indicated in Table 3, the error between the numerical and analytical solutions of the flux at the collocation points is very small. Therefore, the numerical solution obtained by performing the Gaussian quadrature with the minimum number of quadrature points predicted by machine learning can almost replace the analytical solution. There is no need to use an excessively large number of Gaussian quadrature points, which not only reduces the cost of the Gaussian quadrature but also achieves satisfactory accuracy.
5.2 Discrete Circle Structure with Unequal Length
Similarly, to keep the feature parameters L1 and L2 within the feature ranges of the training data, a circle with a radius of 0.25 is discretized with constant elements of unequal length, with the central angle corresponding to each element kept within a prescribed range; the resulting discretization is shown in Fig. 12.
As depicted in Fig. 12, the circle is discretized into 47 elements and collocation points. Here, only the non-diagonal elements of the coefficient matrix are calculated, and thus a total of 2162 test data are obtained. The discrete parameter information of unequal length is imported into the determined model as test data, and the comparison of the distribution between the predicted value and the real label is made as presented in Table 4.
In Table 4, the bold numbers on the diagonal represent the cases in which the real label and the predicted value are equal. A total of 1997 data points are predicted correctly; therefore, the prediction accuracy of the model is 1997/2162 ≈ 92.4%. Most of the 165 incorrectly predicted values are larger than the real label by 1, which is similar to the result presented in Table 2. Using the minimum number of Gaussian quadrature points that meets the accuracy requirement, or a slightly larger number, can significantly reduce the computational cost of the Gaussian integration. Therefore, this model can be used to predict new data.
It is also known that the potential function satisfying the Laplace equation is:
It is assumed that the boundary condition is known, which can be obtained by substituting the discrete collocation point coordinates into Eq. (23). The analytical solution of the flux at the collocation points is further calculated using Eq. (24):
where x2 and y2 denote the coordinates of the collocation point, and r denotes the radius of the circle.
In addition, numerical solution q2 can be obtained by importing the predicted number of Gaussian quadrature points into the BEM code for calculation. The relative error between the numerical solution and the analytical solution of the flux is compared as presented in Table 5.
Similarly, as can be observed in Table 5, 10 collocation points are selected on the boundary. The error between the numerical solution and the analytical solution of the flux at the collocation point is very small, which verifies the effectiveness of the machine learning prediction. More importantly, it can provide an efficient reference for subsequent Gaussian integral calculations.
5.3 Time Comparison of Traditional BEM and Machine Learning Accelerated BEM
In addition to accuracy, efficiency is another important criterion for the proposed method. Here, the radius of the circle is changed so that the discrete feature parameters L1 and L2 are all within the feature range of the training data. Afterward, the minimum number of Gaussian quadrature points satisfying E1 can be predicted by transferring the feature parameters into the trained model.
It is worth noting that the number of Gaussian quadrature points only affects the calculation process of the coefficient matrix. Fig. 13 shows the time consumption of calculating the coefficient matrix by traditional BEM (Case 1) and machine learning accelerated BEM (Case 2) at different degrees of freedom. Among them, 15 Gaussian quadrature points are used in Case 1 to calculate the coefficient matrix, and the minimum number of Gaussian quadrature points obtained by machine learning prediction are used in Case 2 to calculate the coefficient matrix.
As can be seen from Fig. 13, the time consumption in Case 1 increases exponentially as the degrees of freedom increase, but the time increase in Case 2 is more moderate. It is worth noting that, once trained, the proposed model can perform the calculation of the coefficient matrix efficiently. What’s more, the computational efficiency of the proposed model may be even more prominent when the size of the data is larger.
6 Conclusions
In this paper, the multi-classification algorithm of a neural network is used to predict the minimum number of Gaussian quadrature points that meets the given precision in the Gaussian integration of the BEM. The accuracy of the prediction reaches approximately 90%. The predicted values obtained by the model are then introduced into the BEM code to compute the numerical solution of the potential problem. Comparison with the analytical solution shows that the numerical solution achieves high accuracy. More importantly, the use of the predicted Gaussian quadrature points reduces the computation time of the Gaussian integration and thus the overall cost of the BEM. Moreover, the high-order spline basis functions used in the isogeometric BEM further complicate the integration problem. Hence, the proposed technique is of great significance for guaranteeing the robustness, accuracy, and efficiency of the BEM.
Funding Statement: The authors gratefully acknowledge the financial support of the National Natural Science Foundation of China (NSFC) under Grant No. 11702238.
Conflicts of Interest: The authors declare that there are no conflicts of interest regarding the present study.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.