Open Access
ARTICLE
A Deep-Learning-Based Constitutive Method for Geomaterials Using a Neural Cutting Plane Algorithm
1 Key Laboratory of Ministry of Education for Geomechanics and Embankment Engineering, Hohai University, Nanjing, China
2 Research Institute of Geotechnical Engineering, Hohai University, Nanjing, China
3 Powerchina Huadong Engineering Corporation Limited, Hangzhou, China
* Corresponding Author: Qingxiang Meng. Email:
(This article belongs to the Special Issue: Advanced Computational Methods in Multiphysics Phenomena)
Computer Modeling in Engineering & Sciences 2026, 147(3), 8 https://doi.org/10.32604/cmes.2026.083227
Received 31 March 2026; Accepted 06 May 2026; Issue published 30 June 2026
Abstract
Constitutive modeling for geomaterials remains challenging because of limited data availability, strong nonlinearity, pressure sensitivity, and the non-smooth characteristics of commonly used yield surfaces. This study presents a deep-learning-based constitutive method for geomaterials that incorporates a neural stress-integration procedure based on the cutting plane algorithm (CPA). Two compact fully connected networks are trained to learn the yield function and its stress gradient from an augmented stress-state dataset. The trained networks are then incorporated into a cutting plane return-mapping procedure, in which only first-order information is required for the plastic stress return. This avoids explicit analytical yield expressions and second-derivative evaluations and is therefore more naturally compatible with non-smooth Mohr–Coulomb-type yield-surface representations in a first-order return-mapping sense. Numerical results show that the proposed method reproduces the reference Mohr–Coulomb response along the examined monotonic triaxial compression paths. Compared with the finite-difference closest-point projection method (CPPM) implementation considered in this study, the CPA-based neural stress-update procedure requires fewer network calls per update, indicating a more economical implementation for the present learned constitutive framework.Keywords
Supplementary Material
Supplementary Material FileConstitutive modeling of geomaterials remains challenging because soils and other geotechnical materials often exhibit strong nonlinearity, path dependence, anisotropy, pressure sensitivity, and strain softening, while available laboratory data usually provide only limited coverage of relevant stress states and loading paths. Accurately representing stress–strain responses across different initial conditions and environmental factors has therefore long relied on restrictive analytical assumptions and carefully designed numerical integration procedures [1–3].
Data-driven approaches provide an alternative by learning constitutive responses directly from curated experiments or numerical datasets. Early studies explored material informatics and neural approximators for constitutive representation [4–6], motivated in part by the universal approximation capability of neural networks [7]. More recent developments have incorporated physical structure to improve robustness and interpretability, including constitutive artificial neural networks with frame indifference, thermodynamic consistency, and polyconvexity for hyperelasticity [8], iCANN for inelastic behavior with pseudo-potential and thermodynamic constraints [9], and frameworks that jointly learn free energies, flow rules, and yield conditions [10]. These thermodynamics- or architecture-based approaches, such as TANN and iCANN-type models, aim to embed physical admissibility into the network formulation through free-energy functions, pseudo-potentials, internal variables, and evolution equations [11]. Another closely related line is level-set plasticity, where yield surfaces are represented as implicit neural functions and augmented signed-distance fields are used to provide off-surface training information [12,13]. In geotechnical applications, multi-fidelity learning and level-set-based deep learning frameworks have further demonstrated the potential of data-driven methods for representing complex soil behavior and constructing reproducible modeling pipelines [14,15]. Graph-network and learnable-physics-engine approaches have also recently been explored for geomaterials and solid-mechanics problems, but these methods usually operate closer to the solver or field scale [16].
Despite these advances, two practical challenges still limit the use of neural constitutive models in geomechanics. First, many existing models originate from metal plasticity and adopt pressure-insensitive yield criteria such as von Mises, which are not well suited to geomaterials whose mechanical response is strongly affected by confining pressure [17]. Second, many deep-learning-based constitutive frameworks embed the learned yield function in a closest-point projection method and rely on second-order derivative information. This becomes inconvenient for non-smooth or cornered yield surfaces typical of Mohr–Coulomb-type behavior and may reduce the robustness of stress integration [18,19].
To address these issues, this study develops a deep-learning-based constitutive method that separately learns the yield function and its stress gradient from a level-set-augmented stress-state dataset and incorporates them into classical return-mapping procedures. In this context, the main contribution of this work is the construction and assessment of a neural stress-integration framework for Mohr–Coulomb-type geomaterials. The framework learns the yield function and its stress-gradient information and uses them in a CPA-based first-order stress-update procedure. The focus is therefore on the use of learned yield-surface geometry in local plastic stress integration, rather than on discovering a complete thermodynamics-by-design constitutive architecture or replacing an analytical Mohr–Coulomb return mapping. The proposed method is assessed through triaxial compression simulations under different confining pressures, and its algorithmic characteristics are compared with those of the classical closest-point projection method. The results show that the proposed method reproduces the reference response along the examined monotonic triaxial compression paths and requires fewer network calls per update than the finite-difference CPPM implementation considered in this study.
The remainder of this paper is organized as follows. Section 2 presents the proposed methodology, including dataset preparation, neural network training, and the CPPM- and CPA-based stress-integration procedures. Section 3 describes the preparation of the Mohr–Coulomb benchmark dataset, the training settings of the two neural networks, and the triaxial-compression validation results. Section 4 discusses the computational characteristics of the proposed method and examines the effects of dataset range and network architecture on model performance. Finally, Section 5 summarizes the main conclusions, while the Supplementary Appendices provide additional derivations, pseudocode, and data-preparation details.
The method starts from raw stress data obtained either from experiments or from established yield criteria, followed by level-set augmentation to generate stress states located on and off the yield surface [12]. Two compact fully connected networks are then trained in PyTorch [20] to predict the yield function and its stress gradient. Consistent normalization is maintained throughout data preparation, network training, and inference. The trained networks are subsequently embedded into return-mapping algorithms for stress integration, including the cutting plane algorithm (CPA) and the closest-point projection method (CPPM). Fig. 1 summarizes the overall procedure, in which data collection and level-set augmentation constitute the data-preparation stage.

Figure 1: Overview of the proposed method.
To balance predictive capability and numerical robustness, the network architecture is kept compact. The proposed models consist of stacked Dense layers with optional element-wise Multiply blocks, and the layer width is treated as a tunable hyperparameter. The Rectified Linear Unit (ReLU) activation is adopted because of its simple form and computational efficiency in repeated stress-update evaluations [21]. For a hidden layer, the mapping is written as
where
Under the Sobolev training strategy introduced in Section 2.2.3, the network is required to approximate both the target function and its gradient. To improve representational flexibility, an element-wise Multiply layer is introduced,
where
Both the yield-function network and the stress-gradient network adopt the same fully connected architecture, as illustrated in Fig. 2. Here, WIDTH denotes the number of neurons in each hidden layer. The influence of network architecture is examined in Section 4.3. Other hyperparameters, such as the learning rate and optimizer, are not the main focus of this study. Standard settings are therefore adopted, while these parameters can be further tuned using common optimization tools such as Ray Tune [22].

Figure 2: Architectures of the yield-function network and the stress-gradient network.
2.2.2 Dataset Preparation and Augmentation
Consider a convex yield function
where
To improve data efficiency, material symmetry is exploited. For isotropic plasticity, the stress state is represented in the cylindrical coordinate system of the
Since yielded samples lie on the surface
where
where
The signed-distance representation satisfies the Eikonal equation [24],
which states the unit-gradient property near the interface. In cylindrical coordinates, Eq. (7) becomes:
In the actual data augmentation, no global Eikonal solver is used. Auxiliary samples are generated directly from each original on-surface point by scaling the
For more general hardening/softening cases, the level-set representation may be viewed as evolving with the internal variable
where
Let

Figure 3: Auxiliary samples generated by the level-set method.
2.2.3 Learning the Yield Function and Stress Gradient
After data augmentation, the original yield-surface samples are transformed into a signed-distance representation, giving the training dataset
Standard mean squared error (MSE) regression matches only function values and does not explicitly constrain derivatives [25]. To improve both function approximation and gradient consistency, Sobolev training is adopted [26]. A similar derivative-aware learning idea has also been used in graph-network-based learnable physics engines for crack propagation and coalescence, where Sobolev-type training improves the physical consistency and predictive stability of learned solid-mechanics models [27]. The loss function is written as
where
Sobolev training is introduced here to improve the local consistency between the learned level-set function and its gradient. However, it does not by itself provide a rigorous guarantee of global convexity or full thermodynamic admissibility. In the present framework, convexity is inherited from the convex benchmark used for supervised data generation, while stricter convexity-preserving and thermodynamic constraints remain a topic for future work.
Plastic return mapping requires the first derivative of the yield function. Under an associated flow rule, the plastic flow direction is given by [29]:
where
This stress-gradient network minimizes the discrepancy between the target stress gradient
2.3 Neural Network Inference and Stress Integration
2.3.1 Neural Network Inference in Python
After training, the network parameters are exported from PyTorch and loaded into the stress-update procedure implemented in Python. The layer topology, weights, and biases are read once before the loading-path simulation and remain fixed during the analysis. At each stress-update step, the input variables are normalized using the same procedure as in training, and forward inference is performed to evaluate the yield function and the stress gradient required by the return-mapping algorithm.
Bias vectors are stored in
This procedure is repeated layer by layer until the final network output is obtained. The predicted quantities are then passed to the subsequent stress-update algorithm. Appendix S2 in the Supplementary Material provides pseudocode for neural network inference within the stress-update procedure implemented in Python.
2.3.2 CPPM-Based Stress Update
For stress integration in the plastic regime, a deep-learning-based CPPM return-mapping scheme is adopted. Once the material enters plastic loading, the incremental constitutive update at step
Under the associated flow rule, the plastic flow direction is given by
If the trial state violates the yield condition, a plastic correction is required. In the CPPM framework, the trial stress is projected back toward the yield surface along the direction determined by the associated flow rule, which can be written as
In order to satisfy the consistency condition, it is necessary to solve the nonlinear equation for
From Eq. (14), the residual form
At the
where
where
where
After each local iteration, the updated stress
Both CPPM and CPA are iterative return-mapping schemes. In CPPM, the trial stress is corrected using the final gradient, whereas CPA updates the stress incrementally with the current gradient at each local iteration [30].
For non-smooth yield criteria such as Mohr–Coulomb, CPPM becomes more involved because second-order derivatives are difficult to define or evaluate robustly near corners or apex regions. By contrast, CPA only requires the current yield-function value and a first-order return direction, which makes it more compatible with a regularized level-set representation of cornered yield surfaces [31]. For perfectly plastic materials with piecewise linear yield surfaces, CPA and CPPM are equivalent in the returned stress and consistent tangent, indicating that CPA can recover the same local constitutive solution [18]. Analogously to Eq. (16), the stress update at the
where
and substituting Eq. (23) into Eq. (22) gives
For the scalar yield function
Combining Eq. (24) with the consistency condition
In CPA, the stress gradient
3.1 Training Dataset Preparation
The Mohr–Coulomb model [32] is adopted as the reference constitutive model in this study. Owing to its simple form and clear physical interpretation through cohesion and friction angle, it remains widely used in geotechnical analysis, especially for identifying failure modes, deformation trends, and stability characteristics. A well-known difficulty of the Mohr–Coulomb criterion is that its yield surface contains corners, which leads to non-smooth stress gradients. Unless otherwise stated, stresses are expressed using the tension-positive convention in this study; therefore, compressive stresses are negative. In principal stress space, assuming the algebraic ordering
where
To avoid repeated direct manipulation of principal stresses, the yield criterion is further expressed in invariant form using the standard formulation for isotropic yield functions [34]. Nayak and Zienkiewicz proposed a convenient set of stress-invariant definitions [35]:
where
By using the above stress invariants, the principal stress can be re-expressed as:
Substituting the expressions of
In this benchmark,

Figure 4: Data preparation and augmentation for neural network training: (a) original dataset, (b) augmented dataset, and (c) representative gradients of augmented samples. All stress components are plotted using the tension-positive convention; negative values correspond to compressive stress states.
In this study, the target stress gradient is obtained analytically from the Mohr–Coulomb model rather than by numerical differentiation of the discrete signed-distance labels, which is more efficient and avoids additional differentiation noise, especially for a large dataset of approximately
where
here,
the full derivation is provided in Appendix S6 in the Supplementary Materials.
The level-set method introduced in Section 2.2.2 is then used to augment the original dataset, leading to the complete training set
Using the classical Mohr–Coulomb model without strain softening as the reference, the proposed neural networks are trained to learn the yield function and its stress gradient. Accordingly, the sampled
During data generation, 100 sampling points
The sampling density, the minimum mean stress, and the range of accumulated plastic strain all influence the performance of the trained model in subsequent stress-integration simulations. In particular, if the absolute value of
Training is carried out using the NAdam optimizer [36] together with a ReduceLROnPlateau scheduler. The yield-function network is trained for 3000 epochs with a batch size of 128 and an initial learning rate of 0.0001, whereas the stress-gradient network is trained for 5000 epochs with a batch size of 256 and an initial learning rate of 0.001. Thus, the learning rate is adjusted during training rather than kept as a large fixed value. Both networks use 100 hidden neurons per layer. Fig. 5a–c reports the loss histories of the yield function network trained with Sobolev: (a) shows the value loss, (b) the gradient loss, and (c) their sum as the total loss. Fig. 5d shows the loss history of the stress-gradient network. The red curves in Fig. 5 denote raw, unsmoothed validation losses recorded during training rather than final test losses. No exponential moving average or moving-average smoothing is applied. Their stronger fluctuations mainly result from raw epoch-wise evaluation, the sensitivity of derivative-related Sobolev loss terms to local validation samples, and the logarithmic scale at very small loss levels.

Figure 5: Training histories of the two neural networks: (a–c) yield-function network and (d) stress-gradient network.
To evaluate the predictive performance of the trained neural networks, triaxial compression simulations were carried out at the material-point level under different confining pressures. Axial strain increments were prescribed step by step, while the lateral stresses were maintained at the target confining pressure through iterative adjustment of the lateral strain increments. Based on the network predictions, the stress state was updated incrementally along the prescribed loading path until the current step satisfied the required loading condition. Repeating this procedure over all loading increments yielded the complete triaxial stress–strain response under a given confining pressure.
Fig. 6 compares the analytical solution and the neural-network prediction of triaxial compression responses under confining pressures of 0, 5, and 10 MPa. The vertical axis represents the deviatoric stress

Figure 6: Triaxial compression response: reference MC vs. NN at different confining pressures.
This section compares the computational characteristics of CPA and CPPM, and investigates the effects of dataset range and network architecture, including depth, width, and Multiply layers, on method performance. Recent graph-network-based learnable physics engines have shown accurate and efficient forward prediction for OSB-PD Drucker–Prager elastoplastic boundary-value problems. Those studies mainly assess solver-level field prediction, whereas the present work focuses on material-point stress integration for a learned Mohr–Coulomb-type yield function. Therefore, the accuracy, computational efficiency, and convergence discussed below should be interpreted at the local stress-update level [16]. Robustness is evaluated using a material-point uniaxial compression path implemented in Python with axial strain increasing incrementally from 0.1% to 40%. The performance index is the maximum sustainable axial strain, defined as the first increment at which the constitutive update fails to converge or the predicted response becomes nonphysical. This metric provides a simple and consistent measure of numerical robustness.
4.1 Comparison of Deep Learning-Based CPA and CPPM
To compare the computational cost of a single stress update in the neural-network-based CPA and CPPM, the number of floating-point operations (FLOPs) is used as a complexity metric [37]. The CPPM considered here is the finite-difference implementation used in the present two-network framework. It should therefore be distinguished from an autograd-based CPPM implementation in which Hessian-related terms are computed directly from the yield-function network. For the yield-function network, one forward evaluation can be approximated as
In CPA, the yield-function network is first evaluated at the trial state. If
when only one local correction iteration is needed
By contrast, CPPM requires additional evaluations to approximate the second-order term. After the trial-state check, a local Newton iteration is performed to enforce the consistency condition. In each iteration, the yield-function network is evaluated for the residual, and the stress-gradient network is called for the return direction. To obtain the second-order term
with a single Newton iteration, this reduces to
To complement the theoretical complexity analysis, the computational efficiency of CPA and the finite-difference CPPM implementation was further evaluated in the triaxial-compression benchmark implemented in Python, while keeping the material parameters, loading path, number of loading steps, trained networks, and stopping tolerances unchanged and varying only the stress-update scheme. The timing test was conducted on a laptop computer equipped with an AMD Ryzen 9 6900HX CPU and 32 GB RAM, using Python 3.8 in CPU-only mode. Although GPU acceleration was used during neural-network training, the GPU was disabled during this stress-integration benchmark.
Fig. 7 shows that CPA requires about 2.0 s of wall-clock time and 0.8 s of CPU user time, whereas the finite-difference CPPM implementation requires about 4.0 and 3.2 s, respectively. These results agree with the call-count analysis for the present implementation. However, an autograd-based CPPM implementation may reduce the cost of the second-order terms; therefore, the reported speed difference should not be interpreted as a general advantage of CPA over all CPPM variants. A related issue is the consistency between the predicted yield function and the return direction. A single yield-function network with autograd gradients would provide exact consistency with the predicted yield function, whereas the present two-network design treats the stress-gradient network as a forward-inference approximation to the analytical reference gradient; a quantitative comparison between these two formulations is left for future work.

Figure 7: Runtime comparison of CPA and CPPM in the triaxial-compression benchmark implemented in Python.
It should also be noted that the purpose of this comparison is not to show that the neural implementation is faster than an analytical Mohr–Coulomb return mapping. For this simple reference model, an analytical return mapping is expected to be computationally cheaper; the Mohr–Coulomb model is used here as a transparent benchmark for evaluating the learned yield-function representation and the neural stress-update procedure.
4.2 Influence of Dataset Range
The approximation domain of the neural network is limited by the range of the training data. During dataset generation, this range is controlled by the sampling densities
For isotropic materials, performance depends more strongly on the coverage of the original dataset than on the absolute number of samples. The most influential bounds are the upper limit of accumulated plastic strain
Their influence is evaluated using a material-point uniaxial compression path implemented in Python. The performance index is the maximum converged axial strain


Figure 8: Effect of original dataset range on network performance.
Fig. 8 shows that the attainable axial strain depends on how well the loading path remains inside the original training domain. Increasing
In practical applications, the present model should be used only when the training dataset covers the expected ranges of mean stress and accumulated plastic strain along the target loading path, with a reasonable safety margin. Moderate extrapolation may still be tolerated, but both convergence and response quality deteriorate rapidly once the loading path moves too far outside the trained domain. Therefore, the current framework should be understood as a data-driven constitutive model with a domain-dependent applicability, rather than a fully general extrapolative constitutive model. As a first screening tool for boundary-value simulations, a range-based out-of-distribution (OOD) indicator can be defined for the current input vector
where
4.3 Influence of Network Architecture
Network architecture affects constitutive performance. Increasing depth and width improves representational capacity but may also increase training difficulty. ReLU is suitable for cornered yield surfaces such as Mohr–Coulomb, while the Multiply block enhances nonlinear coupling with limited added complexity.
To examine architectural effects, three designs were compared by varying depth, width, and the use of the Multiply block. Fig. 9 shows two variants relative to the baseline architecture in Fig. 2: in Fig. 9a, one Dense layer and one Multiply block are added to both the yield-function and stress-gradient networks, whereas Fig. 9b shows a Dense-only architecture without Multiply layers. All other hyperparameters were fixed, and hidden-layer widths of 50, 100, and 150 were tested; the settings are listed in Table 2.

Figure 9: Comparison of neural network architectures: (a) Dense + Multiply and (b) Dense only.

Performance is evaluated by the maximum converged axial strain
where

Because only three repetitions are available for each configuration, the run-to-run variability is reported using the min–max range rather than the coefficient of variation. The ranges are used only as a descriptive measure of run-to-run variability under different random initializations. At the tested widths, the baseline architecture gives higher mean
yielding
The enhanced architecture gives higher mean

Figure 10: Depth-performance curves of three neural constitutive model architectures.
Overall, increasing width tends to improve the mean performance for the tested architectures, while the benefit of additional Multiply layers is architecture- and training-setting dependent. For relatively compact networks, a small Dense + Multiply unit offers an efficient way to improve nonlinear representation without substantially increasing architectural complexity.
This study presents a deep-learning-based constitutive method for geomaterials, in which two compact neural networks are trained separately to represent the yield function and its stress gradient. Based on these predictions, a CPA-based neural stress-update procedure is constructed for stress integration. Following the classical first-order character of CPA, the proposed strategy avoids explicit evaluation of second-order derivatives and is more naturally compatible with pressure-sensitive and non-smooth yield criteria such as Mohr–Coulomb in a first-order return-mapping sense.
The results show that the proposed method can reproduce the response of the reference model along the examined monotonic triaxial compression paths and achieve lower computational cost than the finite-difference CPPM implementation considered in this study. The study also indicates that robustness depends strongly on the coverage of the training dataset and the choice of network architecture. Accordingly, the present method is most suitable for applications in which the relevant stress range and history-variable range can be estimated in advance and adequately represented in the training dataset. In particular, appropriate ranges of mean stress and accumulated plastic strain are important for stable predictions, while the benefit of Multiply layers is architecture- and training-setting dependent in the tested benchmark.
Future work will further examine unloading–reloading paths, stress paths traversing multiple deviatoric-plane sectors, true triaxial paths with varying Lode angle, and apex-region paths. Extensions to anisotropy, cyclic loading, evolving hardening/softening surfaces, and stronger physically informed constraints, including convexity-preserving and thermodynamic constraints, will also be considered.
Acknowledgement: None.
Funding Statement: This research was funded by the Postgraduate Research & Practice Innovation Program of Jiangsu Province, grant number SJCX25_0268 (Zijie He).
Author Contributions: The authors confirm contribution to the paper as follows: conceptualization, Qingxiang Meng and Zijie He; methodology, Zijie He and Qingxiang Meng; software, Zijie He; validation, Zijie He, Yajun Cao and Weijiang Chu; formal analysis, Zijie He; investigation, Zijie He; resources, Qingxiang Meng; data curation, Zijie He; writing—original draft preparation, Zijie He; writing—review and editing, Qingxiang Meng, Yajun Cao and Weijiang Chu; visualization, Zijie He; supervision, Qingxiang Meng; project administration, Qingxiang Meng; funding acquisition, Zijie He. All authors reviewed and approved the final version of the manuscript.
Availability of Data and Materials: The data supporting the findings of this study are available within the article and its Supplementary Materials. The implementation scripts used for data generation, neural-network training, and material-point stress-update calculations are not released as a public software package at this stage. The implementation scripts, trained network parameters, normalization parameters, and example input files are available from the corresponding author upon reasonable request.
Ethics Approval: Not applicable.
Conflicts of Interest: The authors declare no conflicts of interest.
Supplementary Materials: The supplementary material is available online at https://www.techscience.com/doi/10.32604/cmes.2026.083227/s1. These materials include the coordinate transformation between cylindrical coordinates and principal stresses (Appendix S1), pseudocode for neural network inference (Appendix S2), the forward-difference scheme for second-order approximation in CPPM (Appendix S3), the complete CPPM stress-update procedure (Appendix S4), the complete CPA stress-update procedure (Appendix S5), the analytical derivation of the Mohr–Coulomb stress gradient (Appendix S6), and the overall data-preparation procedure (Appendix S7).
Abbreviations
| CPA | Cutting Plane Algorithm |
| CPPM | Closest-Point Projection Method |
| MSE | Mean Squared Error |
| ReLU | Rectified Linear Unit |
| NN | Neural Network |
References
1. Armero F, Pérez-Foguet A. On the formulation of closest-point projection algorithms in elastoplasticity—Part I: the variational structure. Int J Numer Methods Eng. 2002;53(2):297–329. doi:10.1002/nme.278. [Google Scholar] [CrossRef]
2. Piccolroaz A, Bigoni D. Yield criteria for quasibrittle and frictional materials: a generalization to surfaces with corners. Int J Solids Struct. 2009;46(20):3587–96. [Google Scholar]
3. Wood DM. Geotechnical modelling. Boca Raton, FL, USA: CRC Press; 2017. [Google Scholar]
4. Ghaboussi J, Sidarta DE. New nested adaptive neural networks (NANN) for constitutive modeling. Comput Geotech. 1998;22(1):29–52. doi:10.1016/s0266-352x(97)00034-7. [Google Scholar] [CrossRef]
5. Kalidindi SR, Niezgoda SR, Salem AA. Microstructure informatics using higher-order statistics and efficient data-mining protocols. Jom. 2011;63(4):34–41. doi:10.1007/s11837-011-0057-7. [Google Scholar] [CrossRef]
6. Mozaffar M, Bostanabad R, Chen W, Ehmann K, Cao J, Bessa M. Deep learning predicts path-dependent plasticity. Proc Natl Acad Sci. 2019;116(52):26414–20. doi:10.1073/pnas.1911815116. [Google Scholar] [PubMed] [CrossRef]
7. Hornik K, Stinchcombe M, White H. Multilayer feedforward networks are universal approximators. Neural Netw. 1989;2(5):359–66. doi:10.1016/0893-6080(89)90020-8. [Google Scholar] [CrossRef]
8. Linka K, Hillgärtner M, Abdolazizi KP, Aydin RC, Itskov M, Cyron CJ. Constitutive artificial neural networks: a fast and general approach to predictive data-driven constitutive modeling by deep learning. J Comput Phys. 2021;429:110010. [Google Scholar]
9. Holthusen H, Lamm L, Brepols T, Reese S, Kuhl E. Theory and implementation of inelastic constitutive artificial neural networks. Comput Methods Appl Mech Eng. 2024;428(3):117063. doi:10.1016/j.cma.2024.117063. [Google Scholar] [CrossRef]
10. Boes B, Simon J-W, Holthusen H. Accounting for plasticity: an extension of inelastic constitutive artificial neural networks. Eur J Mech-A/Solids. 2025;117:105998. [Google Scholar]
11. Masi F, Stefanou I. Multiscale modeling of inelastic materials with Thermodynamics-based Artificial Neural Networks (TANN). Comput Methods Appl Mech Eng. 2022;398(7):115190. doi:10.1016/j.cma.2022.115190. [Google Scholar] [CrossRef]
12. Vlassis NN, Sun W. Component-based machine learning paradigm for discovering rate-dependent and pressure-sensitive level-set plasticity models. J Appl Mech. 2022;89(2):021003. doi:10.1115/1.4052684. [Google Scholar] [CrossRef]
13. Suh HS, Kweon C, Lester B, Kramer S, Sun W. A publicly available PyTorch-ABAQUS UMAT deep-learning framework for level-set plasticity. Mech Mater. 2023;184(2):104682. doi:10.1016/j.mechmat.2023.104682. [Google Scholar] [CrossRef]
14. Sheil B, Anagnostopoulos C, Buckley R, Ciantia MO, Febrianto E, Fu JL, et al. Artificial intelligence transformations in geotechnics: progress, challenges and future enablers. Comput Geotech. 2026;189:107604. [Google Scholar]
15. Su M, Guo N, Yang Z. A multifidelity neural network (MFNN) for constitutive modeling of complex soil behaviors. Int J Numer Anal Methods Geomech. 2023;47(18):3269–89. doi:10.1002/nag.3620. [Google Scholar] [CrossRef]
16. Zhou X-P, Feng K. The novel learnable physics engines for interpretable elastoplastic models of geomaterials based on the message passing neural network. Int J Rock Mech Min. 2025;194(2):106244. doi:10.1016/j.ijrmms.2025.106244. [Google Scholar] [CrossRef]
17. Lubliner J. Plasticity theory. North Chelmsford, MA, USA: Courier Corporation; 2008. [Google Scholar]
18. Huang J, Griffiths D. Observations on return mapping algorithms for piecewise linear yield criteria. Int J Geomech. 2008;8(4):253–65. doi:10.1061/(asce)1532-3641(2008)8:4(253). [Google Scholar] [CrossRef]
19. Mazzucco G, Pomaro B, Salomoni VA, Majorana CE. Apex control within an elasto-plastic constitutive model for confined concretes. Math Comput Simul. 2019;162:221–32. [Google Scholar]
20. Paszke A, Gross S, Chintala S, Chanan G, Yang E, DeVito Z, et al. Automatic differentiation in pytorch. 2017. [Google Scholar]
21. Bishop CM, Nasrabadi NM. Pattern recognition and machine learning. Berlin/Heidelberg, Germany: Springer; 2006. [Google Scholar]
22. Liaw R, Liang E, Nishihara R, Moritz P, Gonzalez JE, Stoica I. Tune: a research platform for distributed model selection and training. arXiv:1807.05118. 2018. [Google Scholar]
23. Borja RI. Plasticity. Berlin/Heidelberg, Germany: Springer; 2013. [Google Scholar]
24. Osher S, Sethian JA. Fronts propagating with curvature-dependent speed: algorithms based on Hamilton-Jacobi formulations. J Comput Phys. 1988;79(1):12–49. [Google Scholar]
25. Goodfellow I, Bengio Y, Courville A, Bengio Y. Deep learning. Cambridge, UK: MIT Press; 2016. [Google Scholar]
26. Czarnecki WM, Osindero S, Jaderberg M, Swirszcz G, Pascanu R. Sobolev training for neural networks. Adv Neural Inf Process Syst. 2017;30:1–10. [Google Scholar]
27. Feng K, Zhou X-P. A novel graph networks based learnable physics engines for crack propagation and coalescence in solid mechanics. Eng Fract Mech. 2025;315:110800. doi:10.1016/j.engfracmech.2025.110800. [Google Scholar] [CrossRef]
28. Maclaurin D, Duvenaud D, Adams RP. Autograd: effortless gradients in numpy. In: Proceedings of the ICML, 2015 AutoML Workshop; 2015 Jul 11; Lille, France. [Google Scholar]
29. Simo J, Hughes T. Classical rate-independent plasticity and viscoplasticity. Comput Inelasticit. 1998:1–112. doi:10.1007/0-387-22763-6_2. [Google Scholar] [CrossRef]
30. Simo JC, Taylor RL. Consistent tangent operators for rate-independent elastoplasticity. Comput Methods Appl Mech Eng. 1985;48(1):101–18. [Google Scholar]
31. Huang J, Griffiths D. Return mapping algorithms and stress predictors for failure analysis in geomechanics. J Eng Mech. 2009;135(4):276–84. [Google Scholar]
32. Coulomb C-A. Essai sur une application des règles de maximis et minimis à quelques problèmes de statique, relatifs à l’architecture. Mémoires De Mathématique De L’académie R De Sci. 2023;175:1. doi:10.1051/geotech/2023019. [Google Scholar] [CrossRef]
33. Mohr O. Abhandlungen aus dem Gebiete der technischen Mechanik. Berlin, Germany: Ernst & Sohn; 1914. [Google Scholar]
34. Abbo A, Sloan S. A smooth hyperbolic approximation to the Mohr-Coulomb yield criterion. Comput Struct. 1995;54(3):427–41. doi:10.1016/0045-7949(94)00339-5. [Google Scholar] [CrossRef]
35. Nayak GC, Zienkiewicz OC. Convenient form of stress invariants for plasticity. J Struct Div. 1972;98(4):949–54. doi:10.1061/jsdeag.0003219. [Google Scholar] [CrossRef]
36. Dozat T. Incorporating nesterov momentum into adam. 2016. [Google Scholar]
37. Hennessy JL, Patterson DA. Computer architecture: a quantitative approach. Amsterdam, The Netherlands: Elsevier; 2011. [Google Scholar]
38. Barron AR. Universal approximation bounds for superpositions of a sigmoidal function. IEEE Trans Inf Theory. 2002;39(3):930–45. doi:10.1109/18.256500. [Google Scholar] [CrossRef]
39. Anthony M, Bartlett PL. Neural network learning: theoretical foundations. Cambridge, UK: Cambridge university Press; 2009. [Google Scholar]
Cite This Article
Copyright © 2026 The Author(s). Published by Tech Science Press.This work is licensed under a Creative Commons Attribution 4.0 International License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Submit a Paper
Propose a Special lssue
View Full Text
Download PDF
Downloads
Citation Tools