Open Access
ARTICLE
A Stochastic Ensemble Physics-Informed Neural Networks via Bagging and Monte Carlo Dropout
1 Laboratory for Artificial Intelligence, Institute for Computational Science and Artificial Intelligence, Van Lang University, Ho Chi Minh City, Vietnam
2 Faculty of Information Technology, Van Lang School of Technology, Van Lang University, Ho Chi Minh City, Vietnam
3 Department of Computer Science, University of York, York, UK
* Corresponding Author: Thao Nguyen-Trang. Email:
Computer Modeling in Engineering & Sciences 2026, 147(2), 33 https://doi.org/10.32604/cmes.2026.080808
Received 15 February 2026; Accepted 27 April 2026; Issue published 27 May 2026
Abstract
Solving differential equations (DEs), including ordinary differential equations (ODEs) and partial differential equations (PDEs), is fundamental to scientific computing and engineering. The development of deep learning has led to Physics-Informed Neural Networks (PINNs), in which physical laws are embedded directly into the loss function. However, PINNs inherit the intrinsic instability of deep neural networks (DNNs) and lack an effective mechanism for Uncertainty Quantification (UQ). This paper proposes a stochastic ensemble framework to address these limitations. The proposed method is a double-stochastic ensemble framework that combines bagging (via bootstrap resampling and randomized collocation points) with Monte Carlo dropout (MCDO) applied at inference time to provide consistent UQ. This double-stochastic design yields two main contributions. First, it reduces the intrinsic variance of the predicted solution and improves accuracy, particularly in small-data and/or noisy-data regimes. Second, the variance of the ensemble outputs serves as a reliable and computationally feasible proxy for the solution uncertainty. Numerical experiments on ODEs and nonlinear PDEs (e.g., Burgers’ equation) under impulse noise demonstrate that the proposed method outperforms standard baselines in predictive accuracy. Furthermore, the obtained Prediction Interval Coverage Probability (PICP) and Mean Prediction Interval Width (MPIW) confirm that our framework yields well-calibrated prediction intervals and effectively mitigates the severe performance degradation observed in single PINNs.Keywords
Mathematical models governing complex physical phenomena, ranging from wave propagation in fluid mechanics to simpler time-evolving systems, are commonly formulated in terms of differential equations (DEs) [1]. DEs are typically categorized into ordinary differential equations (ODEs) and partial differential equations (PDEs). Developing robust and accurate solvers for both ODEs and PDEs remains a cornerstone of scientific computing and engineering.
Historically, these DEs have been solved using classical numerical methods such as the finite element method (FEM) and the finite difference method (FDM) [2–5]. Despite their maturity, these approaches exhibit notable limitations, particularly when addressing the curse of dimensionality in high-dimensional problems, which can lead to exponential growth in computational cost [6,7]. Moreover, mesh generation for complex domains or time dependent boundary conditions remains a major practical challenge [8].
The rise of deep learning has enabled an alternative paradigm: physics-informed neural networks (PINNs). PINNs represent a principled integration of deductive reasoning from physical laws and inductive learning from data [6,9–12]. In PINNs, the approximate solution
Nevertheless, because PINNs are built upon DNNs, they inherit a major challenge from deep learning: model instability. Neural networks are often regarded as “unstable procedures,” meaning that small variations in weight initialization, the selection of collocation points (
In addition, measurement noise is inevitable in physical systems [9]. Estimating high-order derivatives (required to evaluate PDE residuals) from noisy data can substantially amplify errors in either numerical differentiation or automatic differentiation [16]. The numerical results reported in the aforementioned studies clearly illustrate this brittleness. A single PINN faces a coupled optimization challenge: it must simultaneously reduce data mismatch (under noisy observations) and the physical residual error. When observations are corrupted by noise, the optimizer may produce physically inconsistent solutions elsewhere in the domain in an attempt to fit noisy outliers.
Beyond these issues, standard PINNs typically provide only point estimates of the solution and lack an integrated mechanism for uncertainty quantification (UQ). The need for UQ has been recognized as an open research question since the foundational PINN literature. While probabilistic frameworks such as Bayesian PINNs (B-PINNs) have been proposed to handle noisy data and to infer parameter distributions, they often require more sophisticated computational pipelines, and systematic specification of priors remains a nontrivial open problem [17]. Conversely, applying simpler UQ techniques such as Monte Carlo dropout (MCDO) to a single PINN often yields overly conservative (overestimated) uncertainty intervals that provide limited practical insight into predictive reliability.
To improve the robustness and performance of PINNs, numerous research directions have been explored, including global optimization strategies, adaptive loss balancing (
Ensemble learning has also been investigated in the PINN context. For example, reference [9] proposed ensemble PINNs (ePINNs), which adopt a construction strategy by embedding multiple sub-networks in parallel within a main network. In this approach, the randomization dimensions arise entirely from the intentional initialization of specific patterns, allowing the sub-networks to specialize in learning these patterns. This implies that both the data and the collocation points are kept fixed throughout the study. Moreover, UQ has not been provided in this work yet. Similarly, Haitsiukevich and Ilin [15] used the median of individual neural networks as the final prediction. A distinctive aspect of this method is the expansion of collocation points based on the agreement among ensemble members. In this study, randomness comes only from the random initialization of neural network weights. In addition, the prediction variance is used merely as a threshold to guide the expansion of collocation points, rather than to provide a true prediction interval for the solution. It can be observed that existing ensemble PINN methods often rely on base learners that are either pre-trained neural networks or networks with randomly initialized weights. This reduces the diversity of the ensemble, as the models tend to converge to a limited number of local minima during training. In contrast, randomizing observations and collocation points provides genuinely different perspectives for the PINNs to learn from. Furthermore, when observations contain noise, random resampling can significantly reduce the effect of such noise. Finally, the provision of reliable UQ has not been adequately addressed in existing studies.
Therefore, a key research gap lies in the lack of a method that exploits classical statistical principles (bagging with bootstrap data and randomized collocation points, together with unbiased MCDO) to simultaneously mitigate intrinsic instability and provide reliable UQ, particularly for PDE problems in small-data or noisy-data regimes where vanilla PINNs remain challenging.
To fill this gap, this study proposes A Stochastic Ensemble Physics-Informed Neural Networks via Bagging and MCDO (Bagging-PINNs). The framework is designed as a double-stochastic ensemble that leverages the variance-reduction capability of bagging and the UQ capability of MCDO, with the aim of systematically addressing instability and improving the reliability of UQ in PINNs.
This simple but novel combination (bagging during training and MCDO during inference) reduces intrinsic variance and provides more reliable UQ than many more complex alternatives, while incurring minimal additional computational cost at inference. Moreover, our numerical studies indicate that the original PINN formulation, often referred to as vanilla PINNs, can perform worse than a purely data-driven DNN under noisy observations, whereas Bagging-PINNs preserve the benefits of physics constraints without becoming brittle in noisy data settings. While a single PINN may struggle to jointly optimize noisy data loss and physical residual loss, leading to physically inconsistent solutions, bagging alleviates this issue by averaging predictions across multiple submodels; the adverse effect of outliers present in one bootstrap sample is counteracted by other samples that do not contain them, thereby reducing variance and maintaining the benefits of the physics constraints.
After this introduction, Section 2 presents background on DNNs and PINNs. Section 3 describes the Bagging-PINNs architecture in detail. Section 4 reports numerical experiments and results. Finally, Section 5 summarizes the findings and discusses future research directions.
2.1 Artificial Neural Networks
Artificial Neural Networks (ANNs) are general function approximators that map inputs
where
using stochastic gradient descent (SGD) and its variants. The universal approximation theorem guarantees that
2.2 Physics-Informed Neural Networks
Physics-informed neural networks build upon DNNs by embedding physical constraints, enabling semi-supervised learning that does not rely solely on labeled data. Specifically, they incorporate the residual of a governing equation
where the sampling sets
The proposed method, referred to as Bagging-PINNs, is a double-stochastic ensemble framework designed to combine the strengths of PINNs with variance-reduction and uncertainty-quantification (UQ) techniques from ensemble learning.
3.1 Stochastic Ensemble Framework
Bagging-PINNs leverage two sources of stochasticity to construct M independent and diverse PINN models (
3.1.1 Bagging and Input Randomization
Bagging is implemented by randomizing both the training data and the physics sampling points for each submodel, encouraging different members to converge to different solutions within the large hypothesis space of neural networks:
• Bootstrap sampling: The
• Collocation-point randomization: Instead of bootstrapping, each submodel uses a newly and independently generated set of collocation points
• Furthermore, to ensure ensemble diversity, each base learner in the Bagging architecture is independently initialized using Kaiming Uniform initialization prior to bootstrap training.
In this work, MCDO is applied at inference (test) time to provide consistent UQ for every model. Specifically, for each sample in the T Monte Carlo samples, the dropout rate
It is important to note that Bagging randomness and MCDO randomness do not introduce redundant uncertainty, as they operate at different stages with complementary roles. Bootstrap re-sampling is applied during the training phase to enhance ensemble diversity and reduce the intrinsic variance of the model; whereas MCDO is applied only during the test phase to quantify the predictive uncertainty of the already trained models.
3.2 Training Procedure and Result Aggregation
Training consists of constructing and fitting M independent PINNs, followed by aggregating their predictions to obtain both the final solution estimate and uncertainty measures:
• Train M independent PINNs. Each network minimizes
• Compute the Bagging-PINNs prediction
The averaging operation in Eq. (4) is the key mechanism that reduces the instability of individual PINN models. The improvement is rooted in the statistical principle of variance reduction for unstable predictors, as introduced by [20]. In essence, neural networks are inherently unstable, and this issue becomes more severe in PINNs due to the highly non-convex loss landscape, as well as gradient pathologies arising from the conflict between data fitting and physical constraints [21,22]. By applying Bagging, the averaging mechanism ensures that random deviations and locally perturbed gradients from individual base models are effectively canceled out. From a probabilistic perspective, the Central Limit Theorem implies that the ensemble average of M independent trials converges to a normal distribution, where the prediction standard deviation is reduced by a factor of
Mathematically, the proposed framework can minimize the global error. According to Breiman’s error decomposition,
where
Regarding
Regarding
where
In summary, the proposed Bagging-PINN can increase
• Perform UQ via test-time MCDO using an average-then-aggregate rule: for each sample
and the prediction interval is reported as

Figure 1: The diagram for Bagging-PINN.
This section systematically analyzes and compares the performance of Bagging-PINN with several baseline models. The selected models include ANN, PINN, and a representative ensemble-PINN method that introduces randomness only through weight initialization, namely Init-Ensemble PINNs. We evaluate these methods on three benchmark problems, including an ODE, a PDE representing the wave equation, and a PDE representing the Burgers equation. All prediction errors as well as UQ metrics are computed on a fully noise-free high-resolution reference grid (e.g., a
The models are evaluated under different data conditions, including ideal noise-free data and data with impulse noise at varying noise levels and corruption rates, to provide a comprehensive view of the strengths and limitations of each approach. To simulate extreme outliers encountered in practice, the observational data (i.e., initial and boundary conditions) are corrupted using an impulse noise model. Specifically, a certain portion of the training data, determined by the Corruption Rate (CR), is randomly selected, and their target values are perturbed with an amplitude controlled by the Impulse Factor (IF). It is important to emphasize that this noise is applied only to the observational data. The collocation points, which are merely space–time coordinates
To ensure a fair comparison, all four models use the same network architectures and hyperparameters. Specifically, for the ODE problem, the network has 2 hidden layers with 64 neurons per layer. For the wave equation, the network is expanded to 3 hidden layers with 128 neurons per layer to handle the higher complexity. For the Burgers equation, a neural network with three hidden layers and 64 neurons per layer is used. The remaining hyper-parameters are summarized in Table 1. To evaluate deterministic prediction accuracy, we use standard metrics including Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE). In addition, the quality of UQ is evaluated based on the Prediction Interval Coverage Probability (PICP) and the Mean Prediction Interval Width (MPIW), which are calculated based on

4.1 Ordinary Differential Equation
This subsection simulates the process of drug concentration decay
As shown in Table 2, under noise-free conditions, both Bagging-PINN and Init-Ensemble PINNs outperform the pure ANN and the standard PINN in terms of deterministic accuracy (as reflected by MAE and RMSE). Moreover, Bagging-PINN provides highly reliable uncertainty estimates, as indicated by an MPIW of 0.4363, which is significantly narrower and more informative than that produced by ANN (MPIW = 1.2056) and PINN (MPIW = 1.7902). Compared to Init-Ensemble PINNs, Bagging-PINN is slightly worse in deterministic error but slightly better in uncertainty estimation.

Under impulse noise, the ANN model results in the large error with a MAE of 0.4462. Although the standard PINN still maintains physical constraints and keeps the error at a moderate level (MAE = 0.0625), its UQ performance is unreliable due to the large MPIW of 1.725. In contrast, Bagging-PINN not only achieves the best MAE (0.0609), but also maintains a sharp prediction interval (MPIW = 0.427) while ensuring full coverage of the data (see Fig. 2). The representative method, Init-Ensemble PINNs, also shows relatively good performance; however, its MAE (0.0627) remains higher than that of Bagging-PINN.

Figure 2: Predictions vs. the exact solution of Bagging-PINN on noise data.
We consider the one-dimensional wave equation
Under the noise-free condition, the results in Table 3 show that Init-Ensemble PINNs and Bagging-PINN achieve competitive performance, with MPIW values of 1.2106 and 1.2241, respectively. These methods also outperform the standard PINN. The advantage of Bagging-PINN becomes most evident when dealing with impulse noise

The stability of the method is further illustrated in Fig. 3. The spatio-temporal heatmap of the prediction interval width shows that Bagging-PINN maintains a narrow and uniform uncertainty band across the entire domain, even in regions near extreme impulse noise points (marked with crosses). Moreover, the 1D spatial cross-section at

Figure 3: Heatmap of prediction interval width and 1D spatial cross-section of Bagging-PINN.
4.3 Nonlinear Burgers’ Equation
To demonstrate the generalization capability of the proposed method for complex nonlinear problems, we apply the model to the one-dimensional Burgers equation. Specifically, the problem is defined as follows.
where the kinematic viscosity coefficient is set to
Despite the nonlinear nature of the equation, Bagging-PINN continues to outperform the other methods. In the noise-free case, Table 4 shows that Bagging-PINN achieves the lowest MAE (

In order to set an appropriate value of M, we conduct a sensitivity analysis of M with respect to model performance under noisy data across three considered equations.
The sensitivity analysis in Table 5 shows that the computational time increases approximately linearly with the ensemble size M. As M increases, the deterministic errors (MAE, RMSE) tend to decrease and then stabilize. In terms of UQ, when M is small (e.g.,

However, when M becomes too large (e.g.,

Figure 4: The variation of PICP and MPIW with respect to M on a logarithmic scale.
Based on the above analysis, to balance training time, predictive accuracy, and the quality of prediction intervals, the optimal ensemble sizes (highlighted in the figure) are selected as
In summary, by employing UQ metrics (PICP and MPIW), together with thorough comparisons against a traditional ensemble baseline (Init-Ensemble PINNs) on both linear and nonlinear equations, we have experimentally demonstrated that Bagging-PINN provides more stability and UQ capability, particularly in scenarios involving anomalous noisy data.
5 Conclusions and Future Works
This study introduced and evaluated Bagging-PINN, a double-stochastic ensemble framework for solving ODEs and PDEs. The numerical examples on both ODEs and nonlinear PDEs under extreme impulse noise demonstrate that Bagging-PINN effectively mitigates intrinsic instability, provides highly accurate predictions without overfitting to data outliers. Furthermore, the results on PICP and MPIW confirm that our optimized ensemble framework delivers sharp, well-calibrated prediction intervals, establishing it as a highly reliable and computationally feasible tool for real-world engineering problems.
However, the scalability of the method for higher-dimensional problems or more complex systems of equations remains a challenge because the computational cost of Bagging-PINN increases linearly with the ensemble size M. Nevertheless, since the base models are trained completely independently on different bootstrap datasets, the approach exhibits an embarrassingly parallel property. Therefore, this limitation also suggests a future research direction that leverages parallel computing to implement Bagging-PINN for three-dimensional (3D) problems and complex systems such as the Navier–Stokes equations.
Acknowledgement: Not applicable.
Funding Statement: The authors received no specific funding for this study.
Author Contributions: Conceptualization, Thao Nguyen-Trang; methodology, Thao Nguyen-Trang and Hiep Ha-Hoang; software, Hiep Ha-Hoang; validation, Thao Nguyen-Trang; formal analysis, Thao Nguyen-Trang; investigation, Thao Nguyen-Trang; resources, Hiep Ha-Hoang; data curation, Thao Nguyen-Trang and Hiep Ha-Hoang; writing—original draft preparation, Hiep Ha-Hoang; writing—review and editing, Thao Nguyen-Trang and Hiep Ha-Hoang; visualization, Thao Nguyen-Trang and Hiep Ha; supervision, Thao Nguyen-Trang. All authors reviewed and approved the final version of the manuscript.
Availability of Data and Materials: The source codes and data of the paper are available on GitHub under MIT licence at https://github.com/hhiep1504/SE-PINNs.
Ethics Approval: Not applicable.
Conflicts of Interest: The authors declare no conflicts of interest.
References
1. Raissi M, Perdikaris P, Karniadakis GE. Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J Comput Phys. 2019;378:686–707. doi:10.1016/j.jcp.2018.10.045. [Google Scholar] [CrossRef]
2. Anderson JD. Computational fluid dynamics: the basics with applications. Vol. 206. Berlin/Heidelberg, Germany: Springer; 1995. [Google Scholar]
3. Liu WK, Li S, Park HS. Eighty years of the finite element method: birth, evolution, and future. Arch Comput Methods Eng. 2022;29(6):4431–53. doi:10.1007/s11831-022-09740-9. [Google Scholar] [CrossRef]
4. Ren Z, Zhou S, Liu D, Liu Q. Physics-informed neural networks: a review of methodological evolution, theoretical foundations, and interdisciplinary frontiers toward next-generation scientific computing. Appl Sci. 2025;15(14):8092. doi:10.3390/app15148092. [Google Scholar] [CrossRef]
5. Strang G, Fix GJ. An analysis of the finite element method. Vol. 212. Englewood Cliffs, NJ, USA: Prentice-Hall; 1973. [Google Scholar]
6. Karniadakis GE, Kevrekidis IG, Lu L, Perdikaris P, Wang S, Yang L. Physics-informed machine learning. Nat Rev Phys. 2021;3(6):422–40. doi:10.1038/s42254-021-00314-5. [Google Scholar] [CrossRef]
7. Quarteroni A, Valli A. Domain decomposition methods for partial differential equations. Oxford, UK: Oxford University Press; 1999. [Google Scholar]
8. Hughes TJ, Cottrell JA, Bazilevs Y. Isogeometric analysis: CAD, finite elements, NURBS, exact geometry and mesh refinement. Comput Methods Appl Mech Eng. 2005;194(39–41):4135–95. doi:10.1016/j.cma.2004.10.008. [Google Scholar] [CrossRef]
9. Aliakbari M, Soltany Sadrabadi M, Vadasz P, Arzani A. Ensemble physics informed neural networks: a framework to improve inverse transport modeling in heterogeneous domains. Phys Fluids. 2023;35(5):053616. doi:10.1063/5.0150016. [Google Scholar] [CrossRef]
10. Cuomo S, Di Cola VS, Giampaolo F, Rozza G, Raissi M, Piccialli F. Scientific machine learning through physics-informed neural networks: where we are and what’s next. J Sci Comput. 2022;92(3):88. doi:10.1007/s10915-022-01939-z. [Google Scholar] [CrossRef]
11. Liu Z, Archibong G, Waheed UB, Wang S, Song C. Physics-informed Fourier-DeepONet for a generalized eikonal solution. Comput Geosci. 2026;206(1):106026. doi:10.1016/j.cageo.2025.106026. [Google Scholar] [CrossRef]
12. Lu L, Pestourie R, Yao W, Wang Z, Verdugo F, Johnson SG. Physics-informed neural networks with hard constraints for inverse design. SIAM J Sci Comput. 2021;43(6):B1105–32. doi:10.1137/21M1397908. [Google Scholar] [CrossRef]
13. Mishra S, Molinaro R. Estimates on the generalization error of physics-informed neural networks for approximating a class of inverse problems for PDEs. IMA J Num Anal. 2022;42(2):981–1022. doi:10.1093/imanum/drab093. [Google Scholar] [CrossRef]
14. Luo K, Zhao J, Wang Y, Li J, Wen J, Liang J, et al. Physics-informed neural networks for PDE problems: a comprehensive review. Artif Intell Rev. 2025;58(10):1–43. doi:10.1007/s10462-025-11322-7. [Google Scholar] [CrossRef]
15. Haitsiukevich K, Ilin A. Improved training of physics-informed neural networks with model ensembles. In: 2023 International Joint Conference on Neural Networks (IJCNN). Piscataway, NJ, USA: IEEE; 2023. p. 1–8. doi:10.1109/IJCNN54540.2023.10191822. [Google Scholar] [CrossRef]
16. Srikitrungruang T, Aghaee Dabaghan Fard S, Lemon M, Lee J, Zhou Y. Robust physics-informed neural network approach for estimating heterogeneous elastic properties from noisy displacement data. Adv Sci. 2025;12(48):e08445. doi:10.1002/advs.202508445. [Google Scholar] [PubMed] [CrossRef]
17. Cai S, Mao Z, Wang Z, Yin M, Karniadakis GE. Physics-informed neural networks (PINNs) for fluid mechanics: a review. Acta Mech Sin. 2021;37(12):1727–38. doi:10.1007/s10409-021-01148-1. [Google Scholar] [CrossRef]
18. Le HG, Dao TP, Dang MP, Nguyen-Trang T. Behavior modeling for a new flexure-based mechanism by Hunger Game Search and physics-guided artificial neural network. Sci Rep. 2025;15(1):2015. doi:10.1038/s41598-025-85724-6. [Google Scholar] [PubMed] [CrossRef]
19. Wang Y, Zhong L. NAS-PINN: neural architecture search-guided physics-informed neural network for solving PDEs. J Comput Phys. 2024;496:112603. doi:10.1016/j.jcp.2023.112603. [Google Scholar] [CrossRef]
20. Breiman L. Bagging predictors. Mach Learn. 1996;24(2):123–40. doi:10.1007/BF00058655. [Google Scholar] [CrossRef]
21. Krishnapriyan A, Gholami A, Zhe S, Kirby R, Mahoney MW. Characterizing possible failure modes in physics-informed neural networks. Adv Neural Inform Process Syst. 2021;34:26548–60. [Google Scholar]
22. Wang S, Teng Y, Perdikaris P. Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM J Sci Comput. 2021;43(5):A3055–81. doi:10.1137/20m1318043. [Google Scholar] [CrossRef]
23. Gal Y, Ghahramani Z. Dropout as a bayesian approximation: representing model uncertainty in deep learning. In: International Conference on Machine Learning. London, UK: PMLR; 2016. p. 1050–9. [Google Scholar]
Cite This Article
Copyright © 2026 The Author(s). Published by Tech Science Press.This work is licensed under a Creative Commons Attribution 4.0 International License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Submit a Paper
Propose a Special lssue
View Full Text
Download PDF
Downloads
Citation Tools