Open Access

ARTICLE

Study on Flow and Heat Characteristics of Compressible Gas in a Supersonic Nozzle Based on PINNs with Sparse Data

Yida Shen1, Bin Dong2, Quan Ma1, Chao Dang1,*, Congjian Li2,*, Guojian Ren3, Shaozhan Wang1,2, Xiaozhe Sun1, Yong Ding4

1 Beijing Key Laboratory of Flow and Heat Transfer of Phase Changing in Micro and Small Scale, School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing, 100044, China
2 High Speed Aerodynamics Institution, China Aerodynamics Research and Development Center, Mianyang, 621000, China
3 School of Mathematics and Statistics, Beijing Jiaotong University, Beijing, 100044, China
4 School of Aerospace Engineering, Guizhou Institute of Technology, Guiyang, 550025, China

* Corresponding Authors: Chao Dang; Congjian Li

(This article belongs to the Special Issue: Advances in Microscale Fluid Flow, Heat Transfer, and Phase Change)

Frontiers in Heat and Mass Transfer 2026, 24(2), 7 https://doi.org/10.32604/fhmt.2025.077096

Abstract

This article explores the application of Physics-Informed Neural Networks (PINNs) in solving supersonic flow problems within a Laval nozzle, proposing innovative methods by integrating physical constraints and neural network optimization techniques. The main innovations of this study include the construction of a novel neural network architecture with shortcut connections to enhance the prediction of overall flow trends and local fluctuations, thereby improving convergence speed, reducing computational costs, and increasing the accuracy of flow field reconstruction. Additionally, this study designs a PINNs framework that incorporates specific physical knowledge (SPK) to improve model stability, generalization, and accuracy, even with sparse training data. A dynamic loss weighting strategy is employed to optimize training convergence, and velocity components are reformulated as magnitude and angle to simplify boundary conditions and reduce the dimensionality of the solution space. The results demonstrate that the proposed methods achieve satisfactory accuracy and robustness in solving supersonic problems, highlighting their potential application value.

Keywords

Physics-informed neural networks; sparse data; supersonic flow; specific physical knowledge; shortcut connections

1  Introduction

In recent years, the influence of machine learning (ML) and artificial intelligence (AI) on the field of fluid mechanics has become increasingly prominent. By constructing appropriate AI models, it has become possible to achieve improvements in measurement techniques and data augmentation, flow field control and prediction [1], simulation, and digital twins [2]. Additionally, such models have facilitated the optimization of aircraft design and the enhancement of aerodynamic performance [3].

Physics-Informed Neural Networks (PINNs), as a novel type of AI model, embed physical equations and boundary condition constraints. Consequently, compared to traditional neural networks, they are better at learning data features governed by known physical laws [4,5]. Beyond addressing the aforementioned research problems, PINNs are capable of solving both forward and inverse problems of various nonlinear partial differential equations [6] and achieving high-accuracy flow field reconstruction at different resolutions [7].

In research related to fluid control, Rabault et al. [8] applied artificial neural networks trained using deep reinforcement learning to the field of active flow control for regulating vortex shedding behavior. Fan et al. [9] utilized reinforcement learning to autonomously discover control strategies for drag reduction, particularly in the context of cylinder drag control in turbulence. These studies laid a solid foundation for subsequent research on complex flow field control.

In addressing multiphase flow and moving interface problems, PINNs have demonstrated excellent performance. Xu et al. [10] designed a neural network architecture suitable for solving interface velocity in multi-material Riemann problems and applied it to the simulation of compressible two-phase flows. Sarma et al. [11] proposed a novel interface PINN to address the limitations of traditional PINNs in effectively capturing physical characteristics at interfaces. Cheng et al. [12] introduced a physics-aware recurrent neural network to tackle flow problems in multiphase systems involving shock waves and chemical reactions.

In addressing the reconstruction and simulation of flow and thermal boundary layers, Bararnia and Esmaeilpour [13] extensively discussed the application of PINNs in solving nonlinear partial differential equations with unbounded boundary conditions. They achieved satisfactory results in solving convective heat transfer problems within boundary layers. Hao et al. [14] explored the application of deep operator networks in predicting the nonlinear evolution of instability waves in hypersonic boundary layers, demonstrating their effectiveness in providing rapid and accurate physics-informed predictions. Additionally, they investigated the capability of these networks to enhance limited observational data under extreme flow conditions.

In other fluid-related fields, Li et al. [15] systematically summarized the engineering processes and methodologies of ML in optimizing aircraft shape design. This indicates that the method of utilizing machine learning for predicting the external flow field of aircraft has become relatively mature. However, research on internal flows, especially those involving compressible behavior, remains relatively scarce. Zhu et al. [16] developed a PINN to enhance experimental data for the identification of three-dimensional vortices and turbulence prediction. Zheng et al. [17] proposed a parallel hard-constrained neural network incorporating first-order time derivatives to address non-Fourier problems in extreme transient heat conduction. Hu et al. [18] utilized PINNs for the numerical solution of primitive equations in large-scale oceanic and atmospheric dynamics. Wang et al. [19] integrated the soot radiation integral equation into a neural network architecture to simultaneously predict soot temperature and volume fraction fields in laminar flames. Yuan et al. [20] proposed a physics-informed convolutional neural network framework to solve partial differential equations for unlabeled data in spatiotemporal domains, addressing multiscale features and high-dimensional problems.

Moreover, numerous researchers constructed various neural network architectures and algorithms to enhance the performance and computational efficiency of PINNs [21]. Wang and Zhong [22] proposed a method for automatically searching for the optimal neural network architecture to solve given PDEs. To mitigate the high training cost and over-parameterization issues, some scholars introduced an adaptive learning neural network framework that continuously updated the training process to improve performance [23,24]. Nevertheless, this undoubtedly increases the complexity of the network and the cost of training. Jagtap et al. [25] proposed a time–space domain decomposition method, deploying deep neural networks in each subdomain. This approach enhanced the expressive power and significantly improved the computational efficiency of the model through parallel implementation.

One of the most critical phenomena in fluid dynamics is the supersonic phenomenon. Solving supersonic problems serves as the foundation for the design and simulation of high-speed vehicles, and the core of analyzing supersonic fluid motion is solving the governing Euler equations [26]. Some researchers proposed employing adaptive artificial viscosity to facilitate convergence [27]. Guo et al. [28] combined PINNs with multiscale perception and residual learning to tackle the complex problem of reconstructing high-Mach-number inlet flow fields. Cao et al. [29] introduced a PINNs-based full-state-space solution model aimed at addressing inviscid airfoil flow problems under various flow conditions and geometries in the subsonic regime. Moreover, they proposed a grid-transformation-based solution method to handle complex boundary conditions effectively. Mao et al. [30] proposed a novel deep learning neural network architecture to predict the coupled post-shock flow and finite-rate chemical reactions in hypersonic flows. Ren et al. [31] significantly improved the predictive capability of PINNs in high-Reynolds-number compressible steady aerodynamic flows by introducing a sampling distance function, hard boundary conditions, and gradient weight factors into the loss function. Even with improvements in network architecture and algorithms, these methods still rely on training with large amounts of supervised data and collocation points.

From the above summary, the current application of PINNs in solving supersonic problems faces the following challenges:

1.    Due to the complexity of supersonic problems, even well-established simulation methods encountered significant difficulties in solving them. Consequently, research on using PINNs for these problems remains limited.

2.    Supersonic problems involve large variations in variables, often leading to discontinuities, which traditional PINNs struggle to capture. Thus, extensive data and large computational grids are required for accurate solutions.

3.    The existing solution algorithms and network structures were relatively simplistic, predominantly relying on traditional MLP or convolutional architectures. These designs did not fully consider the specific characteristics of supersonic problems, leading to low computational efficiency and high costs.

To address these issues, this study aimed to propose a novel neural network framework tailored to fluid dynamics problems. The proposed model sought to reduce computational costs while achieving satisfactory results in transonic and supersonic problems, even under sparse data conditions or in the complete absence of data.

This paper began with the problem setup, sequentially introduced the meaning of specific physical knowledge, constructed a neural network incorporating a shortcut structure, and explained how to integrate this knowledge into the solution logic. Finally, it compared the new method with the traditional approach.

2  Problem Setting

The main issue discussed in this paper is whether it is possible to construct a simple neural network that can still invert the complete flow field with only a few supervised data points. This method is meaningful in practical engineering, especially when the number of sensors on the equipment is small and the changes in physical quantities are large.

Considering the two-dimensional steady-state Euler equations for inviscid fluids, their dimensionless form was calculated as follows:

ρ ∂u/∂x + ρ ∂v/∂y + u ∂ρ/∂x + v ∂ρ/∂y = 0    (1)

ρu ∂u/∂x + ρv ∂u/∂y + Eu ∂p/∂x = 0    (2)

ρu ∂v/∂x + ρv ∂v/∂y + Eu ∂p/∂y = 0    (3)

ρ(u ∂T/∂x + v ∂T/∂y) + (γ − 1) p (∂u/∂x + ∂v/∂y) = 0    (4)

where Eu = p*/(ρ_max U_max²) is the Euler number.

Assuming the fluid was an ideal gas, it satisfied the following ideal gas law equation:

p = ρT    (5)

Herein, the stagnation parameters of the fluid were used to nondimensionalize the variables. The advantage of this approach was that it confined the output values of the variables between 0 and 1, facilitating computation and output by the neural networks. The specific nondimensionalization of each variable is shown in Table 1.

[Table 1]

The above system of equations is fairly general and applies to compressible gases with Mach numbers ranging from 0 to 3.
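As an illustration of how the residuals of Eqs. (1)–(4) can be evaluated, the sketch below computes them on a 2-D grid with central finite differences. This is only a minimal numerical check with names of our choosing, not the paper's implementation (which is in MATLAB and uses automatic differentiation through the network): a uniform flow satisfies all four equations, so every residual vanishes.

```python
import numpy as np

def euler_residuals(rho, u, v, p, T, dx, dy, Eu=1.0, gamma=1.4):
    """Dimensionless steady Euler residuals, Eqs. (1)-(4), on a 2-D grid.

    Axis 0 is x (spacing dx), axis 1 is y (spacing dy); np.gradient
    returns (df/dx, df/dy) via central differences.
    """
    d = lambda f: np.gradient(f, dx, dy)
    rho_x, rho_y = d(rho)
    u_x, u_y = d(u)
    v_x, v_y = d(v)
    p_x, p_y = d(p)
    T_x, T_y = d(T)
    r1 = rho*u_x + rho*v_y + u*rho_x + v*rho_y           # continuity, Eq. (1)
    r2 = rho*u*u_x + rho*v*u_y + Eu*p_x                  # x-momentum, Eq. (2)
    r3 = rho*u*v_x + rho*v*v_y + Eu*p_y                  # y-momentum, Eq. (3)
    r4 = rho*(u*T_x + v*T_y) + (gamma-1)*p*(u_x + v_y)   # energy, Eq. (4)
    return r1, r2, r3, r4
```

In a PINN these residuals would be formed with automatic differentiation at the collocation points; the finite-difference version above only verifies the algebra of the equations.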

The Laval nozzle is a device capable of accelerating incoming airflow from subsonic to transonic and eventually to supersonic speeds. This nozzle is widely used in applications such as rocket engines, jet engines, and supersonic wind tunnels. Consequently, the Laval nozzle serves as an excellent example for solving the compressible Euler equations. Herein, we designed a flow model within the nozzle, as shown in Fig. 1, where a no-penetration boundary condition was applied along the nozzle walls.

U · n = 0    (6)


Figure 1: Geometric structure diagram

3  Methodology

3.1 Shortcut Connections

When predicting variables in a flow field, researchers typically focused on the overall trend of the variables, determining whether they increased or decreased, and the local detailed characteristics by analyzing the deviation of the actual variable values from the general trend. The neural network proposed in this study incorporated a shortcut connection structure, which allowed it to extract the variation trend of the solution function while capturing its detailed fluctuations. To achieve this objective, an additional shortcut connection layer was introduced into the conventional fully connected neural network. The specific network structure is shown in Fig. 2.


Figure 2: Neural networks incorporating shortcut connections. (a) Trend of change; (b) Trends of correct and incorrect; (c) Local area change; (d) Local variations of correct and incorrect

In Fig. 2, the network 1 part conducted preliminary fitting on the input data, primarily predicting the trends of the solution function. The results obtained from network 1 were then split into branches. One part underwent further fitting through network 2, primarily predicting the detailed changing characteristics. Finally, the results of network 1 and network 2 were added together and output through an identity mapping layer to obtain the final prediction results of the network.

Mathematically, let the input vector be denoted by x, which is mapped by Network 1 to yield y1:

y1 = NN1(x)

Then, along one branch, y1 is further transformed by Network 2 into y2:

y2 = NN2(y1)

whereas along the other branch y1 is preserved unchanged. The final output y is the sum of y1 and y2:

y = y1 + y2 = NN1(x) + NN2(NN1(x))

Traditional neural networks typically employed a single type of activation function throughout the entire network. However, to better align with the functional characteristics of submodules within the network, the Gelu activation function was utilized in network 1, whereas the Tanh activation function was implemented in network 2. The mathematical expressions for both functions are calculated as follows:

Gelu(x) = (x/2)(1 + erf(x/√2))    (7)

Tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x))    (8)

The Gelu activation function exhibited an approximately linear behavior when the input was greater than zero while approaching zero for negative inputs. Its derivative was smooth and differentiable everywhere, which contributed to mitigating issues such as gradient vanishing. Consequently, the aggregation of multiple Gelu functions facilitated effective prediction of solution trends. Conversely, the Tanh activation function, a nonlinear activation function with a range of [−1, 1], inherently possessed nonlinear characteristics that made it a robust tool for capturing local details.

The advantages of shortcut connections lie in enhancing the interpretability of the network structure, making the functions of each part of the network clearer. Additionally, they help improve the convergence and accuracy of the model. Moreover, they reduce the probability of the vanishing gradient caused by an overly deep network.
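A minimal NumPy sketch of the shortcut forward pass described above, assuming small fully connected subnetworks (layer sizes, weight layout, and function names are illustrative, not the paper's MATLAB implementation):

```python
import numpy as np
from math import erf

_erf = np.vectorize(erf)

def gelu(x):
    """GELU activation, Eq. (7): (x/2) * (1 + erf(x / sqrt(2)))."""
    return 0.5 * x * (1.0 + _erf(x / np.sqrt(2.0)))

def mlp(x, layers, act):
    """Fully connected stack; activation after every layer except the last.
    `layers` is a list of (W, b) pairs."""
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:
            x = act(x)
    return x

def shortcut_forward(x, net1, net2):
    """Shortcut architecture: y = NN1(x) + NN2(NN1(x)).
    NN1 (Gelu) captures the overall trend; NN2 (Tanh) the local fluctuations."""
    y1 = mlp(x, net1, gelu)       # trend prediction
    y2 = mlp(y1, net2, np.tanh)   # correction around the trend
    return y1 + y2
```

If NN2 is identically zero, the output reduces to the trend NN1(x) alone, which makes the division of labor between the two subnetworks explicit.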

3.2 Specific Physical Knowledge Incorporated

When solving systems of partial differential equations, assumptions are often introduced to simplify the governing equations. These assumptions apply only to specific physical problems and are therefore more targeted than the original governing equations. The assumptions derived from a specific physical problem, together with the new equations obtained through them, are collectively referred to as specific physical knowledge.

Taking the flow in a Laval nozzle as an example, the following assumptions were usually made:

1.    The flow was inviscid and potential.

2.    Due to the relatively high flow velocity, the gas was considered to be adiabatic.

Integrating the Euler momentum equation along a streamline yields the Bernoulli equation for a gas.

c_p T + (1/2)(u² + v²) = c_p T_max    (9)

The dimensionless form of this equation was calculated as follows:

T + (Ec/2)(u² + v²) = 1    (10)

where Ec = U_max²/(c_p T_max) represents the Eckert number. Since the flow was potential in Assumption 1, this equation was applicable to the entire flow field.

Moreover, based on the adiabatic characteristic in Assumption 2 and combined with the first law of thermodynamics, the differential form of the adiabatic process equation satisfied by the gas could be obtained as follows:

d(p/ρ^γ) = 0    (11)

Taking the stagnation parameters as the integration reference points, we could obtain the following:

p/ρ^γ = p_max/ρ_max^γ    (12)

The dimensionless form of this equation was calculated as follows:

p = ρ^γ    (13)

To be precise, since it was previously assumed that the gas was an ideal gas, the ideal gas equation of state belonged to the specific physical knowledge. In this way, all the specific physical knowledge for solving problems regarding the Laval nozzle was obtained.
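The three pieces of specific physical knowledge are mutually consistent, which can be checked numerically: Eqs. (13) and (5) give T = ρ^(γ−1), and the Bernoulli relation (10) then fixes the velocity magnitude. A hypothetical helper (names are ours) illustrating how the full dimensionless state follows from the density alone:

```python
import numpy as np

GAMMA = 1.4  # ratio of specific heats for air

def spk_from_density(rho, Ec):
    """Recover p, T and |U| from density using the specific physical
    knowledge: isentropic relation (13), ideal gas law (5), Bernoulli (10)."""
    p = rho**GAMMA                       # Eq. (13): p = rho^gamma
    T = p / rho                          # Eq. (5):  T = p/rho = rho^(gamma-1)
    U = np.sqrt(2.0 * (1.0 - T) / Ec)    # Eq. (10): T + (Ec/2)|U|^2 = 1
    return p, T, U
```

At the stagnation state (ρ = 1 in dimensionless form) this returns p = T = 1 and |U| = 0, as expected.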

The traditional approach to solving problems with neural networks generally treated all unknown variables to be solved as output variables. These were then fed into the governing equations and boundary conditions to compute the errors, as shown in Fig. 3b. In contrast, the solving process of PINNs, which integrated specific physical knowledge, is shown in Fig. 3a, with its underlying logic shown in Fig. 3c. It was evident that the output variables of the neural network were not identical to all unknown variables. Instead, these variables were initially processed through a specific physical knowledge processor, which combined predefined physical knowledge to generate all unknown variables required for the solution.


Figure 3: (a) Solving process of PINNs integrated with specific physical knowledge. (b) Traditional solving approach. (c) Solving approach combined with SPK

Mathematically speaking, SPK reduces the dimension of the solution space by increasing the number of known equations. For this problem, introducing SPK can reduce the output of the neural network from 5 dimensions to 3 dimensions, greatly compressing the solution space.

The expression for the loss function was given as shown in Eq. (14) as follows:

Loss = β₁ MSE_data + β₂ MSE_PDE + β₃ MSE_BC + β₄ MSE_SPK    (14)

where β_i represents the weight coefficient of each term, and MSE_SPK denotes the error term arising from specific physical knowledge. For solving the Laval nozzle problem, its expression was given as follows:

MSE_SPK = (1/N) Σ_{points in field} [p − ρ^γ]²    (15)

Moreover, MSE_PDE represents the error term associated with the more general Euler governing equations, and its specific form is given as follows:

MSE_PDE = R₁ + R₂ + R₃ + R₄    (16)

R₁ = (1/N) Σ_{points in field} (ρ ∂u/∂x + ρ ∂v/∂y + u ∂ρ/∂x + v ∂ρ/∂y)²    (17)

R₂ = (1/N) Σ_{points in field} (ρu ∂u/∂x + ρv ∂u/∂y + Eu ∂p/∂x)²    (18)

R₃ = (1/N) Σ_{points in field} (ρu ∂v/∂x + ρv ∂v/∂y + Eu ∂p/∂y)²    (19)

R₄ = (1/N) Σ_{points in field} [ρ(u ∂T/∂x + v ∂T/∂y) + (γ − 1) p (∂u/∂x + ∂v/∂y)]²    (20)

MSE_BC represents the error term associated with boundary conditions. Its expression was given as follows:

MSE_BC = (1/N) Σ_{points on boundary} [φ(PINNs) − φ(Set)]²    (21)

where φ(PINNs) was the value at the boundary points output by the PINNs, and φ(Set) was the target value to be enforced.
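The composite loss of Eq. (14) can be sketched as follows; this is a NumPy illustration with fixed weights β_i and function names of our choosing, not the paper's training code:

```python
import numpy as np

def mse(res):
    """Mean squared residual, (1/N) * sum(r^2)."""
    r = np.asarray(res, dtype=float)
    return float(np.mean(r**2))

def total_loss(res_data, res_pde, res_bc, res_spk,
               betas=(1.0, 1.0, 1.0, 1.0)):
    """Composite loss of Eq. (14):
    beta1*MSE_data + beta2*MSE_PDE + beta3*MSE_BC + beta4*MSE_SPK."""
    terms = (mse(res_data), mse(res_pde), mse(res_bc), mse(res_spk))
    return sum(b * t for b, t in zip(betas, terms))
```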

In fact, the construction of specific physical knowledge was equivalent to adding additional constraints to the loss function, which included a strong constraint and a weak constraint. The strong constraint involved directly enforcing certain relationships on the variables within the specific physical knowledge processor, whereas the weak constraint applied to some specific physical knowledge equations, which were directly incorporated into the loss function as terms to be optimized. The advantage of this approach was that, during training, it maintained the generality of targeting the Euler equations while enabling targeted quantitative solutions based on the assumptions, thereby improving the stability and accuracy of the training process. Fig. 4 shows a detailed illustration of the neural network framework integrating specific physical knowledge.


Figure 4: PINNs framework with specific physical knowledge incorporated

3.3 Dynamic Weights of Losses

The convergence of neural network training was closely related to the choice of the loss function and the form of the equations being trained. Essentially, the training process of PINNs represented a multi-objective optimization problem. The relative importance of each objective was determined by the associated weight coefficients. However, standard PINNs often encountered pathological gradient issues during training, which led to difficulties in convergence and limited accuracy. To dynamically balance the loss terms during training and alleviate gradient-related problems in backpropagation, a dynamic weighting scheme in the following form was designed:

β_i = λ_i / (σ_i + 1)    (22)

In this formulation, λi represents a constant, and σi denotes the variance of the errors for all data points in each term. Placing the variance in the denominator is based on theoretical considerations. When the error distribution of a certain variable is extremely uneven, it will cause the value of the loss function of this variable to be too large, which in turn leads to difficulties in convergence. Therefore, placing the variance in the denominator can effectively suppress this phenomenon. Since the variance term could be zero or extremely small when dealing with sparse data, adding one to the denominator was necessary to ensure the validity of the weight coefficients.
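A minimal sketch of this weighting rule (illustrative only; the constant λ_i and the residual sets come from the training loop):

```python
import numpy as np

def dynamic_weight(residuals, lam=1.0):
    """Dynamic weight of Eq. (22): beta = lambda / (sigma + 1), where sigma
    is the variance of the term's pointwise errors. The +1 in the
    denominator keeps beta well defined when the variance is zero or very
    small, as happens with sparse data."""
    sigma = float(np.var(np.asarray(residuals, dtype=float)))
    return lam / (sigma + 1.0)
```

A term whose errors are evenly distributed keeps its full weight λ, while a term with a very uneven error distribution is damped, which is exactly the suppression effect described above.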

3.4 Variable Substitution

To enhance the convergence and stability of the training process, the output variables were reformulated from the two components of velocity (u,v) into the magnitude of velocity and the angle between the resultant velocity and its horizontal component (|U|,α).

As shown in Fig. 3, the neural network did not directly output the two components of the velocity. Instead, this network produced the magnitude of the velocity |U| and the angle α between the velocity vector and its horizontal component. The advantage of this approach was a reduction in the number of variables involved in the Bernoulli equation, as illustrated by the following expression:

T = 1 − (Ec/2)(u² + v²)  ⟹  T = 1 − (Ec/2)|U|²

This method allowed the solution search space to be reduced from the three-dimensional domain T=f(u,v) to the two-dimensional plane T=g(|U|), which significantly enhanced optimization efficiency.

Furthermore, introducing the angle between the resultant velocity and its horizontal component enabled the boundary conditions to be simplified into a more straightforward form. Once the gradient value k at the boundary was determined, the no-penetration boundary condition could be reformulated as shown in Eq. (23).

v/u = k  ⟹  α = arctan(k)    (23)

This reduced the dimensionality of the search space, further improving the optimization efficiency.
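The substitution and the wall condition of Eq. (23) can be sketched as follows (function names are illustrative):

```python
import numpy as np

def to_components(U_mag, alpha):
    """Map the network outputs (|U|, alpha) back to the components (u, v)."""
    return U_mag * np.cos(alpha), U_mag * np.sin(alpha)

def wall_angle(k):
    """No-penetration condition, Eq. (23): on a wall of slope k the flow
    must be tangent to the wall, i.e. v/u = k, hence alpha = arctan(k)."""
    return np.arctan(k)
```

Enforcing α directly on the wall is a single scalar condition per boundary point, instead of a coupled condition on u and v.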

4  Results and Discussion

4.1 Accuracy Verification

In terms of the solution configuration, all the training in this paper was completed using MATLAB programming, with the L-BFGS optimizer. A total of ten thousand collocation points were used for training, while the supervised data points included only the points at the sensor locations and the pressure points at the outlet. The neural network used consists of 8 layers, with Subnetwork 1 having 5 layers and Subnetwork 2 having 3 layers, each layer using 128 neurons.

Before delving into a detailed discussion of the different results, the accuracy of the outcomes obtained using PINNs was precisely validated. The results of the computational fluid dynamics (CFD) model using the finite element method were compared with those calculated using the PINNs.

As shown in Table 2, the average values of variables at the throat and outlet cross-sectional positions were selected for comparison. The maximum relative error for all variables did not exceed 4%, suggesting that the predicted results were satisfactory from an engineering perspective.

[Table 2]

Further comparisons were made regarding the distribution of variables along the axis, as shown in Fig. 5. The theoretical model presented was based on calculations from a quasi-one-dimensional Laval nozzle model with adiabatic ideal fluid. The distributions of the variables along the axis aligned well with the theoretical values, particularly for temperature and velocity. Although the distributions of density and pressure exhibited deviations at certain locations, the maximum error along the axis remained within approximately 5%. This demonstrates that the accuracy of the training results is sufficiently satisfactory.
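The theoretical curves referred to here follow from classical quasi-one-dimensional isentropic nozzle theory (standard gas dynamics, not derived in the paper); a sketch of the area–Mach relation and the stagnation ratios, under the assumption γ = 1.4:

```python
import numpy as np

GAMMA = 1.4

def area_ratio(M):
    """Isentropic area-Mach relation A/A* for a quasi-1-D nozzle:
    A/A* = (1/M) * [(2/(g+1)) * (1 + (g-1)/2 * M^2)]^((g+1)/(2(g-1)))."""
    g = GAMMA
    return (1.0 / M) * ((2.0 / (g + 1)) *
                        (1.0 + 0.5 * (g - 1) * M**2))**((g + 1) / (2 * (g - 1)))

def stagnation_ratios(M):
    """Isentropic ratios (T/T0, p/p0, rho/rho0) at Mach number M."""
    g = GAMMA
    T = 1.0 / (1.0 + 0.5 * (g - 1) * M**2)
    return T, T**(g / (g - 1)), T**(1.0 / (g - 1))
```

The throat (M = 1) gives A/A* = 1 and p/p0 ≈ 0.528, the well-known critical pressure ratio, which is the kind of reference value the axial curves in Fig. 5 are compared against.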


Figure 5: Distribution of different variables along the axis: (a) Density, (b) Pressure, (c) Temperature, and (d) Velocity

The simulation results were ultimately compared with the contour plots obtained from the PINN calculations, as shown in Figs. 6 and 7. Despite slight differences in the color-bar distributions, both results align remarkably well in terms of the overall trend and the values at key locations.


Figure 6: Results of simulation: (top left) temperature; (top right) pressure; (bottom left) velocity magnitude; (bottom right) density


Figure 7: Results of PINNs: (top left) temperature; (top right) pressure; (bottom left) velocity magnitude; (bottom right) density

4.2 Impact of Different Supervised Data Points on the Results

In practical engineering applications, the installation of sensors in nozzles to acquire flow information is indispensable. Moreover, the data collected by these sensors can serve as the foundation for training PINNs. However, an excessive number of sensors may disrupt the internal flow field of the nozzle and increase construction costs. Consequently, the quantity and type of sensors to be installed are critical concerns in engineering. Based on the aforementioned context, it was assumed that either temperature or pressure sensors were installed in the nozzle, with two cross-sectional positions available for installation, as shown in Fig. 8.


Figure 8: Schematic diagram of the sensor installation positions

Considering the use of different sensor combinations, four operating conditions were designed, as shown in Table 3. Activating these sensors allowed us to obtain ground truth values from these locations, which we used as the data-driven component for training the PINNs.

[Table 3]

To eliminate the influence of different initial parameters of the neural network on the results, a pretraining process of 1000 steps was initially conducted. The weights of the neural network after pretraining were then used as the initial weights for training the four operating conditions mentioned above. The results of the pretraining process are shown in Fig. 9.


Figure 9: Comparison between pretrained results and theoretical values. (a) Density; (b) Pressure; (c) Temperature; (d) Speed

Due to the extreme sparsity of input data, only 500 steps were used for training under each operating condition to avoid overfitting, resulting in a total of 1500 iterations per condition. Initially, the directly observed pressure and temperature variables were analyzed. The post-training axial distributions and error distributions were plotted, as shown in Figs. 10 and 11.


Figure 10: Temperature distribution curves under different operating conditions. (a) Temperature curve; (b) Error distributions under different working conditions


Figure 11: Pressure distribution curves under different operating conditions. (a) Pressure curve; (b) Error distributions under different working conditions

In general, the prediction accuracy for temperature was higher than that for pressure. When the corresponding sensors were activated, the predicted values at the respective locations aligned perfectly with the ground truth, demonstrating the effectiveness of the input data. Additionally, the auxiliary training with sensor data reduced the errors for both variables. Furthermore, it was observed that increasing the number of pressure sensors played a crucial role in improving prediction accuracy. When two pressure sensors were installed, only a few points exhibited a maximum relative error of 7%. In contrast, even with two temperature sensors installed, the prediction accuracy remained inferior to that achieved with only one pressure sensor.

Subsequently, the influence of sensor configuration on density and velocity was discussed, as shown in Figs. 12 and 13.


Figure 12: Velocity distribution curves under different operating conditions. (a) Speed curve; (b) Error distributions under different working conditions


Figure 13: Density distribution curves under different operating conditions. (a) Density curve; (b) Error distributions under different working conditions

4.3 Ablation Experiments and Performance Analysis

To quantify the improvement of each technical module on the model performance, an ablation experiment was conducted. One innovative component was removed sequentially from the most complete model that included all innovative components. The specific design of the ablation experiment scheme is as follows.

•   For parametric refinements (e.g., dynamic weighting, variable substitution) that do not alter the network architecture, comparisons were made using a fixed random seed. This ensures that any performance difference is solely attributable to the parameter change itself, providing a precise measure of its incremental effect.

•   For structural modifications (e.g., adding the SPK processor or shortcut connections) that change the model’s capacity, experiments were repeated three times with different random seeds. This allows us to report the mean and standard deviation, accounting for the variability introduced by initialization and assessing the robustness of the architectural innovation.

For the convenience of numbering and description, the original model is denoted as “0”; the shortcut model with a sub-network introduced is denoted as “A”; the model with an SPK processor introduced is denoted as “B”; the model with dynamic weights introduced is denoted as “C”; the model with variable reconstruction introduced is denoted as “D”.

The L-BFGS training method was adopted, and all ablation experiments were iterated 1000 times. The CPU used was an i7-14650HX with a total of 32 GB of memory, and a single NVIDIA RTX4060 was used as the GPU. The average time for a single training run was 20 min.

Fig. 14 compares the convergence curves of the different models. The most complete model converged earliest, and its loss function value at convergence was lower, which demonstrates the effectiveness of fusing the components. As components are removed one by one, convergence becomes increasingly difficult: the loss function stalls at a relatively high level and shows no further change. Moreover, after removing the dynamic weights and the variable reconstruction, the model becomes more sensitive to the initial network parameters, and results trained from different initial networks show large deviations.


Figure 14: Loss value of different models. (a) Comparison between A + B + C + D and A + B + C; (b) Comparison between A + B + C and A + B; (c) Comparison between A + B and A; (d) Comparison between A and 0

Fig. 15 compares the initial and final loss function values of different models. For the “0” model, it is most significantly affected by the initial network parameters, and the loss function value after the same number of iterations is also the highest. The shortcut model with a sub-network can reduce the initial loss function value by 24.86% and increase its stability, but it has a relatively small impact on the error of the final iteration, only reducing it by 2.25%. Variable reconstruction can reduce the final loss function value by 51.19%, indicating that it has a large auxiliary convergence effect.


Figure 15: Initial and final loss value of different models

Fig. 16 compares the variables predicted by the different models. The most comprehensive model performs best after 500 iterations. The shortcut structure with a sub-network better captures the changing trends, consistent with the preceding theoretical analysis. When the SPK processor is introduced, the prediction results change markedly: the overall trend almost matches the theoretical values, with only quantitative differences, which demonstrates the effectiveness of SPK.


Figure 16: Comparison of predictive variables among different models. (a) Density; (b) Pressure; (c) Temperature; (d) Speed

Fig. 17 compares the average errors of the prediction variables across the different models and reports the rate of change of the average error over all variables as components are added.


Figure 17: Comparison of errors of prediction variables among different models

In terms of results, SPK and variable reconstruction contribute most to the accuracy: the average errors of the predicted variables are reduced by 93.94% and 89.81%, respectively. The introduction of SPK also markedly reduces the uncertainty of the errors, confirming the effectiveness of both methods.

To further verify the effectiveness of variable reconstruction under different initial network parameters, we compared the final loss function values of neural networks with different initial parameters under the same number of iterations. A total of 10 repeated experiments were conducted. The final comparison data are shown in Table 4.


The data presented in the table demonstrated that, among the four comparative metrics, the neural network with angular output consistently outperformed the one with component-based output. The average value of the initial loss function during training decreased by approximately 51.1%, while its standard deviation was reduced by 49.8%. A smaller initial loss function value indicated that the angular output facilitated the neural network in finding suitable initial parameters more effectively, while the reduced standard deviation suggested that this effect was relatively stable. Furthermore, the average value of the final loss function decreased by approximately 52.8%, and its standard deviation dropped by 72.2%. A lower final loss function value implied that the angular output improved convergence and enhanced accuracy, while the significantly smaller standard deviation indicated the robustness of this approach, as it was less influenced by the initial parameters of the neural network.
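The percentage reductions quoted above are straightforward run statistics. The helper below reproduces the computation with a few made-up loss values standing in for the recorded runs (the numbers are illustrative, not the experimental data):

```python
import statistics

def reduction(baseline_runs, improved_runs):
    """Percentage reduction in the mean and sample standard deviation of
    the loss across repeated training runs."""
    mb, mi = statistics.mean(baseline_runs), statistics.mean(improved_runs)
    sb, si = statistics.stdev(baseline_runs), statistics.stdev(improved_runs)
    return 100.0 * (mb - mi) / mb, 100.0 * (sb - si) / sb

# Illustrative final-loss values only -- not the experiments' actual data.
component_output = [0.80, 1.00, 0.90, 1.10, 0.95]
angular_output   = [0.40, 0.45, 0.42, 0.48, 0.44]
mean_drop, std_drop = reduction(component_output, angular_output)
```

A lower mean indicates better convergence on average; a lower standard deviation indicates robustness to the random initial network parameters, which is the comparison Table 4 makes.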

It should be noted that, without the SPK structure, convergence cannot be guaranteed in all training scenarios; configurations lacking the SPK structure are therefore not included in this comparison.

We also compared the new network architecture with the traditional fully connected architecture. As before, the focus of the comparison was on the values of the loss function under the same number of iterations. Each architecture was subjected to 10 repeated experiments. The results are shown in Table 5.


The data in the table revealed that the neural network incorporating a shortcut structure outperformed traditional neural networks in the initial and final loss function values. The average initial loss function value decreased by 5.6%, while the average final loss function value decreased by 11.1%. This indicated that the shortcut structure effectively facilitated the neural network in finding suitable initial parameters more efficiently and achieving smaller errors more quickly during training. In summary, the shortcut structure proved to enhance the convergence and accuracy of the training process.

5  Conclusions

This study explores the application of PINNs in solving supersonic flow problems, particularly within the Laval nozzle. By integrating physical constraints and neural network optimization techniques, the proposed framework enhances computational efficiency and accuracy. The key innovations of this study are as follows:

1. Development of a bypass-structured physics-informed neural network for solving complex fluid flows.

A novel neural network structure incorporating shortcut connections is proposed to improve the prediction of global flow trends and local fluctuations. This approach enhances convergence speed, reduces computational cost, and increases the accuracy of flow field reconstructions, making PINNs more effective in complex aerodynamic environments.
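In outline, a shortcut architecture of this kind adds a short linear branch around the deep trunk, so the summed output can represent a smooth global trend plus learned local corrections. A minimal forward-pass sketch in plain Python follows; the layer sizes, initialization, and function names are illustrative rather than the exact architecture used in this work:

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Random dense layer: (weights, biases)."""
    w = [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
         for _ in range(n_out)]
    return w, [0.0] * n_out

def dense(x, layer, act=None):
    """Apply one dense layer, optionally followed by an activation."""
    w, b = layer
    y = [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]
    return [act(v) for v in y] if act else y

def forward(x, trunk, shortcut):
    """Deep tanh trunk (local fluctuations) plus a linear shortcut branch
    (global trend), summed at the output."""
    h = x
    for layer in trunk[:-1]:
        h = dense(h, layer, math.tanh)
    deep_out = dense(h, trunk[-1])
    short_out = dense(x, shortcut)
    return [d + s for d, s in zip(deep_out, short_out)]

rng = random.Random(0)
trunk = [make_layer(2, 8, rng), make_layer(8, 8, rng), make_layer(8, 4, rng)]
shortcut = make_layer(2, 4, rng)
y = forward([0.5, 0.1], trunk, shortcut)  # four outputs, e.g. rho, p, T, |V|
```

Because the shortcut branch is linear in the inputs, it can fit the dominant quasi-linear variation along the nozzle quickly, leaving the deep trunk to learn the residual local structure.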

2. Construction of a physics-informed neural network framework incorporating specific physical knowledge for solving complex fluid flow problems under sparse data conditions.

This study constructs a novel physics-informed neural network framework that integrates specific physical knowledge. Compared with traditional PINNs, the incorporation of specific physical knowledge allows for customized solutions to particular physical problems without compromising generalizability, thereby enhancing the stability and accuracy of the model. Notably, this approach remains effective even when only sparse data are provided, consisting merely of information at the inlets and outlets as well as a few discrete points.
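As a concrete illustration of what "specific physical knowledge" can mean for a Laval nozzle, the quasi-one-dimensional isentropic relations supply reference ratios at any Mach number. The snippet below is a hypothetical example of such knowledge, not a description of the SPK processor itself:

```python
import math

GAMMA = 1.4  # ratio of specific heats for air

def isentropic_ratios(mach):
    """Quasi-1D isentropic relations: static-to-stagnation ratios
    of temperature, pressure, and density at a given Mach number."""
    f = 1.0 + 0.5 * (GAMMA - 1.0) * mach ** 2
    t_ratio = 1.0 / f                          # T / T0
    p_ratio = f ** (-GAMMA / (GAMMA - 1.0))    # p / p0
    rho_ratio = f ** (-1.0 / (GAMMA - 1.0))    # rho / rho0
    return t_ratio, p_ratio, rho_ratio

# At the throat (M = 1) the classic critical ratios are recovered.
t, p, r = isentropic_ratios(1.0)
```

Relations of this kind can anchor the network at a handful of locations (e.g., the throat) without any additional measured data, which is one way domain knowledge compensates for sparse training samples.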

3. Optimization of training convergence via dynamic loss weighting and variable reformulation

A dynamic loss weighting strategy is implemented to balance different loss terms during training, addressing gradient pathologies and improving convergence efficiency. Additionally, the reformulation of velocity components into magnitude and angle simplifies boundary conditions, reduces solution space dimensionality, and enhances the robustness of the model in supersonic flow simulations.
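The reformulation can be sketched directly: the network outputs the magnitude |V| and flow angle θ, and the Cartesian components are recovered as u = |V| cos θ, v = |V| sin θ. On a solid wall, flow tangency then fixes θ to the local wall angle, turning a constraint that couples u and v into a condition on a single output. A minimal sketch (names illustrative):

```python
import math

def to_components(speed, angle):
    """Recover Cartesian velocity components from magnitude-angle outputs."""
    return speed * math.cos(angle), speed * math.sin(angle)

def wall_angle(dy_dx):
    """Flow tangency on a wall y(x): the flow angle equals the wall slope
    angle, independent of the speed."""
    return math.atan(dy_dx)

# On a horizontal wall the tangency condition reduces to theta = 0, i.e. v = 0.
u, v = to_components(300.0, wall_angle(0.0))
```

The same change also shrinks the solution space: the boundary condition constrains one scalar field (θ) instead of a relation between two.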

Acknowledgement: Not applicable.

Funding Statement: This research was supported by the Guizhou Provincial Major Scientific and Technological Program (XKBF (2025) 031) and by the Fundamental Research Funds for the Central Universities (No. 2024JBMC016).

Author Contributions: The authors confirm contribution to the paper as follows: Writing—original draft preparation and formal analysis: Yida Shen; writing—review and editing: Bin Dong; software and visualization: Quan Ma; funding acquisition, supervision and conceptualization: Chao Dang; supervision, formal analysis and methodology: Congjian Li; methodology: Guojian Ren; data curation: Shaozhan Wang, Xiaozhe Sun; funding acquisition and conceptualization: Yong Ding. All authors reviewed the results and approved the final version of the manuscript.

Availability of Data and Materials: The data that support the findings of this study are available from the corresponding authors upon reasonable request.

Ethics Approval: Not applicable.

Conflicts of Interest: The authors declare no conflicts of interest to report regarding the present study.





Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.