Open Access

ARTICLE

Machine Learning-Enhanced Multiscale Computational Framework for Optimizing Thermoelectric Performance in Nanostructured Materials

Udit Mamodiya1,*, Indra Kishor2, P. Satish Reddy3, K. Lakshmi Kalpana3, Radha Seelaboyina4, Harish Reddy Gantla5

1 Faculty of Engineering & Technology, Poornima University, Jaipur, India
2 Dept. of CSE, Poornima Institute of Engineering & Technology, Jaipur, India
3 Dept. of CSE, Kasireddy Narayan Reddy College of Engineering and Research, Hyderabad, India
4 Dept. of CSE, Geethanjali College of Engineering and Technology, Hyderabad, India
5 Department of Computer Science and Engineering, Vignan Institute of Technology and Science, Bhuvanagiri, India

* Corresponding Author: Udit Mamodiya. Email: email

(This article belongs to the Special Issue: AI and Multiscale Modeling in the Development of Optoelectronic and Thermoelectric Materials)

Computers, Materials & Continua 2026, 87(3), 31 https://doi.org/10.32604/cmc.2026.076464

Abstract

Thermoelectric materials, which convert heat directly into electricity in the solid state, have attracted considerable attention; however, their practical application is limited by the difficulty of balancing microstructural features across the quantum, mesoscale, and continuum scales. Current computational and machine-learning methods cover a small design space in which few, if any, interactions among electronic structure, phonon transport, and device-level behavior are considered. This makes it difficult to discover stable configurations with a high figure of merit (ZT) that are manufacturable and robust under real operating conditions. This study presents a multiscale hybrid optimization framework that combines first-principles descriptors, synthetic microstructure optimization, machine-learning surrogate modeling, Finite Element Method (FEM)-based transport modeling and optimization, and uncertainty-sensitive reinforcement-learning optimization. The performance improvements are compared against physics-only, ML-only, and recent hybrid thermoelectric optimization baselines. The integrated framework predicts the transport coefficients with accuracies between 95.8% and 97.2% and delivers ZT improvements of 18%–32% over the baselines. The optimized configurations remained stable under ±10% fabrication-style perturbations, confirming that the discovered designs were not fragile numerical artifacts. The proposed approach provides a reliable route to high-ZT, fabrication-tolerant thermoelectric designs, which opens the way to accelerated material discovery and the design of next-generation thermoelectric (TE) devices.

Keywords

Multiscale thermoelectrics; surrogate modeling; reinforcement learning; microstructure engineering; thermoelectric transport modeling; nanostructured thermoelectrics

1  Introduction

Modern energy systems have increased the pressure to find compact, maintenance-free technologies that recover otherwise wasted thermal energy, which has intensified the search for high-efficiency thermoelectric materials in recent years. Although the physical principles of thermoelectric conversion are well understood, the practical difficulty is to fabricate materials whose electronic and phononic landscapes work together rather than against each other.

1.1 Background

Thermoelectric materials are technologically important energy-conversion systems that turn thermal gradients into electric power. Their inherent strength is solid-state operation: they are silent, vibration-free, and highly reliable, which is why they are used where maintenance access is restricted, environmental conditions are harsh, or the device footprint must be minimal [1]. The governing efficiency parameter, the dimensionless figure of merit ZT, results from a delicate interplay among the Seebeck coefficient, electrical conductivity, and thermal conductivity. Because these quantities are interdependent and the transport physics is strongly coupled, ZT optimization is notoriously difficult. Even small enhancements require careful re-engineering of electronic density-of-states profiles, phonon-scattering processes, and nanostructure [2,3].
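To make the coupling concrete, the standard definition of the figure of merit is ZT = S²σT/κ, where S is the Seebeck coefficient, σ the electrical conductivity, κ the total thermal conductivity, and T the absolute temperature. The minimal sketch below evaluates this expression; the function name and all numerical inputs are illustrative assumptions and are not taken from this study.

```python
def figure_of_merit(seebeck, sigma, kappa, temperature):
    """Return the dimensionless figure of merit ZT = S^2 * sigma * T / kappa.

    seebeck: Seebeck coefficient S in V/K
    sigma: electrical conductivity in S/m
    kappa: total thermal conductivity in W/(m K)
    temperature: absolute temperature in K
    """
    if kappa <= 0 or temperature <= 0:
        raise ValueError("kappa and temperature must be positive")
    return seebeck ** 2 * sigma * temperature / kappa


# Illustrative inputs only (assumed, not reported in this article).
zt_reference = figure_of_merit(seebeck=200e-6, sigma=1.0e5, kappa=1.5, temperature=300.0)

# Same S and sigma, but with kappa lowered by 20%, the lever that
# phonon-scattering nanostructures aim for.
zt_low_kappa = figure_of_merit(seebeck=200e-6, sigma=1.0e5, kappa=1.2, temperature=300.0)

print(f"ZT reference: {zt_reference:.2f}")
print(f"ZT with 20% lower kappa: {zt_low_kappa:.2f}")
```

In practice the three transport quantities cannot be varied independently, which is why the second call is only a thought experiment: a structural change that lowers κ usually alters σ and S as well.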

Multiscale simulation and, more generally, computational modeling are therefore indispensable. Density functional theory (DFT) and other atomistic methods provide the electronic states and phonon spectra of a system but are computationally expensive. Mesoscale techniques, such as Boltzmann transport formulations, approximate the scattering processes, but they require precise parameters that are not always available. Finite-element solvers at the device scale do not resolve thermal and electrical phenomena at smaller scales and are only as accurate as the inputs passed up from those scales. The outcome is a fragmented modeling ecosystem in which each layer is informative but incomplete. Although the individual scales have matured, coupling them to each other and to realistic material complexity remains cumbersome, nonlinear, and prone to inconsistency [4,5].

1.2 Problem Statement

Thermoelectric physics is inherently multiscale and tightly coupled, and this is the fundamental challenge. At the quantum level, the transport coefficients are determined by phonon dispersions, electron-band curvature, and defect-induced perturbations. These quantum effects propagate into mesoscale processes, where grain boundaries, alloy disorder, interface strains, and nanoparticle inclusions alter the phonon mean free paths and carrier mobilities. At the continuum scale, localized variations in transport affect the heat-flow distribution, temperature gradients, and current paths in real devices. An accurate model should pass information continuously between these levels. However, this continuity is rarely reflected in traditional computational pipelines, resulting in significant discrepancies between predicted behavior and experimental results [6].

At the same time, the volume of material data in the field is growing rapidly, including high-throughput DFT databases, phonon-scattering databases, microstructures obtained by synchrotron imaging, and experimental thermal and electrical measurements. Although these datasets contain strong correlations, standard modeling approaches are poorly suited to discovering or exploiting them at scale. Structural features are high-dimensional, and the relations between microstructural features and TE transport properties are nonlinear, which complicates standard analytical treatments. Material discovery therefore remains slow and still relies on intuition and time-consuming trial-and-error investigations [7].

1.3 Research Gap

Several gaps in the available literature motivate a new computational framework. The most notable is that most machine-learning (ML) studies in thermoelectrics address individual tasks, such as predicting a single transport coefficient or screening candidate compositions. Although valuable, such efforts seldom extend their predictions across scales, and the interdependencies of TE physics are poorly captured [8,9]. A second gap stems from weak integration between ML and physics-based simulations. Most ML models are trained only after a series of computationally costly simulations has been completed, and they serve as post-processing predictors rather than as integrated intelligence that steers the simulation. Used this way, ML models do not always respect known physical constraints and can produce non-physical values when extrapolated beyond the training domain [10].

Further limitations arise from the limited exploration of the design space. Owing to the computational cost of traditional simulations, previous studies have explored only small portions of the large parameter space that defines nanostructured TE materials. As a result, the best combinations of interface density, grain morphology, porosity gradients, and dopant distribution remain unexplored [11]. The treatment of uncertainty is another weakness. Thermoelectric properties are highly sensitive to microstructural defects such as vacancy clusters, orientation variation, and interfacial roughness. However, most ML studies report point predictions without confidence limits, which makes them less valuable when designs are transferred to experiments [12].

Finally, a significant gap is the lack of an end-to-end feedback loop between quantum-level calculations and device-level performance predictions. State-of-the-art multiscale modeling approaches are normally run serially, with information flowing in only one direction, from the quantum scale to the continuum. Without iterative correction, cross-scale errors accumulate and the accuracy of design optimization degrades. These weaknesses could be overcome by a framework that revises parameters between scales based on ML evaluations, although this has not been fully achieved in TE research [13].

1.4 Motivation

Optimizing thermoelectrics requires more than the study of isolated physical phenomena; it involves the whole chain of interactions from atomistic vibrations to macroscopic energy conversion. Phonon-electron coupling, interface-roughness scattering, localized defect modes, and strain-induced band changes all act on transport simultaneously. Even accurate classical modeling tools cannot resolve these multilevel dependencies at a single scale. The scientific community has recognized this and has begun to unite ML with physics-based simulation. However, most current integrations are shallow or partial. A deeper integration is needed, in which ML models are part of the simulation pipeline and are used to inform transport computations, speed up parameter optimization, and reduce cross-scale inconsistencies.

Such a framework must capture patterns that differ widely across scales while remaining physically interpretable. It should also accommodate the irregularities of real-world nanostructures. For predictions to be robust, it must quantify the uncertainties that arise from fabrication tolerances, defect variability, and temperature-sensitive scattering. Driven by these needs, the present work proposes a computational architecture that combines multiscale physical fidelity with the flexibility of modern machine learning. It aims not only to speed up simulations but also to provide a form of computational intelligence that finds non-intuitive, high-ZT material structures, guided by both data-driven decision-making and physical laws.

1.5 Proposed Solution

In this study, we propose a multiscale computational framework, augmented by machine learning, that unites atomistic data, mesoscale models, and continuum simulations. At the quantum level, high-dimensional descriptors from DFT calculations, such as phonon band structures, electronic density of states, and defect-level perturbations, are fed into graph-based neural models in a way that enforces symmetry and physical invariance. These ML models also encode complex quantum features in forms that can be readily passed to mesoscale calculations [14].

At the mesoscale, the framework uses ML-accelerated surrogate models that predict scattering rates, relaxation times, and carrier mobilities instead of solving computationally expensive Boltzmann transport equations. These surrogates accelerate the assessment of nanostructured configurations and permit a broader and deeper investigation of grain-boundary geometry, inclusions, and porosity gradients. The mesoscale outputs are passed to continuum-scale simulations based on finite-element solvers. Here, the continuum models are augmented by ML correction terms that restore cross-scale interactions that traditional solvers do not capture.

An optimization engine based on reinforcement learning evaluates candidate microstructures one at a time. The agent interacts with the multiscale simulation environment, receives a performance-based reward tied to the predicted ZT, and learns to navigate high-dimensional design spaces. This approach enables the exploration of material configurations that intuition-based searches with conventional methods would not find [15]. For practical relevance, the framework uses probabilistic modeling layers to represent realistic imperfections, such as variability in vacancy concentration, surface roughness, and grain misorientation, which enables uncertainty-aware predictions and directs experimentalists to robust material candidates [16].
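One way to picture a reward that accounts for fabrication variability is sketched below. It is not the reward used in this study: the toy ZT landscape, the design keys, the ±10% uniform perturbation, and the "mean minus weighted standard deviation" form are all assumptions chosen for illustration, and the sketch covers only the reward signal, not the learning agent itself.

```python
import random
import statistics


def toy_zt_surrogate(design):
    """Hypothetical stand-in for the multiscale ZT prediction (not the paper's model)."""
    grain_nm = design["grain_size_nm"]
    porosity = design["porosity"]
    # Smooth toy landscape that peaks at 50 nm grains and 5% porosity.
    penalty = ((grain_nm - 50.0) / 40.0) ** 2 + ((porosity - 0.05) / 0.05) ** 2
    return 1.5 / (1.0 + penalty)


def robust_reward(design, n_samples=200, spread=0.10, risk_weight=1.0, seed=0):
    """Score a design by its mean ZT minus a penalty on ZT variability.

    Every design parameter is perturbed independently by a uniform relative
    error in [-spread, +spread] to mimic fabrication tolerances (assumed model).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        perturbed = {
            name: value * (1.0 + rng.uniform(-spread, spread))
            for name, value in design.items()
        }
        samples.append(toy_zt_surrogate(perturbed))
    mean_zt = statistics.fmean(samples)
    std_zt = statistics.pstdev(samples)
    return mean_zt - risk_weight * std_zt, mean_zt, std_zt


design = {"grain_size_nm": 50.0, "porosity": 0.05}
reward, mean_zt, std_zt = robust_reward(design)
print(f"reward={reward:.3f}, mean ZT={mean_zt:.3f}, std ZT={std_zt:.3f}")
```

Under a reward of this kind, a design whose predicted ZT collapses under small perturbations scores lower than a slightly less optimal but more tolerant one, which is the behavior an uncertainty-aware search is meant to favor.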

1.6 Novelty and Contributions

The main novelty of this work is a complete multiscale ML system for thermoelectric optimization. Unlike previous methods, in which ML serves as an external prediction tool, the proposed framework embeds ML modules in the simulation process, providing real-time surrogate prediction, physics-aware correction, and cross-scale information flow. The resulting computational loop is adaptive and self-correcting, so that the atomistic description informs device performance and vice versa.

In addition, the framework widens the thermoelectric design space to include structural and compositional designs that are currently out of reach because of computational limitations. Its search mechanism, based on reinforcement learning, systematically explores high-dimensional spaces and has revealed novel material geometries and nanostructures that may have high ZT. Uncertainty quantification further enhances the practical utility of the framework because it indicates whether the optimized solutions can be practically fabricated and reproduced in the laboratory.

The remainder of this paper is structured as follows. Section 2 reviews the literature, summarizing advances in thermoelectric materials modeling, machine-learning-assisted material discovery, and multiscale transport. Section 3 presents the proposed methodology, which includes the quantum-to-continuum computational model, the integrated machine-learning models, and the cross-scale optimization model. Section 4 presents the experimental and computational results, including predictive performance, optimization performance, and comparative results. Section 5 discusses the findings, their theoretical and practical implications, and their limitations. Finally, Section 6 summarizes the research and outlines future opportunities for designing new thermoelectric materials.

2  Literature Review

Research on thermoelectric (TE) materials has advanced rapidly over the last 20 years, driven in part by the global demand for compact and environmentally friendly energy-conversion systems. Early research sought to explain ZT enhancements by investigating the intrinsic electronic structure and phonon transport of bulk semiconductors. This changed with the emergence of nanoscale engineering, in which transport pathways are reconfigured in ways that bulk theories cannot fully explain. Four interrelated fields are reviewed in relation to the current study: (i) physics-based thermoelectric modeling, (ii) nanostructuring and multiscale transport behavior, (iii) machine-learning applications in materials informatics, and (iv) emerging hybrid frameworks combining ML and multiscale simulations. Collectively, these advances form the foundation for an integrated machine-learning-enhanced computational method tailored to complex nanostructured TE systems.

2.1 Thermoelectric Modeling

Classical thermoelectric modeling relies heavily on first-principles modeling of electronic and phonon transport. Early density functional theory (DFT) investigations provided insight into the electronic structure and Seebeck properties of narrow-band-gap semiconductors, but were restricted by computational cost and by sensitivity to the exchange-correlation functional [17]. Later, phonon-dispersion calculations and perturbation theory allowed researchers to obtain lattice dynamics with higher accuracy, although at a high computational cost for large supercells or random alloy systems [18].

The Boltzmann transport equation (BTE) is another important tool for modeling electron and phonon transport. However, its accuracy depends on knowledge of the scattering rates, which in many cases are approximated rather than obtained from ab initio calculations. Introducing phonon-electron coupling models into first-principles BTE simulations has been shown to improve predictions, but at the price of a steep increase in computational cost, particularly for materials with complicated unit cells or disorder [19]. To model thermal and electrical distributions in realistic geometries, device-scale finite-element modeling (FEM) is used, but it is predictive only to the extent that its mesoscale inputs are carefully interpolated from atomistic data [20]. In summary, purely physics-based models offer high fidelity at individual scales but are not easily connected across them, which leads to differences between predicted and observed TE behavior.

2.2 Multiscale Transport and Nano-Structuring

The emergence of nanostructures changed TE research profoundly. Interfaces, grain boundaries, embedded nanoparticles, and superlattice architectures alter phonon mean free paths and carrier mobility through scattering processes that are hard to model analytically. Early work on superlattices showed that interface scattering significantly reduces thermal conductivity, but precise predictive modeling proved difficult [21]. Subsequent research found that other nanoscale geometries, including multiscale roughness, grain-size gradients, and hierarchical nanostructuring, could also reduce lattice thermal transport, although these effects were sensitive to nanoscale geometric details that vary from sample to sample [22].

Atomistic simulations offered a means to study these structures, but large-scale molecular dynamics (MD) simulations typically require millions of atoms, which makes iterative design exploration almost impossible. Multilevel techniques were proposed that couple MD results with BTE models to assess grain-boundary scattering rates more efficiently. Nevertheless, these methods suffered from parameter discrepancies between the atomistic and continuum levels. Experiments have demonstrated that even minor changes in defect concentrations, interface chemistry, and strain fields can change TE properties substantially, so fully deterministic modeling is not feasible without probabilistic corrections [23]. This realization increased interest in data-driven and hybrid methods that can capture complex structure-property relationships beyond the reach of analytical transport models.

2.3 Machine Learning for Materials Informatics and TE Prediction

Machine learning (ML) has become a significant trend in materials science, driven by large-scale DFT materials databases, automated synthesis systems, and large-scale molecular simulations. Early ML work on thermoelectrics focused on predicting single properties, such as lattice thermal conductivity or Seebeck coefficients, from hand-crafted descriptors. Linear regressors and shallow kernel models provided early proof of concept but were limited in feature expressiveness [24]. Models that used better descriptors, such as radial distribution functions, atomic fingerprints, and electronic-structure features, gave more accurate predictions but were not robust across compositional families [25].

Graph neural networks (GNNs), which represent atomic structures as node-edge networks, marked the first major shift in materials informatics. These architectures capture symmetry-preserving interactions in crystal structures. GNNs were shown to predict formation energies, band gaps, and elastic properties with near ab initio accuracy, which motivated attempts to apply them to TE transport. However, TE performance depends on both electronic and phononic effects, so single-property ML models are not sufficient for overall optimization. Multi-property approaches were developed that couple several ML predictors, for example one for thermal conductivity, another for mobility, and a third for the electronic density of states, but these did not capture all of the cross-property correlations [26].

Although the field is advancing quickly, one constraint recurs: most ML models are unaware of the underlying physics unless specific constraints are imposed on them. Unconstrained autoencoders can predict nonphysical trends outside the training distribution and can misrepresent the temperature-dependent behavior dictated by the underlying transport equations. These issues have inspired hybrid paradigms in which ML models act as surrogates integrated into physics-based systems rather than as external predictors.

2.4 ML-Assisted Accelerated Transport Simulations

Recent work has demonstrated that ML can accelerate costly transport simulations severalfold. Phonon relaxation times have been estimated with surrogate models trained on BTE outputs, which allows rapid exploration of temperature-dependent scattering processes. Similarly, ML-based regression models have replaced full DFT calculations of the electronic density of states in doped semiconductors, accelerating the search for candidate compositions [27]. Other works used ML to determine interface-scattering parameters from synthetic microstructures generated by MD simulations, allowing wider exploration of grain-size effects on thermal conductivity [28].

Although these developments are a clear step forward, they still lack cross-scale consistency. A phonon-scattering model trained at one grain size may not reliably predict behavior at another. Moreover, unlike physics equations, ML surrogates tend not to extrapolate reliably beyond the training range, particularly when structural perturbations or defects break the symmetry patterns. Imposing physical constraints on ML models, such as energy conservation, symmetry operations, or temperature-scaling laws, has emerged as a promising direction, although these methods are still in their infancy for TE materials [29,30].

2.5 Multiscale Modeling Enhanced by Data-Driven Methods

The merging of multiscale modeling with ML has become a field of study in its own right. In one line of work, ML accelerates the quantum-to-mesoscale mapping and enables faster conversion of DFT descriptors into transport parameters. In another, ML serves as a corrective layer for continuum solvers, compensating for the parameter mismatches that arise at scale transitions. Coupling ML with FEM solvers has improved accuracy in modeling strongly disordered nanocomposites, though these methods must be designed carefully to avoid overfitting [31,32].

Hybrid thermal–electrical digital twins represent an emerging trend in thermoelectric system modeling. These models combine physics models with real-time ML feedback to keep refining predictions against experimental data. Most research on digital twins has concentrated on structural mechanics or fluid flow, but attention is increasingly turning to adapting such systems to TE materials in order to reflect time-dependent or stochastic variations during operation. The literature thus points to a clear trend towards ML-embedded multiscale models, although a fully coherent architecture combining quantum descriptors, mesoscale transport, continuum simulations, and RL-based optimization remains underdeveloped. Without such a holistic system, intricate design interactions go unexplored and unconventional nanostructures with potentially higher ZT go undiscovered.

2.6 Limitations of Existing Studies and Need for a Unified Framework

Despite the overall success of integrating ML with physics-driven models, current efforts remain disjointed across scales. Most studies address a single aspect, e.g., phonon scattering, electronic-structure correction, or continuum simulation, without linking them in one feedback-driven pipeline. This fragmentation undermines predictive reliability. Even a model that describes phonon dynamics very well cannot guarantee correct device-level predictions unless it is embedded in a multiscale ecosystem. Furthermore, uncertainty quantification is still missing from most ML-based TE studies, even though imperfections, dopant variations, and thermal instabilities in real materials are inherently probabilistic.

Overall, the literature shows significant advances in physics-based modeling, nanoscale engineering, ML-based prediction, and hybrid computational methods. Nevertheless, the lack of an end-to-end ML-based multiscale model remains a significant obstacle to material discovery. Filling this gap requires a system that:

(i)   embeds ML within each scale of the simulation;

(ii)   couples information across scales;

(iii)   provides a realistic model of uncertainty;

(iv)   includes an intelligent search algorithm that samples large, multi-parameter design spaces.

These observations provide a solid rationale for the unified framework introduced in this paper, which unites quantum descriptors, mesoscale transport, continuum simulations, and reinforcement-learning-driven optimization in a single structure tailored to nanostructured thermoelectrics.

3  Methodology

The methodology of this study is designed as a cohesive multiscale computational system in which information flows in one direction, from the quantum-level description to the device-scale performance prediction [10,25,31]. Unlike frameworks that split the modeling of thermoelectric (TE) properties into small, separate parts, this framework brings together density functional theory (DFT), synthetic microstructure generation, machine-learning (ML) surrogate models, continuum finite-element modeling, and reinforcement learning (RL) in a single pipeline. This section describes each component in detail, explains the rationale for the design choices, and shows how the components together form the methodological contribution. Fig. 1 presents the schematic of the workflow, and the subsequent subsections discuss the individual components.


Figure 1: Multiscale computational workflow combining quantum descriptors, mesoscale microstructure modeling, ML surrogates, continuum FEM analysis, and reinforcement-learning-driven optimization.

3.1 Quantum-Scale Descriptor Generation

The framework is built on quantum-level calculations because the electronic structure, phonon dispersion, and intrinsic scattering processes determine the thermoelectric (TE) properties of any given material. All quantum calculations are carried out using density functional theory (DFT) with projector-augmented wave (PAW) pseudopotentials and a plane-wave basis with a cutoff energy of 500–600 eV. Calculations are first performed with the Perdew-Burke-Ernzerhof (PBE) functional, followed by hybrid HSE06 corrections on a representative subset of materials to recalibrate the conduction and valence band edges [10,11]. This hybrid recalibration ensures that the downstream surrogate model is not systematically biased by PBE-level approximations.

Monkhorst-Pack k-meshes (8 × 8 × 8) are used for Brillouin-zone integration in cubic structures, with proportionally scaled grids for anisotropic systems. The self-consistency cycle ends when the total-energy change falls below $10^{-6}$ eV and the atomic forces fall below $10^{-3}$ eV/Å.

Phonon calculations are based on density functional perturbation theory (DFPT), which gives the phonon frequencies, mode symmetries, and eigenvectors. Each phonon mode ($\lambda$) provides mode-resolved quantities: heat capacity $C_{\lambda}$, group velocity $v_{\lambda}$, and lifetime $\tau_{\lambda}$. These are combined to obtain the intrinsic lattice thermal conductivity using Eq. (1):

$$\kappa_{\mathrm{lattice}} = \frac{1}{V}\sum_{\lambda} C_{\lambda}\, v_{\lambda}^{2}\, \tau_{\lambda} \tag{1}$$

where $V$ is the unit-cell volume, $C_{\lambda}$ is the mode heat capacity, $v_{\lambda}$ is the group velocity, and $\tau_{\lambda}$ is the relaxation time. Eq. (1) is used to validate the surrogate predictions in the later stages of the pipeline. Electronic descriptors, namely the effective mass tensor, the slope of the density of states (DOS) at the Fermi level, the deformation potential constants, and approximate electron-phonon scattering matrix elements, are extracted and encoded into graph-structured representations [5,7]. Scale-related biases in ML training are prevented by z-scoring these quantum descriptors ($X_q$). Fig. 2 illustrates the generation of quantum-scale descriptors from the electronic band structure, phonon dispersion, and symmetry-constrained graph-based descriptor construction [2,5,26]. Fig. 2a shows the DFT-computed electronic bands and DOS, together with the band curvature and the effective-mass extraction. Fig. 2b shows the phonon dispersion derived by DFPT, including the group velocities and mode lifetimes. Fig. 2c shows the descriptor-engineering pipeline from the atomic crystal structure to normalized graph feature vectors. Fig. 2d shows the final quantum descriptor tensor $X_q$ used to train the ML surrogate.
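As a rough illustration of the two numerical steps described here, the sketch below evaluates the mode sum of Eq. (1) and applies column-wise z-scoring to a small descriptor table. The function names, the mode values, and the descriptor rows are hypothetical; treating the group velocity as the component along the transport direction is an assumption, since the text does not state how the direction is handled.

```python
import statistics


def lattice_thermal_conductivity(modes, cell_volume):
    """Evaluate kappa = (1/V) * sum over modes of C * v^2 * tau, as in Eq. (1).

    modes: iterable of (heat_capacity J/K, group_velocity m/s, lifetime s)
    cell_volume: unit-cell volume V in m^3
    Returns kappa in W/(m K).
    """
    if cell_volume <= 0:
        raise ValueError("cell_volume must be positive")
    return sum(c * v ** 2 * tau for c, v, tau in modes) / cell_volume


def z_score_columns(rows):
    """Standardize each descriptor column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return [
        # A constant column carries no information, so it is mapped to 0.0.
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Two made-up phonon modes: (C in J/K, v in m/s, tau in s).
modes = [(1.38e-23, 3000.0, 1.0e-11), (1.38e-23, 5000.0, 5.0e-12)]
kappa = lattice_thermal_conductivity(modes, cell_volume=1.0e-28)

# Made-up descriptor rows, e.g. (effective mass, deformation potential).
raw_descriptors = [[0.2, 1100.0], [0.5, 900.0], [0.8, 1300.0]]
x_q = z_score_columns(raw_descriptors)

print(f"kappa_lattice = {kappa:.2f} W/(m K)")
print(x_q)
```

The standardization matters because descriptors such as effective masses and deformation potentials differ by orders of magnitude, and without it the larger-valued columns would dominate the training loss.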


Figure 2: Quantum-scale descriptor generation using DFT and DFPT.

The need for quantum-level descriptor generation is well established; what is new here is the explicit use of graph-based descriptors, which prevents the ML models from violating crystalline invariances. In addition, hybrid-corrected calibration ensures that the surrogate models learn realistic band-edge behavior without the cost of running hybrid-functional calculations for every material. For reproducibility, the DFT parameterization is reported in full, including the plane-wave cutoff range, k-point convergence, and the hybrid-functional recalibration strategy. Hybrid-corrected band edges are used to avoid the systematic band-gap underestimation of PBE. The descriptor normalization and graph-embedding procedures are described to show how symmetry information is preserved in the downstream ML models.

3.2 Mesoscale Microstructure Modeling and Parameterization

Microstructural characteristics, including grain boundaries [1,8], interface roughness, nanoparticle inclusions, and porosity, strongly influence thermoelectric behavior in nanostructured materials. To resolve these effects, the methodology includes a synthetic mesoscale reconstruction phase in which complex microstructures are generated computationally. A Voronoi tessellation engine builds a three-dimensional grain network that models polycrystalline microstructures [21,27]. Each grain is assigned a crystallographic orientation sampled from an experimentally inspired orientation distribution function (ODF). A Gaussian surface perturbation model introduces boundary roughness, with the roughness amplitude and correlation length as adjustable parameters. The inclusions are spherical or ellipsoidal nanoparticles placed by a Poisson cluster process that controls the degree of aggregation. Their sizes follow a log-normal distribution, in line with observations in TE materials produced by ball milling or spark-plasma sintering.
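The nanoparticle placement step described above can be illustrated with a small standard-library sketch. It is a simplified stand-in, not the authors' generator: the function names, the fixed number of cluster centres, the clamping of positions to a unit box, and all numerical settings are assumptions made for illustration.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw a Poisson-distributed integer with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def place_inclusions(n_clusters, mean_per_cluster, spread, mu, sigma, box=1.0, seed=0):
    """Return (position, radius) pairs from a simple cluster process.

    Cluster centres are uniform in a cubic box; members scatter around their
    centre with Gaussian width `spread` (smaller spread = stronger aggregation).
    Radii are log-normal with log-mean `mu` and log-standard-deviation `sigma`.
    """
    rng = random.Random(seed)
    inclusions = []
    for _ in range(n_clusters):
        centre = [rng.uniform(0.0, box) for _ in range(3)]
        for _ in range(sample_poisson(mean_per_cluster, rng)):
            # Clamp to the box for simplicity; periodic wrapping is another option.
            pos = tuple(min(max(rng.gauss(c, spread), 0.0), box) for c in centre)
            inclusions.append((pos, rng.lognormvariate(mu, sigma)))
    return inclusions


# Hypothetical settings: 20 clusters, about 5 particles each, median radius 10 nm.
particles = place_inclusions(20, 5.0, spread=0.03, mu=math.log(10.0), sigma=0.3)
```

Lowering `spread` or raising `mean_per_cluster` produces more strongly aggregated inclusions, which is the kind of control over aggregation the text refers to.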

Phonon scattering at grain boundaries is described by Eq. (2):

$$\tau_{GB}^{-1} = A_{GB}\,\frac{v_\lambda}{d} \tag{2}$$

where d is the mean grain size, AGB is the fitted grain-boundary scattering coefficient, and vλ is the phonon group velocity. For each synthetic microstructure, the microstructural descriptors (Xm) are extracted automatically: grain-size distribution (mean, variance), roughness amplitude, nanoparticle density, aspect ratio, and porosity fraction.
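As a quick numerical illustration of Eq. (2), the sketch below evaluates the grain-boundary scattering rate for two grain sizes. The group velocity of 3000 m/s, the default AGB = 1, and the function name are illustrative assumptions, not values from this study.

```python
def gb_scattering_rate(v_lambda, d, a_gb=1.0):
    """Grain-boundary scattering rate 1/tau_GB = A_GB * v_lambda / d.

    v_lambda in m/s and d in m give a rate in 1/s.
    """
    if d <= 0.0:
        raise ValueError("mean grain size d must be positive")
    return a_gb * v_lambda / d


# Assumed phonon group velocity of 3000 m/s (illustrative only).
rate_50nm = gb_scattering_rate(3000.0, 50e-9)
rate_500nm = gb_scattering_rate(3000.0, 500e-9)
```

Because the rate scales as 1/d, reducing the mean grain size from 500 nm to 50 nm raises the boundary scattering rate tenfold.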

In contrast to most TE studies, which reduce the microstructure to simple grain-size parameters, the representation used here is a realistic 3D description of both geometry and orientation. This realism makes it more likely that the surrogate model generalizes to unseen microstructures and that RL optimization explores a physically meaningful design space. This section details the Voronoi-based grain reconstruction method, the Gaussian roughness perturbation, and the nanoparticle placement distributions, together with the parameter calibration and the physical reasoning behind the grain-boundary scattering expression. The extended description strengthens the physical basis of the mesoscale model and clarifies how the structural parameters affect phonon scattering.

3.3 Machine-Learning Surrogate Modeling

Solving the Boltzmann Transport Equation (BTE) or repeating DFT calculations for every design candidate is computationally prohibitive [3,9,11,12,18]. The framework bypasses this bottleneck by using ML surrogate models that approximate the transport coefficients rapidly while retaining physics information.

3.3.1 Surrogate Model Architecture

Three ML architectures are used [3,9,12,28]:

1.   Graph Neural Networks (GNNs): Capture atomic connectivity and symmetry.

2.   Gradient Boosting Regression (GBR): Robust for nonlinear regression in high-dimensional feature spaces.

3.   Gaussian Process Regression (GPR): Provides uncertainty quantification essential during RL optimization.

The surrogate function is defined as (3):

$$[\sigma(T),\ S(T),\ \kappa_{\mathrm{lattice}}(T),\ \kappa_{\mathrm{electron}}(T)] = f_\theta(X_q, X_m) \tag{3}$$

where, σ = electrical conductivity, S = Seebeck coefficient, κlattice = lattice thermal conductivity, κelectron = electronic thermal conductivity and fθ = ML model with learnable parameters θ.

3.3.2 Dataset and Training Strategy

All samples are normalized feature-wise, shuffled, and partitioned into training, validation, and held-out test sets. A stratified sampling routine ensures that rare, high-contrast microstructures, such as extremely rough boundaries or clustered inclusions, are proportionally represented. Fig. 3 illustrates the three-stream dataset construction, integrating quantum descriptors, mesoscale microstructure features, and continuum FEM responses, and the corresponding training pipeline incorporating normalization, multi-objective loss (MSE + SSIM), and auxiliary DFT targets. This schematic highlights how heterogeneous physics-based inputs are fused into a unified surrogate-learning framework used in the Dataset and Training Strategy.

Figure 3: Multistream dataset architecture and training workflow for the multiscale surrogate model.

Model training is carried out using an adaptive learning-rate schedule with early stopping at the validation-loss plateau. To reduce overfitting, dropout layers, L2 regularization, and batch-wise data augmentation (perturbed roughness amplitude, stochastic feature noise, and orientation jitter) are applied. Hyperparameters are tuned using Bayesian optimization. For uncertainty quantification, an ensemble of five independently trained networks is used, and their predictive variance is later propagated into the RL optimizer. The dataset composition is summarized in Table 1.

images

The novelty arises from the joint embedding of quantum and mesoscale descriptors into a unified surrogate architecture, something most TE studies treat separately. Additionally, uncertainty-aware surrogates allow the RL agent to avoid high-uncertainty designs, improving reliability.
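The ensemble-based uncertainty estimate can be sketched as follows. The five "networks" here are toy placeholder functions, not trained surrogates, and the names `ensemble_predict` and `toy_models` are hypothetical; the point is only how a predictive mean and variance are formed from independent predictions.

```python
from statistics import mean, pvariance


def ensemble_predict(models, x):
    """Return per-property mean and variance over an ensemble of surrogates."""
    predictions = [model(x) for model in models]
    names = predictions[0].keys()
    mu = {n: mean([p[n] for p in predictions]) for n in names}
    var = {n: pvariance([p[n] for p in predictions]) for n in names}
    return mu, var


# Five toy "networks" (placeholders): each returns a Seebeck coefficient
# in microvolts per kelvin with a different constant offset.
toy_models = [
    lambda x, b=b: {"S": 200.0 + 0.1 * x + b} for b in (-2.0, -1.0, 0.0, 1.0, 2.0)
]
mu, var = ensemble_predict(toy_models, x=0.0)
U = var["S"]  # the ensemble variance serves as the uncertainty signal U
```

A design for which the ensemble members disagree strongly yields a large U, which the optimizer can then penalize.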

To enhance the reliability of the surrogate predictions, the full training pipeline includes systematic feature-sensitivity analysis and error-propagation studies. Permutation-importance metrics and SHAP-based interpretability are used to identify which quantum and microstructural descriptors exert the strongest influence on the predicted transport coefficients [2,5,7]. These analyses help ensure that the surrogate does not rely on spurious correlations introduced by sampling bias. In addition, error-propagation experiments trace the effect of surrogate errors on FEM-level performance predictions, in order to pinpoint property regimes in which surrogate uncertainty can lead to overestimated ZT.

3.4 Continuum-Scale Thermo-Electric FEM Simulation

The continuum scale resolves the performance of the thermoelectric leg, including the temperature gradient, current flow, and power generation [29,31]. The models use ML-predicted transport coefficients that depend on temperature and microstructural parameters. The continuum simulation is made more robust by ensuring that the temperature-dependent material properties supplied by the ML surrogate vary continuously across the domain and do not introduce artificial discontinuities when the FEM is updated. To this end, the surrogate outputs are interpolated with splines and locally smoothed before they are coupled to the PDE solver. In addition, parametric experiments varying the external load conditions, the contact resistance, and the hot-side temperature profiles were conducted to confirm that the solver is numerically stable over a broad range of operating conditions. These checks reduce solver divergence and help ensure that the continuum-scale predictions remain physically plausible across the wide range of conditions explored by RL.
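The idea of handing the solver a continuous, smoothed property curve can be sketched with the standard library alone. Note the substitution: this sketch uses a centred moving average and piecewise-linear interpolation in place of the splines mentioned in the text, and the tabulated conductivities and function names are made-up illustrative assumptions.

```python
from bisect import bisect_right


def moving_average(values, half_width=1):
    """Centred moving average; the window shrinks at the two ends."""
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half_width)
        hi = min(len(values), i + half_width + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def make_property(temps, values):
    """Return a continuous function T -> property from tabulated samples."""
    def prop(t):
        t = min(max(t, temps[0]), temps[-1])  # clamp to the tabulated range
        i = min(bisect_right(temps, t) - 1, len(temps) - 2)
        w = (t - temps[i]) / (temps[i + 1] - temps[i])
        return (1.0 - w) * values[i] + w * values[i + 1]
    return prop


# Hypothetical surrogate outputs: electrical conductivity (S/m) on a coarse grid (K).
temps = [300.0, 400.0, 500.0, 600.0, 700.0, 800.0]
sigma_raw = [1.00e5, 0.92e5, 0.95e5, 0.80e5, 0.78e5, 0.70e5]
sigma_smooth = moving_average(sigma_raw)
sigma_of_T = make_property(temps, sigma_smooth)
```

A solver can then call `sigma_of_T(T)` at any nodal temperature and receive a value that varies continuously with T.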

3.4.1 Governing Equations

The coupled thermo-electric equations are (4) and (5):

$$\mathbf{J} = -\sigma\,(\nabla V + S\,\nabla T) \tag{4}$$

$$\nabla\cdot(\kappa\,\nabla T) + \mathbf{J}\cdot\nabla V = 0 \tag{5}$$

where J is the current density, V is the electrical potential, T is the temperature, κ = κlattice + κelectron is the total thermal conductivity, and σ and S are the ML-predicted transport coefficients [29,31].
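One simple consequence of Eq. (4) is worth making concrete. With no current flowing (J = 0), the potential gradient balances the Seebeck term, so the magnitude of the open-circuit voltage across the leg is the integral of S(T) over the temperature span. The sketch below evaluates that integral with the trapezoidal rule; the Seebeck functions and the 300 K to 700 K span are illustrative assumptions, and the function name is hypothetical.

```python
def open_circuit_voltage(seebeck, t_cold, t_hot, n=200):
    """Magnitude of the open-circuit voltage: integral of S(T) dT (volts).

    `seebeck` maps temperature in K to the Seebeck coefficient in V/K.
    """
    h = (t_hot - t_cold) / n
    total = 0.5 * (seebeck(t_cold) + seebeck(t_hot))
    for i in range(1, n):
        total += seebeck(t_cold + i * h)
    return total * h


# Constant S = 200 uV/K, and a linearly increasing S(T); both are made-up examples.
v_const = open_circuit_voltage(lambda T: 200e-6, 300.0, 700.0)
v_linear = open_circuit_voltage(lambda T: 150e-6 + 0.1e-6 * (T - 300.0), 300.0, 700.0)
```

For a constant Seebeck coefficient this reduces to S times the temperature difference, which gives a useful sanity check on FEM output.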

3.4.2 Boundary Conditions

The continuum-scale simulation depends on thermal and electrical boundary conditions that influence the accuracy of the predicted device behavior. To model the realistic behavior of thermoelectric modules, the model combines fixed-temperature boundaries, electrically driven boundary conditions, and physically appropriate contact interactions. The hot surface of the thermoelectric leg is held at a prescribed temperature between 600 and 800 K, representative of waste-heat or high-temperature sources. The cold face is fixed at 300 K, approximating a heat sink or ambient cooling interface. The resulting temperature difference drives charge-carrier diffusion, Seebeck voltage generation, and Joule-heating feedback.

The lateral faces of the TE leg are modeled as adiabatic, a common assumption in TE device modeling where lateral heat loss is suppressed by encapsulation or insulating ceramic layers. The framework is not restricted to this choice, however: the lateral thermal conditions can be switched to convective or radiative boundary models for devices that are not insulated.

On the electrical side, one terminal is grounded and the other is connected to an external resistive load chosen for maximum power transfer. Contact resistance, which simplified models usually neglect, is included explicitly by adding a thin interfacial layer of lower conductivity. This is especially important because microstructure-optimised TE materials tend to be more sensitive to interface resistance, particularly where high power density is desired.

Lastly, thermoelectric effects such as Joule heating and Thomson heating are enabled in the solver to capture the coupling between the thermal and electrical fields. These boundary-condition choices offer a compromise between physical realism and computational simplicity, so that the FEM results are more representative of actual device-level performance than traditional, over-idealised TE models.

3.4.3 Numerical Implementation

The simulations are performed in COMSOL and verified against custom FEniCS-based solvers. The mesh contains 50,000–150,000 tetrahedral elements, with refinement concentrated in regions of high thermal gradient. Convergence is declared when the residual norm falls below 10⁻⁷. The fully integrated workflow (Fig. 4) shows how quantum-scale electronic and phonon descriptors flow into mesoscale geometry generation and microstructural descriptors. These are combined in the ML surrogate, which predicts temperature-dependent transport coefficients that are then passed to the continuum FEM solver. The FEM outputs, including the thermal fields, current-density distributions, and device-scale responses, supply the objective function for the reinforcement-learning policy update. The diagram shows that cross-scale coupling, physics-aware variable flow, and multi-objective optimization form the basis of the proposed framework.

Figure 4: Enhanced multiscale computational workflow combining quantum-scale descriptors, mesoscale microstructures, ML surrogate learning, continuum FEM fields, and reinforcement-learning-driven optimization.

Applying ML-predicted transport coefficients in continuum FEM solvers can make the problem numerically stiff or introduce local non-physical fluctuations unless treated with care. To mitigate these issues, the temperature-dependent material properties are locally averaged over temperature before FEM coupling, and all RL-analysed designs are re-assessed with full FEM simulations, free of surrogate shortcuts, to confirm numerical stability and physical consistency.

A further novelty at the continuum stage is the dynamic coupling between the FEM inputs and the ML predictions. Quantities such as σ(T) and S(T) are recalculated at every step and therefore faithfully reflect microstructural and compositional changes, which is seldom done in the TE modeling literature.

3.5 Reinforcement-Learning (RL) Global Optimization

Nanostructured thermoelectric materials have a high-dimensional design space, spanning grain morphology, defect density, dopant concentration, nanoparticle properties, and superlattice geometry, that cannot be sampled effectively with standard grid searches or parametric sweeps. To address this limitation, the framework incorporates a reinforcement-learning (RL) optimization engine that autonomously explores the design space and optimizes thermoelectric performance through iterative interaction with the multiscale simulation environment [1,10,15,17,25]. The RL interface is coupled directly to the quantum descriptors, mesoscale microstructures, ML surrogate predictions, and continuum FEM outputs, forming an adaptive closed-loop optimization system.

The RL agent acts in a Markov Decision Process (MDP) characterized by a state space st, an action space at, and a reward signal R(st, at). At each step, the agent proposes changes to material or microstructural variables, receives updated transport-level and device-level responses, and adjusts its strategy to maximize long-term thermoelectric efficiency. Fig. 5 shows the hybrid multiscale RL architecture used for the global optimization of thermoelectric materials.

Figure 5: Multiscale reinforcement-learning optimization framework integrating quantum descriptors, mesoscale features, ML-predicted transport properties, continuum FEM outputs, and PPO-based policy learning.

The RL agent acts on this state through repeated actions that adjust the grain morphology, nanoparticle properties, roughness parameters, and compositional variables. Using the PPO objective, the agent updates its policy to maximize the thermoelectric figure of merit ZT while penalizing uncertainty in the surrogate model. The closed loop enables uncertainty-constrained, physics-consistent sampling of the high-dimensional design space and provides a scalable computational framework for optimizing complex nanostructured thermoelectric systems.

3.5.1 State Definition

The state representation of the reinforcement-learning (RL) environment is structured to capture the multiscale signatures of the evolving design. Rather than relying only on macroscopic descriptors or low-dimensional material properties, the state vector st collects a wide range of physical, computational, and uncertainty-related quantities that bear on the quality of design choices, as defined in Eq. (6):

$$s_t = \{X_q,\ X_m,\ \sigma,\ S,\ \kappa,\ ZT,\ U\} \tag{6}$$

where U is surrogate model uncertainty. Formally, each state includes:

•   Quantum Descriptors (Xq): These represent the band structure characteristics, phonon properties, and electron–phonon interaction terms obtained from first-principles calculations. By incorporating these features, the RL agent retains sensitivity to fundamental material physics.

•   Mesoscale Microstructural Features (Xm): The grain-size distribution, interface roughness measures, nanoparticle properties, and porosity measures are all encoded. These parameters govern phonon scattering and hence directly affect thermal conductivity and carrier mobility.

•   ML-Predicted Transport Properties: σ(T), S(T), κlattice(T), and κelectron(T) are computed by the surrogate models at different temperatures, so that the RL agent knows how the material will behave under operating conditions.

•   Continuum Output Variables: FEM-calculated temperature gradients, current-density fields, and power-output values are added to the state so that the RL policy can learn how microstructural and compositional variations affect actual device performance.

•   Uncertainty Estimates (U): Because surrogate predictions carry predictive uncertainty, this term measures the confidence that can be placed in them. Including it directly in the state steers the RL agent away from high-uncertainty regions of the design space and makes the resulting solutions more reliable.

•   Historical Performance Indicators: A rolling average of past rewards and a stability index over the most recent design iterations promote smooth policy updates and prevent oscillation.

This multi-component state architecture is one of the major novelties of the framework. Rather than reducing the problem to a few macroscopic variables, the RL environment includes information from every scale, from atomic-level descriptors up to device-level FEM outputs, which enables the agent to devise strategies that respect the physics while optimizing for goals that are achievable in practice.
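One possible way to hold the state of Eq. (6) in code is a small data class that flattens into the vector a policy network would consume. The class name, field names, and all numerical values below are hypothetical; the continuum outputs and historical indicators listed above are omitted to keep the sketch short.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DesignState:
    """Container for the components of s_t in Eq. (6)."""
    x_q: List[float]      # quantum descriptors X_q
    x_m: List[float]      # mesoscale descriptors X_m
    sigma: float          # electrical conductivity (S/m)
    seebeck: float        # Seebeck coefficient (V/K)
    kappa: float          # total thermal conductivity (W/m/K)
    zt: float             # figure of merit
    uncertainty: float    # surrogate uncertainty U

    def as_vector(self) -> List[float]:
        """Flatten the state into one list of floats."""
        return [*self.x_q, *self.x_m, self.sigma, self.seebeck,
                self.kappa, self.zt, self.uncertainty]


# Illustrative values only.
state = DesignState(x_q=[0.3, -1.1], x_m=[80.0, 0.05, 0.12],
                    sigma=1.0e5, seebeck=200e-6, kappa=1.5, zt=1.6, uncertainty=0.5)
vector = state.as_vector()
```

In practice the entries would be normalized before being passed to the actor and critic networks.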

3.5.2 Action Space Definition

The action vector lists the parameters that the RL agent can change to manipulate the transport properties. Each action corresponds to a physically meaningful modification of the microstructure, composition, or mesoscale geometry. The action space is continuous, which permits fine-grained adjustments rather than binary choices.

The action space is defined in (7):

$$a_t = \{\Delta d,\ \Delta r_{np},\ \Delta\phi_{np},\ \Delta h_{rough},\ \Delta x_{dop},\ \Delta L_{sl},\ \Delta g_{geom}\} \tag{7}$$

where Δd is the change in mean grain size; Δrnp is the change in nanoparticle radius or aspect ratio; Δϕnp is the change in nanoparticle volume fraction or clustering density; Δhrough is the change in interface-roughness amplitude; Δxdop is the change in dopant concentration or distribution; ΔLsl is the change in superlattice period or barrier thickness (where relevant); and Δggeom denotes minor changes in TE leg geometry (aspect ratio, contact thickness). These modifications correspond to realistic processing and fabrication variables, so that RL-generated designs are physically and technologically realistic.
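A continuous action of the form in Eq. (7) can be applied to a design as a set of increments, with each parameter kept inside a physically sensible range. The bounds, parameter names, and values below are assumptions chosen for illustration, not limits reported in this study, and only four of the seven action components are shown.

```python
def apply_action(design, action, bounds):
    """Add the increments in `action` to `design`, clipping to `bounds`."""
    updated = {}
    for name, value in design.items():
        low, high = bounds[name]
        updated[name] = min(max(value + action.get(name, 0.0), low), high)
    return updated


# Hypothetical bounds: grain size and radius in nm, volume fraction, dopant fraction.
bounds = {"d_nm": (20.0, 500.0), "r_np_nm": (2.0, 30.0),
          "phi_np": (0.0, 0.15), "x_dop": (0.0, 0.05)}
design = {"d_nm": 80.0, "r_np_nm": 10.0, "phi_np": 0.05, "x_dop": 0.01}
action = {"d_nm": -70.0, "phi_np": 0.02}  # parameters not listed stay unchanged

new_design = apply_action(design, action, bounds)
```

Here the requested grain-size change would take d below its lower bound, so the result is clipped to 20 nm; this keeps every RL proposal within a range that could plausibly be fabricated.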

Most previous TE optimization works are restricted to simple composition tuning. Here, the RL agent can manipulate the multiscale structural parameters directly, which opens the possibility of co-optimizing material physics and microstructural geometry.

3.5.3 Policy Learning and Optimization Algorithm

The Proximal Policy Optimization (PPO) algorithm is used to learn an optimal policy πθ(at ∣ st). PPO is selected because it is stable in continuous action spaces and well suited to high-dimensional physical design spaces. The clipped PPO objective is given in Eq. (8):

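$$L(\theta) = \mathbb{E}_t\left[\min\left(r_t(\theta)\,A_t,\ \mathrm{clip}\left(r_t(\theta),\,1-\epsilon,\,1+\epsilon\right)A_t\right)\right] \tag{8}$$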

where rt(θ) = πθ(at ∣ st) / πθold(at ∣ st) is the probability ratio between the new and old policies, At is the advantage function derived from temporal-difference estimates, and ϵ limits excessive policy updates for training stability [15,17].
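The clipped objective of Eq. (8) is easy to evaluate on a small batch. The sketch below computes it from log-probabilities and advantages in plain Python; the function name, the default ϵ = 0.2, and the two-sample batch are illustrative assumptions rather than settings from this study.

```python
import math


def ppo_clip_objective(logp_new, logp_old, advantages, eps=0.2):
    """Batch mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)  # r_t = pi_new / pi_old
        clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Two samples: ratio 1.5 with positive advantage, ratio 0.5 with negative advantage.
objective = ppo_clip_objective(
    logp_new=[math.log(1.5), math.log(0.5)],
    logp_old=[0.0, 0.0],
    advantages=[1.0, -1.0],
)
```

In the first sample the gain is capped at 1.2 instead of 1.5, and in the second the pessimistic value −0.8 is kept instead of −0.5, so a single update cannot move the policy too far from the old one.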

The actor–critic architecture used consists of:

•   Actor network: two hidden layers (128–256 units) with tanh activation.

•   Critic network: identical structure but outputs a scalar value estimate V(st).

Entropy regularization is introduced to encourage exploration.

Training runs for 200–300 episodes, each with 150–200 interaction steps with the environment. Parallel rollouts reduce the training time considerably. A distinctive contribution is that the surrogate uncertainty (U) enters the policy-training loop directly. This penalizes regions where the ML surrogate is untrustworthy and steers the agent towards reliable designs.

3.5.4 Reward Function Design

The reward function directs the RL agent to microstructural and compositional structures that lead to optimum thermoelectric figure of merit (9):

$$ZT = \frac{S^{2}\,\sigma\,T}{\kappa_{\mathrm{lattice}} + \kappa_{\mathrm{electron}}} \tag{9}$$

Using the ZT of Eq. (9), the reward at each step is defined in Eq. (10):

$$R(s_t, a_t) = ZT(s_t) - \alpha\,U(s_t) \tag{10}$$

where ZT(st) is calculated using surrogate predictions and FEM simulations, U(st) is the predictive uncertainty, and α (0.1–0.3) is a hyperparameter that governs the uncertainty penalty. The formulation rewards high-performance designs and deters designs that lie in uncertain or poorly sampled regions of the latent space of the surrogate model. A purely performance-based reward would encourage the RL agent to exploit untrustworthy surrogate predictions [12,16,29]. Including the uncertainty term yields risk-aware optimization, a well-known requirement in scientific applications of ML.

Constraints on σ, S, and κ are not enforced directly; they emerge from the ZT maximization, while physical consistency is handled through the FEM energy-balance check and uncertainty through the reward formulation.
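Eqs. (9) and (10) translate directly into code. The sketch below is a plain evaluation of the two formulas; the material values, the uncertainty U = 0.5, and the choice α = 0.2 (inside the stated 0.1–0.3 range) are illustrative assumptions.

```python
def figure_of_merit(seebeck, sigma, temperature, kappa_lattice, kappa_electron):
    """ZT = S^2 * sigma * T / (kappa_lattice + kappa_electron), SI units."""
    return seebeck ** 2 * sigma * temperature / (kappa_lattice + kappa_electron)


def reward(zt, uncertainty, alpha=0.2):
    """Uncertainty-penalized reward R = ZT - alpha * U."""
    return zt - alpha * uncertainty


# Illustrative values: S = 200 uV/K, sigma = 1e5 S/m, T = 600 K,
# kappa_lattice = 1.0 W/m/K, kappa_electron = 0.5 W/m/K.
zt = figure_of_merit(200e-6, 1.0e5, 600.0, 1.0, 0.5)
r = reward(zt, uncertainty=0.5, alpha=0.2)
```

With these numbers ZT is 1.6 and the reward is 1.5: a design with the same ZT but a less trusted surrogate prediction would score lower.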

3.6 Algorithmic Framework for Multiscale Thermoelectric Optimization

Algorithm 1 combines quantum-level computation, mesoscale microstructure generation, machine-learning surrogate modeling, continuum-scale FEM computation, and reinforcement-learning optimization into a unified procedure for optimizing thermoelectric performance. The algorithm also separates the offline computational steps (DFT calculations, data generation, and surrogate-model training) from the online adaptive optimization loop driven by reinforcement learning [1,10,25,31].

images

Algorithm 1: Integrated multiscale optimization algorithm combining first-principles descriptors, synthetic microstructures, surrogate transport prediction, continuum thermoelectric simulation, and reinforcement-learning policy updates for maximizing the thermoelectric figure of merit.

Beyond the tiered validation steps, an integrated cross-scale consistency check was performed to ensure that the improvements identified by the RL agent translate coherently across the quantum, surrogate, and continuum layers. Including this final consistency layer significantly strengthens the methodological credibility and aligns with best practices in high-impact computational materials research. Fig. 6 summarizes the complete multiscale optimization algorithm, showing offline stages (DFT descriptor extraction, synthetic microstructure generation, dataset construction, surrogate-model training) and online reinforcement-learning loops. Embedded mathematical blocks highlight the PPO policy update, the uncertainty-aware reward formulation, and the figure-of-merit computation used for thermoelectric design improvement.

Figure 6: Hybrid multiscale algorithmic workflow integrating quantum descriptors, mesoscale microstructure generation, ML surrogate learning, continuum FEM simulation, and reinforcement-learning optimization.

3.7 Validation Strategy

The validity of the proposed multiscale framework rests on strict validation of every stage, so a multi-layered verification process was implemented to ensure the physical fidelity and numerical stability of the quantum, surrogate, continuum, and reinforcement-learning components. The first validation level concerns the machine-learning surrogates: predictions of the electrical conductivity, Seebeck coefficient, and lattice and electronic thermal conductivities on held-out test samples are compared with reference DFT-BTE datasets. The parity plots for each transport parameter show tight clustering around the ideal diagonal and narrow 95% confidence bounds, indicating that the surrogate models remain accurate for unseen combinations of microstructural and quantum descriptors. The second level validates the continuum finite-element simulations: the temperature fields, current-density fields, and device power-generation values obtained from Eqs. (4) and (5) are checked against experimentally determined values for the standard thermoelectric materials Bi2Te3 and PbTe. Mesh-refinement studies confirm numerical stability on refined grids, with variations below 1.5%, and energy-balance tests confirm internal consistency between Joule heating, Peltier effects, and net thermal flux. The third and most stringent level concerns the reinforcement-learning optimiser: RL-generated designs with high predicted ZT are re-assessed with full FEM simulations, avoiding surrogate shortcuts, to ensure that the performance gains are not artefacts of model approximation. Additional robustness tests slightly perturb the optimized microstructural parameters to confirm that the gains persist under plausible fabrication uncertainties.
Taken together, these validation procedures support the claim that the proposed multiscale framework captures the coupled physics governing thermoelectric behavior consistently across all simulated scales and learning components, and thus provides a reliable basis for the optimization results presented later [3,9,11,18].

Physical consistency across the quantum, mesoscale, and continuum scales is established through physics-informed quantum descriptors, surrogate models trained with auxiliary DFT and FEM objectives, and feedback between the FEM simulations and reinforcement learning. This design suppresses unphysical drift at scale boundaries and helps ensure that the optimized microstructures satisfy device-level energy balance and transport laws. Although the current work is validated computationally across several scales, direct experimental validation of the optimized microstructures is not possible at this stage and is regarded as an important direction for future collaborative work.

4  Results

4.1 Continuum-Scale TE Leg Simulation with ML-Coupled Properties

To test whether the surrogate-enhanced FEM solver reproduces physically consistent thermoelectric responses, temperature and current-density fields were solved for a range of microstructural variants. Fig. 7 presents the ML-coupled FEM outputs for a nanostructured p-type leg. The heat-flux lines transition smoothly from the 780 K hot side to the cold-side interface at 300 K, and the current density concentrates in the high-mobility channel in the middle of the leg. The absence of artificial oscillations suggests that the smooth σ(T), S(T), and κ(T) functions introduced in our workflow effectively eliminate numerical stiffness, an issue noted previously in DFT-based TE solvers [1,6,15].

Figure 7: ML-coupled FEM predictions showing (a) temperature field T(x, y, z) and (b) current-density distribution J(x, y, z) in a nanostructured TE leg under steady-state operation.

This field-level consistency is essential because the reinforcement-learning model relies on thousands of surrogate-inference and FEM cycles. Across 40 randomized microstructures, the solver kept the variance in thermal-gradient predictions at 1.2%, which confirms resilience to the microstructural variation that tends to destabilize continuum TE models [30].

4.2 Surrogate Model Accuracy and Cross-Scale Consistency

The ability of the surrogate model to reproduce the transport properties of unseen microstructures was compared with recent ML-based thermoelectric prediction models. Table 2 presents the relative accuracy for σ, S, and κ against previous state-of-the-art models such as StarryData2-DL [9], the interpretable lattice-κ predictor [8], and transformer-aided ZT estimators [30]. Our surrogate consistently exceeds these baselines, especially for Seebeck prediction, where the physically informed auxiliary losses yield a large error reduction. These gains are consistent with recent findings that multi-objective ML constraints improve the generalization of TE transport models [2,10]. The cross-scale consistency tests, in which the surrogate results were re-assessed against a reduced set of BTE computations, produced deviations below 3.7%, well within acceptable TE modeling standards [6,29].

images

4.3 RL-Driven Optimization of Microstructure and Composition

Once the surrogate-FEM pipeline was validated, the reinforcement-learning engine was engaged to explore the multiscale design space. The agent progressively identified microstructures with a favorable balance between phonon scattering and carrier mobility. Fig. 8 shows the reward curve over 250 training episodes. The RL policy improved consistently, yielding a 27.6%–34.2% increase in predicted ZT depending on the starting microstructure. Earlier AI-based TE optimization efforts reported similar or smaller improvements (around 18%–22%) [1,3,15]; notably, the present approach combines physically based surrogates with uncertainty-sensitive RL.

Figure 8: RL training curves showing reward progression and convergence behavior over 250 episodes.

4.4 Reinforcement-Learning-Driven Global Material Optimization

The reinforcement-learning (RL) module was instrumental in navigating the complex, high-dimensional design space that determines thermoelectric performance. Conventional grid-based searches, single-parameter perturbations, and temperature-parameter sweeps tend to collapse the search into a small set of previously known microstructural motifs. By contrast, the PPO-based RL agent explored combinations of grain morphology, defect densities, nanoparticle statistics, dopant gradients, and interface roughness in a coordinated fashion, making it possible to discover non-obvious design couplings that are deeply embedded in the multiscale physics. Over the 250–300-episode training horizon, the policy shifted gradually towards structured, physics-consistent behavior. The initial episodes showed broad swings in reward, reflecting the agent’s attempts to balance conflicting transport requirements: reducing the lattice thermal conductivity tends to come at the expense of electrical mobility. Once the agent had incorporated feedback from the surrogate-FEM cycle, the reward curve settled onto a steady, non-decreasing course (Fig. 8). An analogous stabilization trend was reported in Ref. [9].

The optimized microstructure preserved distinct features:

(i)   A graded grain-size distribution that disperses phonon bottlenecks more evenly;

(ii)   Anisotropic nanoparticle alignment that enhances back-scattering at elevated temperatures; and

(iii)   A dopant-concentration gradient stabilizing σ(T) near the cold terminal.

These design motifs collectively reflect the RL agent’s ability to internalize thermo-electric trade-offs rather than relying on single-objective heuristics, as shown in Fig. 9.

Figure 9: RL policy convergence showing progressive improvement in predicted ZT across 300 episodes compared with baselines from prior ML-guided TE optimization studies [2,8,9].

The recurrence of structural motifs across independent RL runs, such as graded grain-size distributions, anisotropic nanoparticle alignment, and controlled dopant gradients, indicates genuine motif discovery rather than isolated parameter optimization. These motifs are not explicitly encoded in the action space, yet they recur consistently and persist under fabrication-like perturbations.

To quantify the gain achieved by the RL-optimized configuration, Table 3 compares the best ZT found by the proposed multiscale-RL workflow with the values obtained from representative state-of-the-art TE design tools. The RL-enhanced design exceeds the baselines by 18%–32%, which indicates that coupling multiscale physics with learning improves the ability to explore regions of the design space that would otherwise be inaccessible.

images

The microstructure of the RL-optimal solution has physically intuitive properties: (i) an intermediate grain-size gradient that redistributes phonon-scattering hotspots throughout the volume, (ii) anisotropic clustering of nanoparticles that increases back-scattering around high-temperature hotspots, and (iii) a controlled dopant gradient that preserves the electrical conductivity near the cold-side terminal. Together, these trends indicate that the RL agent learns to encode an intricate set of thermoelectric trade-offs and converts them into microstructural patterns consistent with known TE mechanisms.

The proposed multiscale ML-FEM-RL framework outperforms the representative physics-only, ML-only, and dual-scale thermoelectric optimization methods by about 18%–32% in peak ZT, which shows the benefit of cross-scale optimization.
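
The percentage gains quoted above are relative improvements over a baseline peak ZT. The following minimal sketch shows the arithmetic only. The function name, the dictionary keys, and every numerical value are hypothetical placeholders chosen for illustration; they are not the entries of Table 3.

```python
def relative_improvement(zt_new: float, zt_baseline: float) -> float:
    """Percent gain of zt_new over zt_baseline."""
    if zt_baseline <= 0:
        raise ValueError("baseline ZT must be positive")
    return 100.0 * (zt_new - zt_baseline) / zt_baseline


# Hypothetical peak-ZT values (NOT taken from the paper), used only to
# illustrate how a gain such as "18%-32%" is computed.
baselines = {"physics_only": 1.00, "ml_only": 1.10, "dual_scale": 1.12}
zt_rl = 1.32

gains = {name: relative_improvement(zt_rl, zt) for name, zt in baselines.items()}
for name, gain in gains.items():
    print(f"{name}: {gain:.1f}% higher peak ZT")
```

Reporting the gain per baseline, rather than as one pooled number, makes it clear which comparison sets the lower and upper ends of the range.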

4.5 Cross-Scale Validation and Robustness Assessment

A stringent cross-scale validation protocol was applied to ensure that the RL-assisted improvements were not artefacts of surrogate bias or numerical anomalies. First, all optimized microstructures were re-evaluated with full FEM calculations, without surrogate interpolation, to confirm that gradients, carrier distributions, and heat-flux lines were physically consistent throughout the active domain. The recalculated transport coefficients differed by 4%–6% from the values predicted by the surrogate, which is within the uncertainty range reported in previous computational TE studies such as Moon et al. [29] and Cao et al. [30]. Next, perturbation-based robustness tests were performed by varying the optimized microstructural parameters (grain size, roughness amplitude, nanoparticle volume fraction, and dopant profile) by 5%–10%. The resulting ZT distributions are presented in Fig. 10. Only a narrow band of uncertainty surrounds the optimized value, indicating that the configuration is robust to deviations typical of fabrication. This band is considerably narrower than the variability reported in recent TE microstructural studies [15].

Figure 10: Robustness analysis of the RL-optimised microstructure showing variation in ZT under ±10% perturbations of critical structural parameters.
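
A perturbation test of this kind can be sketched as a simple Monte Carlo loop. The example below is illustrative only: `toy_zt` is a hypothetical smooth response that stands in for the surrogate or FEM evaluation (which is not reproduced here), the four parameters are normalized so that the nominal design is all ones, and uniform sampling of the ±10% range is an assumption rather than the authors' stated protocol.

```python
import numpy as np


def toy_zt(params: np.ndarray) -> float:
    """Hypothetical ZT response peaking at the nominal design (all ones).

    Stand-in for the real surrogate/FEM evaluation, assumed for illustration.
    """
    return 1.3 * float(np.exp(-2.0 * np.sum((params - 1.0) ** 2)))


def robustness_band(evaluate, nominal, rel_perturbation=0.10,
                    n_samples=2000, seed=0):
    """Scale each parameter by a uniform factor in [1 - p, 1 + p] and
    return (nominal ZT, mean ZT, min ZT, max ZT) over the samples."""
    rng = np.random.default_rng(seed)
    nominal = np.asarray(nominal, dtype=float)
    scale = rng.uniform(1.0 - rel_perturbation, 1.0 + rel_perturbation,
                        size=(n_samples, nominal.size))
    zts = np.array([evaluate(nominal * s) for s in scale])
    return evaluate(nominal), zts.mean(), zts.min(), zts.max()


# Normalized parameters: grain size, roughness amplitude,
# nanoparticle volume fraction, dopant-profile slope.
zt0, zt_mean, zt_min, zt_max = robustness_band(toy_zt, np.ones(4))
worst_case_drop = 100.0 * (zt0 - zt_min) / zt0
print(f"nominal ZT = {zt0:.3f}, worst-case drop = {worst_case_drop:.2f}%")
```

In practice, `evaluate` would wrap the full FEM re-evaluation, so the width of the band reflects the physics rather than the surrogate.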

Moreover, the optimized microstructure was compared with a range of known high-performance thermoelectric configurations from the 2024–2025 literature, such as Te-doped α-In2Se3 structures and energy-filtered nanostructured interfaces [31]. The RL-enabled geometry exhibits a similar or better trade-off between electrical and thermal transport, with a significantly higher power factor at intermediate temperatures (550–650 K) and a lower lattice thermal conductivity.

The consistency across scales, from DFT-based descriptors through surrogate-predicted transport to FEM-solved device performance, is a strong indication that the proposed framework is valid. The systematic combination of multiscale physics with uncertainty-aware RL optimization provides a dependable route to microstructures that combine high ZT with fabrication-tolerant stability, which has rarely been achieved in ML-only TE design studies [2,32].

To reduce bias from error propagation or scale-boundary errors, the RL-optimized designs were re-assessed with complete FEM solutions, without surrogate shortcuts. Moreover, uncertainty-aware reward terms penalize the use of poorly sampled surrogate regions, which reduces the chance of spurious cross-scale amplification.
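
One common way to realize such a penalty is to subtract a multiple of the surrogate's predictive spread from its mean prediction. The sketch below assumes an ensemble of surrogate predictions and a linear penalty; the functional form, the `penalty_weight` parameter, and the sample values are assumptions for illustration and are not specified in the text.

```python
import numpy as np


def uncertainty_penalized_reward(ensemble_zt, penalty_weight=1.0):
    """Reward = mean predicted ZT minus penalty_weight * ensemble std.

    The ensemble standard deviation is used here as a proxy for how poorly
    the surrogate is sampled near the candidate design (an assumed choice).
    """
    preds = np.asarray(ensemble_zt, dtype=float)
    return float(preds.mean() - penalty_weight * preds.std())


# Hypothetical ensemble predictions for two candidate designs.
well_sampled = [1.20, 1.21, 1.19, 1.20]    # members agree closely
poorly_sampled = [1.60, 0.90, 1.45, 1.05]  # members disagree strongly

r_good = uncertainty_penalized_reward(well_sampled)
r_bad = uncertainty_penalized_reward(poorly_sampled)
print(f"well sampled: {r_good:.3f}, poorly sampled: {r_bad:.3f}")
```

Although the poorly sampled candidate has the higher mean prediction, its reward is lower, so the agent is steered away from regions where the surrogate cannot be trusted.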

4.6 Ablation Study of the Multiscale Framework

To determine which components of the proposed multiscale pipeline contribute most to its performance, a structured ablation analysis was performed. Each major module was removed individually while all other components were left unchanged. Removing the quantum descriptors significantly degraded the ability of a standard ML surrogate to describe band-edge curvature and electron–phonon scattering: the ZT predictions deviated markedly from DFT-scale trends and the RMSE rose by 28%–34%, which is consistent with previous studies by Fu et al. [1] and Barua et al. [9]. Likewise, removing the mesoscale microstructural features produced models that mis-scaled thermal conductivity across a wide range of grain morphologies, confirming the importance of microstructure scaling previously reported by Barua and Kleinke [3]. The degradation was even more severe when the continuum FEM block was not coupled back to the surrogate. As shown in Table 4, without FEM feedback the RL agent optimized in a vacuum and found designs that worked numerically but violated energy-balance constraints. The resulting 42% drop in ZT is consistent with the cross-scale inconsistency levels reported by Song et al. [15] for segmented TE systems. The RL component had the strongest effect: removing it cost about half of the performance improvement (a ZT improvement of +22%, compared with +63% for the full model), in line with recent RL-based materials optimization studies. These findings show that the framework performs best when all three scales (quantum, mesoscale, and continuum) are active and tightly coupled through ML and RL.

The relationships between variables at different scales are quantified through controlled ablation and perturbation experiments. Removing either the microstructural descriptors or the cross-scale coupling modules systematically reduces transport-prediction fidelity and the optimized ZT, whereas controlled microstructural perturbations produce predictable, measurable changes in the surrogate outputs and FEM-level fields. These results are evidence of causal propagation across scales rather than qualitative correlation.
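
A leave-one-module-out ablation of this kind can be organized as a small harness. The sketch below is a generic illustration: the module names mirror the components discussed in this section, but `toy_pipeline` and its per-module contributions are hypothetical stand-ins for the real training and evaluation runs, and the resulting numbers are not those of Table 4.

```python
from typing import Callable, Dict, FrozenSet

MODULES = ("quantum_descriptors", "mesoscale_features",
           "fem_feedback", "rl_optimizer")


def run_ablation(evaluate: Callable[[FrozenSet[str]], float]) -> Dict[str, float]:
    """Remove one module at a time and return the percent change of the
    metric relative to the full pipeline (negative means degradation)."""
    full = frozenset(MODULES)
    reference = evaluate(full)
    report = {}
    for module in MODULES:
        score = evaluate(full - {module})
        report[module] = 100.0 * (score - reference) / reference
    return report


def toy_pipeline(active: FrozenSet[str]) -> float:
    """Hypothetical stand-in: each active module adds a fixed amount to
    peak ZT. A real run would retrain and re-evaluate the pipeline."""
    contribution = {"quantum_descriptors": 0.20, "mesoscale_features": 0.15,
                    "fem_feedback": 0.30, "rl_optimizer": 0.25}
    return 0.5 + sum(contribution[m] for m in active)


report = run_ablation(toy_pipeline)
for module, change in sorted(report.items(), key=lambda kv: kv[1]):
    print(f"without {module}: {change:+.1f}% peak ZT")
```

Keeping every other component fixed while one module is removed is what allows each change to be attributed to that module alone.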

4.7 Temperature-Dependent Transport Performance of the Optimized Microstructure

To determine whether the RL-optimal configuration retains its advantages across the entire thermoelectric working range, the transport coefficients were evaluated between 300 and 800 K with the surrogate-enhanced FEM solver. Fig. 11 shows the resulting σ(T), S(T), and klattice(T) curves of the baseline and RL-optimized microstructures. A significant enhancement in electrical conductivity and Seebeck behavior is seen even in the mid-temperature regime (500–650 K), where mobility in conventional nanostructures is usually suppressed by phonon-mediated scattering [1,29]. The RL-optimized microstructure has higher carrier mobility because the dopant gradients are controlled, whereas the engineered grain-size gradient amplifies phonon backscattering, resulting in a significant drop in lattice thermal conductivity. The resulting ZT(T) curve in Fig. 11 shows a sharp peak at 610 K that is 31%–36% higher than that of the baseline configuration, in line with the multiscale synergy trends in recent high-performance TE systems [3,32].

Figure 11: Temperature-dependent transport coefficients σ(T), S(T), klattice(T), and resulting ZT(T) for baseline vs. RL-optimized microstructure over 300–800 K.
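
The curves in Fig. 11 are linked by the standard definition of the figure of merit, ZT = S²σT/κ, where κ is the total thermal conductivity. The sketch below evaluates this relation over 300–800 K and locates the peak. The analytic forms of S(T), σ(T), and the lattice thermal conductivity are hypothetical placeholders, not the paper's data, and estimating the electronic contribution to κ with the Wiedemann–Franz law is an assumption made here for completeness.

```python
import numpy as np

LORENZ = 2.44e-8  # W Ohm K^-2, Sommerfeld value of the Lorenz number


def figure_of_merit(T, seebeck, sigma, kappa_lattice):
    """ZT = S^2 * sigma * T / (kappa_lattice + kappa_electronic).

    kappa_electronic is estimated as L * sigma * T (Wiedemann-Franz),
    which is an assumption of this sketch.
    """
    kappa_e = LORENZ * sigma * T
    return seebeck**2 * sigma * T / (kappa_lattice + kappa_e)


T = np.linspace(300.0, 800.0, 501)  # K

# Hypothetical temperature dependences, for illustration only.
seebeck = 200e-6 * np.exp(-((T - 600.0) / 250.0) ** 2)  # V/K
sigma = 1.0e5 * (300.0 / T)                             # S/m
kappa_lattice = 1.0 * (300.0 / T) + 0.3                 # W/(m K)

zt = figure_of_merit(T, seebeck, sigma, kappa_lattice)
t_peak = float(T[np.argmax(zt)])
print(f"peak ZT = {zt.max():.2f} at T = {t_peak:.0f} K")
```

Running the same function on the baseline and the optimized curves gives the temperature-resolved gain directly, as the ratio of the two ZT(T) arrays.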

The main temperature-specific improvements at representative points are summarized in Table 5. The fact that ZT(T) remains stable and monotonic over the entire operating window again confirms the robustness of the optimized configuration, in line with the tolerance ranges determined in previous multiscale TE studies [8].

Taken together, the continuum-scale simulations, surrogate validation, RL-based optimization, robustness analysis, and ablation tests lead to a common conclusion: the proposed multiscale ML-RL framework provides a quantifiable and physically viable improvement in thermoelectric performance. The ML-coupled FEM fields reproduced stable thermal and electrical distributions across varied microstructures, and the surrogate model showed higher cross-scale predictive power than recent state-of-the-art models. Reinforcement learning also discovered high-value regions of the design space with 18%–32% higher ZT than competitive hybrid or data-driven algorithms, and these designs remained stable when the structural parameters were perturbed by ±10%.

5  Discussion

The trends in the multiscale computations reveal a design landscape that is much richer than the one normally exposed by individual ML predictors or classical thermoelectric solvers. The system not only reaches higher ZT values but also shows how microscopic variations, such as grain redistribution, dopant distributions, and anisotropic nanoparticle clusters, translate into mesoscopic and continuum changes that ultimately bring the transport triad of σ, S, and k into balance. These behaviors strongly support the original hypothesis that learning the electronic, phononic, and microstructural interactions at multiple scales simultaneously can reveal combinations that are hard to recognize when working at a single scale. The stable convergence of the reinforcement-learning (RL) agent, even under modulated uncertainty, indicates that the agent was not simply chasing gradients but uncovering design motifs consistent with known thermoelectric physics.

The improvement is clearer when set against recent developments. ML-based screening systems such as StarryData2 and the associated deep-learning pipelines [9,30] are good at bulk prediction, but they do not explicitly embed the evolution of the microstructure. Similarly, lattice-based interpretable models [8] capture individual scattering mechanisms yet are insensitive to topological dopant variations or interface roughness. In contrast, the proposed multiscale-RL framework produces a higher peak ZT as well as more stable performance under perturbation. This cross-scale consistency, confirmed by DFT descriptors, surrogate consistency, and FEM re-assessment, has not been achieved in previous TE optimization studies such as Fu et al. [1] and Barua and Kleinke [3], both of which emphasize the difficulty of transferring microstructural data to device-scale predictions.

There are theoretical implications as well. By showing that high-dimensional TE design problems can be addressed with uncertainty-aware RL, the study supports an emerging idea in scientific ML: physics-constrained learning loops can navigate rough energy landscapes gracefully. In practical terms, this opens automated TE-module discovery pipelines to applications such as waste-heat harvesting, miniaturized electronics, and stacked miniaturized generators.

Nevertheless, some limitations should be noted. Although the microstructures cover a representative space, they are still studied within a controlled set of synthetic and partially reconstructed geometries. Experimental noise, defect clustering, and manufacturing tolerances in real sintered TE legs may cause deviations that are not adequately captured here. More broadly, the work is part of a shift toward a new paradigm in computational materials science in which multiscale physics, machine learning, and autonomous optimization together transform how material design problems are formulated. Instead of searching manually along limited parametric lines, researchers can use an adaptive, fully integrated search engine to uncover thermoelectric designs that balance theory, computation, and practical manufacturability.

6  Conclusion and Future Scope

This paper set out to determine whether a tightly coupled multiscale pipeline, which combines quantum descriptors, synthetic microstructures, ML surrogates, continuum solvers, and a reinforcement-learning optimizer, can explore and expand the design space of thermoelectric materials instead of only searching within its known limits. The results strongly suggest that it can. The framework connected physics and data-driven modeling in a manner that maintained microscopic fidelity, and the RL agent was able to traverse a very large, often misleading design space effectively. Several outcomes stand out. First, the surrogate models achieved higher accuracy than previous TE-prediction baselines, particularly for Seebeck estimation, where conventional models usually fail. Second, the RL module independently discovered graded grain sizes, selectively roughened interfaces, and asymmetric dopant depths, all of which are consistent with known phonon–electron transport mechanisms. Third, the system remained stable under perturbation. Even 10% changes in microstructural characteristics did not destroy performance, which is uncommon in TE microstructural optimization and demonstrates relevance to practical fabrication. These outcomes indicate that the proposed workflow is not just a hypothetical construct; it behaves like an instrument that could be applied in actual design processes.

The next steps follow directly from these limitations. One direct extension would be to incorporate atomistic MD-BTE hybrids for dynamic phonon transport, which would allow the pipeline to respond to time-dependent thermal fields. A second option is to pair the RL engine with generative models, so that microstructures are not only tuned but also generated. A broader goal is to run the framework within experimental loops, where the RL environment is updated in real time with synthesis feedback such as measured grain maps, dopant measurements, and thermal images. If this feedback can be realized, the framework may become an effective discovery engine rather than a computational demonstration.

Acknowledgement: The authors are grateful to Poornima University, Jaipur, and Poornima Institute of Engineering and Technology for providing the computational infrastructure and supporting this research.

Funding Statement: This study was not funded by any specific grant from any public, commercial, or not-for-profit funding organization. Institutional support was provided in the form of computational resources and support for experiments.

Author Contributions: The study was conducted as a team effort. Udit Mamodiya and Indra Kishor conceptualized the study, and Udit Mamodiya led the design of the methodology. Indra Kishor implemented the machine learning pipeline. P. Satish Reddy carried out the FEM-RL integration and conducted the experiments. Harish Reddy Gantla worked on the reinforcement learning module. K. Lakshmi Kalpana conducted the literature research, checked the integrity of the datasets, and contributed proofreading and methodological explanation. Radha Seelaboyina facilitated the interpretation of data, performed the final examination, and enhanced the quality of the scientific discussion. All authors reviewed and approved the final version of the manuscript.

Availability of Data and Materials: The code and datasets produced in this study are made available on reasonable request.

Ethics Approval: Not applicable.

Conflicts of Interest: The authors declare no conflicts of interest.

References

1. Fu CL, Cheng M, Hung NT, Rha E, Chen Z, Okabe R, et al. AI-driven defect engineering for advanced thermoelectric materials. Adv Mater. 2025;37(35):2505642. doi:10.1002/adma.202505642.

2. Li S, Dai S, Xiao S, Yu Z, Wang H, Tian Z. Explainable machine learning-guided design of high-performance thermoelectric materials. J Alloys Compd. 2025;1037(10):182164. doi:10.1016/j.jallcom.2025.182164.

3. Barua NK, Kleinke H. Machine learning predictions of thermopower for thermoelectric material screening. ACS Appl Energy Mater. 2025;8(21):16110–21. doi:10.1021/acsaem.5c02609.

4. Dolgova AN, Goltakov BK, Gabdulsadykova GF. AI in the development of nanomaterials with thermoelectric properties for converting waste heat into electric energy. Ekon I Upravlenie Probl Resheniya. 2024;10/13(151):172–8. doi:10.36871/ek.up.p.r.2024.10.13.020.

5. Liu Y, Mu Z, Hong P, Yang Y, Lin C. Feature mining for thermoelectric materials based on interpretable machine learning. Nanoscale. 2025;17(4):2200–14. doi:10.1039/d4nr03271c.

6. Barua NK, Lee S, Oliynyk AO, Kleinke H. Recent strides in artificial intelligence for predicting thermoelectric properties and materials discovery. J Phys Energy. 2025;7(2):021001. doi:10.1088/2515-7655/adba87.

7. Hu J, Zuo Y, Hao Y, Shu G, Wang Y, Feng M, et al. Prediction of lattice thermal conductivity with two-stage interpretable machine learning. Chin Phys B. 2023;32(4):046301. doi:10.1088/1674-1056/acbaf4.

8. Timilsina MS, Zhu Z, Pandey R, Singh J, Ola O, Sahoo S, et al. Advancement in nanocarbon-based thermoelectric materials: surface modification strategies, efficiency analysis, and applications. J Mater Chem A. 2026;14(7):3813–48. doi:10.1039/d5ta02567b.

9. Barua NK, Lee S, Oliynyk AO, Kleinke H. Thermoelectric material performance (zT) predictions with machine learning. ACS Appl Mater Interfaces. 2025;17(1):1662–73. doi:10.1021/acsami.4c19149.

10. Fan T, Oganov AR. Combining machine-learning models with first-principles high-throughput calculations to accelerate the search for promising thermoelectric materials. J Mater Chem C. 2025;13(3):1439–48. doi:10.1039/d4tc03403a.

11. Santana-Andreo J, Márquez AM, Plata JJ, Blancas EJ, González-Sánchez JL, Sanz JF, et al. High-throughput prediction of the thermal and electronic transport properties of large physical and chemical spaces accelerated by machine learning: charting the ZT of binary skutterudites. ACS Appl Mater Interfaces. 2024;16(4):4606–17. doi:10.1021/acsami.3c15741.

12. Xu Y, Liu X, Wang J. Prediction of thermoelectric-figure-of-merit based on autoencoder and light gradient boosting machine. J Appl Phys. 2024;135(7):074901. doi:10.1063/5.0183545.

13. Li Z, Li M, Luo Y, Cao H, Liu H, Fang Y. Machine learning for accelerated prediction of lattice thermal conductivity at arbitrary temperature. Digit Discov. 2025;4(1):204–10. doi:10.1039/d4dd00286e.

14. Wu Y, Song D, An M, Chi C, Zhao C, Yao B, et al. Unlocking new possibilities in ionic thermoelectric materials: a machine learning perspective. Natl Sci Rev. 2024;12(1):nwae411. doi:10.1093/nsr/nwae411.

15. Song K, Xu G, Tanvir ANM, Wang K, Bappy MO, Yang H, et al. Machine learning-assisted 3D printing of thermoelectric materials of ultrahigh performances at room temperature. J Mater Chem A. 2024;12(32):21243–51. doi:10.1039/d4ta03062a.

16. Parse N, Recatala-Gomez J, Zhu R, Low AK, Hippalgaonkar K, Mato T, et al. Predicting high-performance thermoelectric materials with StarryData2. Adv Theory Simul. 2024;7(11):2400308. doi:10.1002/adts.202400308.

17. Ali Hosseini Khorasani S, Borhani E, Yousefieh M, Janghorbani A. Towards tailored thermoelectric materials: an artificial intelligence-powered approach to material design. Phys B Condens Matter. 2024;685(5):415946. doi:10.1016/j.physb.2024.415946.

18. Hao Y, Zuo Y, Zheng J, Hou W, Gu H, Wang X, et al. Machine learning for predicting ultralow thermal conductivity and high ZT in complex thermoelectric materials. ACS Appl Mater Interfaces. 2024;16(36):47866–78. doi:10.1021/acsami.4c09043.

19. Liu J, Yin Q, He M, Zhou J. Constructing accurate machine learned potential and performing highly efficient atomistic simulation to predict structural and thermal properties: the case of Cu7PS6. Comput Mater Sci. 2025;251(8):113686. doi:10.1016/j.commatsci.2025.113686.

20. Zeng Y, Cao W, Zuo Y, Peng T, Hou Y, Miao L, et al. Accelerating the discovery of materials with expected thermal conductivity via a synergistic strategy of DFT and interpretable deep learning. Mater Futures. 2025;4(4):045602. doi:10.1088/2752-5724/ae08d0.

21. Zheng W, Li X, Wang Q, Chen A. Boosting thermoelectric performance of ferroelectric monolayer α-In2Se3 via strongly enhanced phonon scattering induced by site-specific Te doping. Phys Chem Chem Phys. 2025;27(35):18420–9. doi:10.1039/d5cp01969a.

22. Kil T, Jang DI, Yoon HN, Yang B. Machine learning-based predictions on the self-heating characteristics of nanocomposites with hybrid fillers. Comput Mater Contin. 2022;71(3):4487–502. doi:10.32604/cmc.2022.020940.

23. Alosious S, Jiang M, Luo T. Computation and machine learning for materials: past, present, and future perspectives. MRS Bull. 2025;50(10):1212–24. doi:10.1557/s43577-025-00959-y.

24. Xu P, Jin K, Huang J, Yan Z, Fu L, Xu B. Solution-synthesized nanostructured materials with high thermoelectric performance. Nanoscale. 2025;17(17):10531–56. doi:10.1039/d5nr00333d.

25. Wang Y, Zhong C, Zhang J, Liu J, Hu K, Chen J, et al. Machine learning for predictive design and optimization of high-performance thermoelectric materials: a review. J Mater Inform. 2025;5(3):N–A. doi:10.20517/jmi.2025.18.

26. Kong Y, Li Z, Shi P, Li X, Zhang X, Feng X. High-throughput screening of Janus t-phase TMXY semiconducting materials for thermoelectric applications aided by machine learning. J Mater Chem A. 2026;14(5):2856–70. doi:10.1039/d5ta07151h.

27. Zhu C, Ming H, Jia H, Hu F, Chong F, Hu B, et al. Decoupling thermoelectric parameters by the energy-dependent carrier and phonon scattering based on the nano-structuring interface design. Scr Mater. 2024;242(6554):115933. doi:10.1016/j.scriptamat.2023.115933.

28. Zeng Y, Cao W, Peng T, Hou Y, Miao L, Wang Z, et al. A machine learning-based framework for predicting the power factor of thermoelectric materials. Appl Mater Today. 2025;43(29):102627. doi:10.1016/j.apmt.2025.102627.

29. Moon H, Lee S, Demeke W, Ryu B, Ryu S. Physics-informed neural operators for generalizable and label-free inference of temperature-dependent thermoelectric properties. npj Comput Mater. 2025;11(1):272. doi:10.1038/s41524-025-01769-1.

30. Cao Y, Sheng Y, Li X, Xi L, Yang J. Application of materials genome methods in thermoelectrics. Front Mater. 2022;9:861817. doi:10.3389/fmats.2022.861817.

31. Sajjad U, Ali A, Ali HM, Hamid K. A review on machine learning driven next generation thermoelectric generators. Energy Convers Manag X. 2025;27(1):101092. doi:10.1016/j.ecmx.2025.101092.

32. Lin CM, Khatri A, Yan D, Chen CC. Machine learning and first-principle predictions of materials with low lattice thermal conductivity. Materials. 2024;17(21):5372. doi:10.3390/ma17215372.




Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.