Open Access
ARTICLE
Multi-Granularity Traffic Prediction for Satellite Networks Based on Dynamic Adaptive Graph Modeling
School of Automation, Nanjing University of Science and Technology, Nanjing, China
* Corresponding Author: Li Yang. Email:
Computers, Materials & Continua 2026, 87(3), 82 https://doi.org/10.32604/cmc.2026.077513
Received 10 December 2025; Accepted 28 February 2026; Issue published 09 April 2026
Abstract
Traffic prediction plays a crucial role in the efficient operation of satellite networks. However, the problem is non-trivial to solve because of the resource consumption arising from redundant training of multiple individual prediction models, the dynamic and coupled spatial-temporal relationships of traffic, and the need to maintain accurate traffic proportions. We therefore consider this problem and make the following contributions. First, a multi-granularity traffic prediction framework based on shared feature extraction is designed to jointly predict the total network traffic and the service-specific traffic of satellite networks. This design ensures that both global and per-service predictions benefit from common representations, reduces redundant computations, and lowers overall model complexity. Second, a dynamic adaptive graph with Graph Diffusion Convolution (GDC) and Gated Recurrent Units (GRUs) is proposed to extract the spatial-temporal dependencies of network traffic by fusing the features of population coverage, satellite distances, and historical traffic data. Third, to preserve the proportional relationships of the network traffic, an angle-based loss is employed to minimize the angle deviation between the predicted and ground-truth traffic vectors. Meanwhile, a multi-task loss function is proposed that jointly optimizes the total traffic prediction loss, the service-level losses, and a consistency regularization term to achieve accurate multi-granularity prediction. Numerical results demonstrate that the proposed framework can reduce prediction error and improve correlation for both global and service-level predictions.
Owing to its advantages of global coverage and random access, the satellite network has gained increasing attention in recent years [1,2]. While traditional terrestrial networks can offer stable and high-speed communication services, they are geographically constrained and incapable of covering remote areas. The satellite network is therefore viewed as a critical supplement and extension to terrestrial communication systems [3,4]. Many companies and countries are launching initiatives to develop satellite networks. Typical examples include the Starlink LEO broadband constellation operated by SpaceX in the United States [5], the OneWeb constellation governed by Eutelsat in France [6], and the Telesat Lightspeed network managed by Telesat in Canada [7].
To provide high-quality service in a satellite network, it is necessary to obtain the multi-granularity traffic status of the network. Multi-granularity traffic is defined at two levels. The first granularity is global-level traffic, which aggregates the total load across all services on network nodes or links. Network traffic reflects the real-time workload variations of nodes and links and serves as a key metric for evaluating network performance. Accurate global traffic prediction enables network operators to perform effective resource allocation and operational scheduling to reduce congestion and minimize resource wastage [8,9]. In addition, it can assist in detecting abnormal activities within the network [10]. The second granularity is service-level traffic, which is obtained by disaggregating the total network traffic according to service or application type (e.g., video streaming, IoT telemetry, data transfer). Coarse-grained global traffic prediction alone obscures inter-service resource competition, makes it hard to identify resource bottlenecks for specific services, and can lead to a lack of targeted optimization strategies. Service-level prediction supports fine-grained quality of service management, enforcement of service level agreements, and priority scheduling for latency-sensitive or bandwidth-critical applications. By capturing both coarse-grained and fine-grained traffic patterns, multi-granularity prediction allows network operators to perform end-to-end performance optimization, efficiently virtualize resources across beams, and adapt routing or caching strategies in real time. Therefore, it is of great practical interest to develop methods to predict the multi-granularity traffic in satellite networks.
Satellite network traffic prediction methods have evolved from those of ground networks and can be broadly divided into the following categories. One category is model-driven methods. Traditional time-series models, such as ARIMA, can effectively capture the short-range dependence of network traffic [11]. Han et al. [12] combined wavelet transform with the ARIMA model to address the self-similarity problem caused by multiple traffic sources in satellite communication systems. While the Fractal ARIMA model captures long-range dependence, it has high complexity and large parameters [13]. Due to the limitations of a single model, many scholars advocate for hybrid prediction models. These models decompose network traffic data using techniques such as wavelet algorithms, simplifying the data so it can be adapted to more prediction models, thus reducing complexity and improving prediction accuracy [9]. The model-driven methods are favored for their simplicity and ease of implementation. However, LEO satellite traffic exhibits strong spatial-temporal coupling due to satellite mobility, frequent link switching, time-varying coverage, and abrupt demand changes triggered by user behaviors and unexpected events. These traffic patterns are often nonlinear and non-stationary. Traditional statistical models therefore struggle to capture traffic evolution under rapidly changing network conditions. To better match satellite traffic characteristics, data-driven methods, especially artificial intelligence (AI)-based approaches, have attracted increasing attention. AI-based models can learn nonlinear mappings and extract spatial-temporal dependencies from data. This capability improves prediction accuracy in highly dynamic environments. In addition, satellite traffic prediction often requires information beyond historical traffic, such as inter-satellite distance or topology dynamics and coverage-related indicators. These features differ in modality, scale, and semantic meaning. 
Conventional handcrafted or linear models usually fail to fuse them effectively. AI-based representation learning maps multi-source heterogeneous features into a shared feature space and learns their correlations. This enables the model to capture the coupled effects among different traffic-related factors more effectively. To enhance the capability of modeling complex traffic patterns, researchers have focused on machine learning methods, such as using Support Vector Machines (SVMs) to build nonlinear traffic prediction models [14]. Deep learning methods offer powerful feature extraction capabilities and have been widely applied in various fields [15]. Some studies have proposed long short-term memory (LSTM) and Gated Recurrent Unit (GRU) traffic prediction models [16,17]. Attention weights are introduced to balance the impact of each component of the input sequence on the output [18]. Additionally, a multi-dimensional time-series feature neural network has been constructed to capture short-term dependencies using LSTM while reinforcing the focus on long-term dependencies with attention mechanisms [19]. Traffic in networks is influenced not only by temporal factors but also by spatial factors, such as network topology. Recent studies construct graph structures to model the interdependencies among nodes. Graph Neural Networks (GNNs) have advantages in capturing the spatial-temporal relationships of network traffic [17,20]. Some work has also proposed a hybrid prediction model combining Graph Convolutional Networks (GCNs) and GRUs to jointly extract spatial-temporal features of satellite network traffic [17].
Existing research on traffic prediction has predominantly focused on single-granularity prediction at the global node or link level, with relatively little exploration of multi-granularity prediction. However, multi-granularity prediction can enable network operators to simultaneously grasp the overall network load and fine-grained service composition, thereby supporting refined network management. Preliminary studies have emerged in related fields, such as multi-task prediction and multi-service prediction. Nie et al. [14] proposed a multi-task learning framework for single-granularity traffic prediction, where the main task predicts the future traffic matrix and the auxiliary task predicts the current link load. LSTM networks model statistical correlations among traffic matrix elements, and a linear relationship from a fixed routing matrix constructs the auxiliary task’s supervision signal. The model is optimized by jointly minimizing the mean squared error of both tasks. Building upon this work, Wang et al. [21] introduced Convolutional Neural Network (CNN) modules to capture spatial features. Zhang et al. [22] proposed a mobile traffic prediction method based on a Sequence-to-Sequence architecture and ConvLSTM for single-task multi-service traffic prediction. It treats multi-service traffic data as multi-channel sequences, feeds them into an encoder-decoder architecture with ConvLSTM layers, and uses mean squared error for training to realize end-to-end prediction of future multi-service traffic. TransMUSE [23] first clusters edge nodes based on similarity of traffic statistical characteristics and services based on Wasserstein distance. Subsequently, a Transformer-based multi-service traffic prediction method is trained separately for each service cluster. This model captures temporal dependencies in service traffic through a multi-head attention mechanism and does not consider spatial dependencies. 
However, these preliminary explorations exhibit clear limitations. First, they are often confined to multi-task optimization or multi-service prediction within a single granularity. Second, they lack adaptation to the dynamic spatial-temporal characteristics inherent to satellite networks. Third, they fail to ensure the intrinsic coordination among traffic flows across different granularities. This paper mainly explores a unified framework to achieve multi-granularity traffic prediction that can capture the dynamic spatial-temporal characteristics of satellite networks while maintaining consistency across granularities.
Based on the above discussion, this problem is non-trivial to solve due to the following difficulties.
First, resource consumption arises from redundant training of multiple individual prediction models. Traditional methods primarily perform single-granularity prediction, so independent models must be designed and trained to predict both total network traffic and service-level traffic. Training each model separately brings substantial redundant computational overhead. Moreover, although prediction models at different granularities have different objectives, they may rely on common features. With independent models, these latent representations cannot be shared or reused, which wastes additional data and computational resources.
Second, the spatial-temporal relationship of traffic is dynamic and coupled. In the spatial dimension, the communication links and network topology between satellites change over time. The traffic demand of a satellite propagates through inter-satellite links (ISLs) and affects the load distribution of neighboring satellites. This propagation establishes spatial correlations between the satellites. In the temporal dimension, traffic demands exhibit clear periodic fluctuations and trends due to factors such as the periodic movement of satellites in orbit, Earth’s rotation, and user activity patterns. The spatial and temporal correlations coexist, are coupled, and change dynamically.
Third, accurate traffic proportions must be maintained. Existing traffic prediction methods typically prioritize improving the prediction accuracy of individual traffic elements such as links or nodes. However, they focus solely on absolute traffic values and fail to preserve the proportional relationships among traffic elements during prediction [24]. This limitation can lead to suboptimal network optimization, particularly in tasks such as traffic load scheduling, bandwidth allocation, and routing decisions. Therefore, the ability to maintain traffic proportional relationships has become increasingly important.
Based on the above discussions, we consider this problem in the paper and make the following contributions.
• A multi-granularity traffic prediction framework based on shared feature extraction for satellite networks is proposed to enable the collaborative prediction of total network traffic and service-specific traffic. This framework ensures that both global and per-service traffic predictions leverage common feature representations, which effectively reduces redundant computations, lowers overall model complexity, and mitigates inefficiencies from independent model training.
• A dynamic graph integrated with Graph Diffusion Convolution (GDC) and Gated Recurrent Units (GRUs) is constructed. It fuses population coverage, satellite distances, and historical traffic data to build adaptive adjacency matrices, thereby extracting the spatial-temporal dependencies of network traffic. This design ensures the validity of spatial-temporal feature representations by integrating multiple features and continuously updating the graph structure.
• To preserve the proportional relationship of network traffic, an angle-based loss function is employed to minimize the angle deviation between predicted and ground-truth traffic vectors. A multi-task loss function is further designed to jointly optimize total traffic prediction loss, service-level traffic losses, and a consistency regularization term, yielding improved accuracy for multi-granularity traffic prediction.
The rest of this paper is organized as follows. The problem formulation is presented in Section 2. Section 3 presents the proposed multi-granularity spatial-temporal model. The numerical results are discussed in Section 4. The conclusion is given in Section 5.
As illustrated in Fig. 1, a satellite network composed of N satellite nodes is considered. Each satellite establishes communication links with ground regions through directional beams and periodically provides data transmission services to ground users. The ground is divided into a regular grid, where each grid cell represents a geographical node containing various communication entities such as base stations, aircraft, ships, and residential areas. Each ground node is associated with a specific level of population density or traffic demand. As a result, the overall traffic distribution is dynamic in both spatial and temporal dimensions. At each time step

Figure 1: System description of the satellite network.
Multi-granularity traffic
The state of inter-satellite communications is captured by the total traffic matrix
where
To capture fine-grained traffic behaviors while preserving a unified network view, the total traffic matrix at time
where
To predict total satellite traffic at the next time step, the historical traffic data should be used. We consider the latest historical data for input which is represented as follows,
where T denotes the time window of the latest data.
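The windowed input described above can be illustrated with a minimal sketch. It assumes the traffic history is stored as a list of per-time-step matrices; the function name make_windows and the toy data are hypothetical and are not taken from the paper.

```python
def make_windows(series, T):
    """Pair each window of the T most recent snapshots with the next snapshot.

    series: list of per-time-step traffic matrices (nested lists).
    Returns (inputs, targets), where inputs[k] holds T consecutive snapshots
    and targets[k] is the snapshot that immediately follows them.
    """
    if T <= 0 or T >= len(series):
        raise ValueError("T must satisfy 0 < T < len(series)")
    inputs, targets = [], []
    for t in range(T, len(series)):
        inputs.append(series[t - T:t])   # snapshots t-T, ..., t-1
        targets.append(series[t])        # snapshot to be predicted
    return inputs, targets


# Toy history: five 2x2 matrices, entry [i][j] is the load from node i to node j.
snapshots = [[[0.0, float(t)], [2.0 * t, 0.0]] for t in range(5)]
X, y = make_windows(snapshots, T=3)
print(len(X), len(X[0]))  # number of samples, window length
```

The same helper could be applied to the per-service sequences, the distance sequences, and the served-population sequences, since all of them use the most recent window as input.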
Similarly, for the
To perform multi-granularity traffic prediction, the macro-level and micro-level historical sequences are jointly organized as a unified multi-granularity input which is represented as follows,
Dynamic inter-satellite distance
Due to continuous orbital motion, inter-satellite distances vary in real time. These variations directly affect signal quality and link stability. As a result, routing strategies and traffic flows must dynamically adapt to the changing spatial topology [25]. When the distance between two satellites approaches the maximum communication range, available bandwidth can drop sharply or be lost altogether. As a result, traffic is rerouted via alternative paths. In contrast, when inter-satellite distances become shorter, the resulting improvements in link quality and throughput can effectively alleviate existing network congestion. Therefore, distance is considered an important factor affecting satellite traffic. It is further used as an auxiliary input to improve the extraction of spatial-temporal dependencies.
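As a rough illustration of how distance affects link availability, the sketch below computes pairwise distances and marks links that fall within range. It assumes satellite positions are given as Cartesian coordinates in kilometers and uses the straight-line Euclidean distance; the 10,000 km maximum range and the function names are hypothetical choices for this example only.

```python
import math


def pairwise_distances(positions):
    """Euclidean distance (km) between every pair of satellite positions."""
    n = len(positions)
    return [[math.dist(positions[i], positions[j]) for j in range(n)]
            for i in range(n)]


def link_available(dist_matrix, max_range_km):
    """A link is usable when the satellites are distinct and within range."""
    n = len(dist_matrix)
    return [[i != j and dist_matrix[i][j] <= max_range_km for j in range(n)]
            for i in range(n)]


# Three satellites on a circle of radius 7000 km (illustrative values).
positions = [(7000.0, 0.0, 0.0), (0.0, 7000.0, 0.0), (-7000.0, 0.0, 0.0)]
D = pairwise_distances(positions)
links = link_available(D, max_range_km=10000.0)  # hypothetical range limit
print(round(D[0][1], 1), links[0][1], links[0][2])
```

Recomputing such a matrix at every time step yields a distance sequence that can accompany the traffic history as auxiliary input.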
The communication distance
where
To assist in predicting satellite traffic at the next time step, the historical inter-satellite distances can be used. We consider the latest historical inter-satellite distances for input which is represented as follows,
The sequence
Dynamic served population
The continuous movement of satellites causes their coverage areas on the ground to shift over time. This results in dynamic changes in both the number of served users and the corresponding traffic load [26]. Traffic demand in satellite networks is closely tied to the size of the covered population. As illustrated in Fig. 1, Sat 1 and Sat 3 cover the same geographical region at different times, which results in a strong correlation between their traffic demands. In particular, when multiple satellites simultaneously or sequentially serve overlapping or adjacent areas, their traffic loads tend to exhibit significant correlation. In regions with high population density, traffic demand across multiple satellites often varies synchronously and tends to exhibit similar fluctuations. The dynamic variation of served population information supports the modeling of spatial-temporal traffic correlations and contributes to improving prediction accuracy. Therefore, population density maps are integrated with satellite beam footprints to derive a dynamic served population representation, which is incorporated as auxiliary input to the prediction model.
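One way to picture the served-population feature is to sum the population of the grid cells that fall inside a satellite's footprint. The sketch below assumes a circular footprint centered on the sub-satellite point, a spherical Earth, and grid cells described by their center coordinates; the helper names, the footprint radius, and the population values are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def served_population(sub_point, footprint_radius_km, cells):
    """Sum the population of grid cells whose centers lie inside the footprint.

    sub_point: (lat, lon) of the sub-satellite point in degrees.
    cells: iterable of (lat, lon, population) tuples.
    """
    lat0, lon0 = sub_point
    return sum(pop for lat, lon, pop in cells
               if great_circle_km(lat0, lon0, lat, lon) <= footprint_radius_km)


cells = [(0.0, 0.0, 1000), (0.0, 5.0, 2000), (0.0, 40.0, 5000)]
pop_a = served_population((0.0, 1.0), 1000.0, cells)   # covers the first two cells
pop_b = served_population((0.0, 38.0), 1000.0, cells)  # covers only the third cell
print(pop_a, pop_b)
```

Evaluating this quantity for every satellite at each time step gives a vector that changes as the footprints move, which is the kind of auxiliary signal the paragraph above describes.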
The satellite coverage region can be approximated as the ground projection of its antenna beam [27]. Assuming a circular beam footprint, the coverage area
where
where
By aggregating the estimates across all satellites, the served population state at time
To assist in predicting satellite traffic at the next time step, the historical served population information can be utilized. We consider the most recent sequence of served population vectors as input, which is represented as follows
The sequence
Based on the above input data, the prediction problem can be represented as follows
where
3 Multi-Granularity Traffic Prediction Model Based on Dynamic Adaptive Graph Modeling
To achieve multi-granularity satellite traffic prediction, a graph-based model architecture is proposed, which consists of a shared spatial-temporal feature extraction module and task-specific prediction branches. Both global traffic and service-level traffic are derived from the same underlying network topology and population dynamics; therefore, a shared feature extractor enables consistent modeling of spatial dependencies and temporal evolution across tasks. This design reduces parameter redundancy and enhances the generalizability of the learned representations. In addition, global and service-level traffic exhibit distinct spatial-temporal characteristics and modeling objectives. Employing a single prediction head for all tasks would compromise the model’s capacity to capture fine-grained traffic variations, which may result in suboptimal performance. Hence, the model adopts a modular architecture comprising one macro branch and multiple micro branches.
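The parameter-sharing idea can be sketched with a toy model in which a single linear layer with ReLU stands in for the shared spatial-temporal extractor and each branch is a linear head. This is only an illustration under those simplifying assumptions; the class name, method names, and weights are hypothetical and do not reproduce the actual architecture.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def count(W):
    """Number of scalar parameters in a weight matrix."""
    return sum(len(row) for row in W)


class SharedBackboneModel:
    def __init__(self, shared_W, macro_W, micro_Ws):
        self.shared_W = shared_W    # stand-in for the shared extractor
        self.macro_W = macro_W      # head for total traffic
        self.micro_Ws = micro_Ws    # one head per service

    def encode(self, x):
        # The same shared weights encode every input stream.
        return [max(0.0, h) for h in matvec(self.shared_W, x)]

    def predict(self, total_x, service_xs):
        total = matvec(self.macro_W, self.encode(total_x))
        services = [matvec(W, self.encode(x))
                    for W, x in zip(self.micro_Ws, service_xs)]
        return total, services

    def num_parameters(self):
        heads = [self.macro_W] + self.micro_Ws
        return count(self.shared_W) + sum(count(W) for W in heads)

    def num_parameters_if_independent(self):
        # Each task would need its own copy of the extractor.
        heads = [self.macro_W] + self.micro_Ws
        return sum(count(self.shared_W) + count(W) for W in heads)


model = SharedBackboneModel(
    shared_W=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    macro_W=[[1.0, 1.0, 1.0]],
    micro_Ws=[[[1.0, 0.0, 0.0]], [[0.0, 1.0, 0.0]]],
)
total, services = model.predict([1.0, 2.0], [[0.5, 0.5], [2.0, -3.0]])
print(total, services, model.num_parameters(), model.num_parameters_if_independent())
```

In this toy setting the shared design uses 15 parameters where three independent models would use 27, which mirrors the redundancy argument made above.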
The overall model parameters
where
where
The proposed multi-granularity traffic prediction model is shown in Fig. 2 which consists of five steps. First, the node feature projection module processes the original multi-source heterogeneous inputs, including traffic, population distribution, and inter-satellite distance. Each modality is transformed by a dedicated weight matrix and nonlinearly projected into a shared latent space to obtain node-level representations from dual perspectives. Second, the node feature embeddings module integrates the projected features through element-wise operations. This produces unified node embeddings that serve as the basis for graph construction. Third, the adjacency matrices construction module computes similarity between node embeddings. It dynamically generates multiple adjacency matrices that reflect spatial relationships from different views and form the dynamic graph structure. Fourth, the spatial correlation modeling module applies graph diffusion convolution. It captures spatial dependencies and aggregates multi-hop neighborhood information to enhance representation capability. Fifth, the temporal correlation modeling module uses gated recurrent units (GRUs) to perform sequence modeling. It fuses the current node input with historical hidden states to extract temporal dynamics. Finally, the model predicts both global traffic and service-level traffic through one macro branch and multiple micro branches. A consistency loss is introduced to enforce alignment between macro and micro predictions, which improves both accuracy and generalization. The details of the model are further introduced below.

Figure 2: The architecture of the multi-granularity traffic prediction model.
3.1 Node Feature Projection and Embedding
Before training and testing the model, all input features are normalized to ensure numerical stability and consistency across different modalities. This preprocessing step allows the model to focus on relative variations rather than absolute magnitudes. Prophet [24] normalizes each traffic matrix at time
and define
where
with
which represents the proportion of total traffic on each link attributed to service
Link traffic variations in a satellite network are driven not only by the current traffic distribution but also by the constellation’s geometric structure and the ground user density. A single feature mapping cannot fully capture the complex interactions among these three modalities. Hence, the node feature projection module processes the original multi-source heterogeneous inputs, including traffic, population distribution, and inter-satellite distance. Each modality is transformed by a dedicated weight matrix and nonlinearly projected into a shared latent space to obtain node-level representations from dual perspectives.
Traffic projection
To further boost expressive capacity, dual-head parallel projections are introduced during traffic mapping: two channels share the same input but employ independent learnable linear mappings to encode both the global relative-load matrix
where
Population projection
To incorporate temporal context into static population information, the normalized population vector
where
Distance projection
Similarly, the normalized inter-node distance matrix
where
The learnable parameters for the node feature embedding module are:
The node feature embeddings module integrates the projected features from different modalities through element-wise operations. This yields unified node representations that serve as the foundation for graph construction. Specifically, each traffic projection is fused with its corresponding distance and population projections via element-wise multiplication
After dual-head fusion, the final node embeddings are generated by vertically stacking the outputs from the two heads, resulting in a unified representation for each node that integrates complementary directional and modal information. Specifically, the global traffic and service-specific embeddings are constructed as follows:
These embeddings represent two complementary views of each node at time
3.2 Dynamic Adaptive Adjacency Matrix Construction
Based on the fused node feature embeddings, the adjacency matrices construction module computes pairwise similarities to capture spatial relationships between nodes. Specifically, a pair of direction-aware similarity matrices is constructed by calculating the inner-product differences between dual-head embeddings. This design leverages the asymmetry and complementarity of dual views to model heterogeneous spatial dependencies more effectively. The choice of inner-product differences over widely used alternatives (i.e., cosine similarity and attention-based methods) is motivated by the unique characteristics of satellite network traffic prediction, as elaborated below.
First, cosine similarity is not adopted because it inherently yields a symmetric matrix. In satellite network link traffic prediction, spatial interactions between nodes exhibit strict directionality and asymmetry; for instance, the data transfer intensity from satellite i to satellite j may be much higher than that in the reverse direction (satellite j to satellite i). A symmetric matrix incorrectly treats such asymmetric interactions as having equal strength, masks the unidirectional transmission characteristic of satellite network traffic, and would prevent the subsequent graph convolution module from capturing the true spatial dependencies. In contrast, inner-product differences create an antisymmetric matrix that quantifies the difference in interaction strength between the two directions in the satellite network.
Second, attention-based methods are not selected despite their adaptability, as they typically introduce additional projections and higher computational overhead. Attention mechanisms rely on learnable query/key projection matrices to implicitly model node dependencies, and these extra projection parameters increase the model’s computational complexity and parameter burden. Since the dynamic graph needs to be updated at each time step, computing attention weights across all node pairs at each step would elevate the cost of dynamic graph construction.
The resulting dynamic adjacency matrices reflect both the similarity and interaction strength between nodes. They enable bidirectional and multi-modal spatial modeling in subsequent graph convolution stages, serving as the structural backbone for dynamic graph construction.
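The construction discussed above can be sketched as follows. The example assumes the similarity is the difference of the two cross inner products between the dual-head embeddings and uses ReLU as the nonlinear activation that filters negative interactions; both choices, as well as the function names, are assumptions made for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def inner_product_difference(E1, E2):
    """S[i][j] = <E1[i], E2[j]> - <E2[i], E1[j]>, so that S[j][i] = -S[i][j]."""
    n = len(E1)
    return [[dot(E1[i], E2[j]) - dot(E2[i], E1[j]) for j in range(n)]
            for i in range(n)]


def directed_adjacency_pair(S):
    """Split the antisymmetric matrix into two non-negative directed graphs.

    ReLU(S) keeps the pairs where direction i -> j dominates, and
    ReLU(-S) keeps the pairs where the reverse direction dominates.
    """
    forward = [[max(0.0, s) for s in row] for row in S]
    backward = [[max(0.0, -s) for s in row] for row in S]
    return forward, backward


# Dual-head embeddings for three nodes (illustrative values).
E1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
E2 = [[0.0, 2.0], [1.0, 0.0], [1.0, 1.0]]
S = inner_product_difference(E1, E2)
A_fwd, A_bwd = directed_adjacency_pair(S)
print(S)
print(A_fwd)
print(A_bwd)
```

Because only inner products and an element-wise activation are involved, rebuilding the graph at every time step requires no additional query or key projections.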
Specifically, for the global traffic embeddings, the head-wise similarity matrix is defined as:
For the
To preserve the directional distinction and filter out negative interactions, a nonlinear activation function
For the
where
3.3 Spatial-Temporal Correlation Modeling
Satellite network traffic exhibits significant spatial-temporal dependencies. The approach that combines Graph Diffusion Convolution (GDC) with Gated Recurrent Units (GRU) can effectively capture the spatial-temporal features of the input sequences [28]. GDC aggregates multi-hop neighborhood information to form spatial representations that reflect dynamic topology. The spatial features are subsequently processed by GRU units to model both short-term and long-term temporal dynamics within a unified framework. Based on this method, we propose a dynamic adaptive graph model that fuses population coverage, inter-satellite distances, and historical traffic data into the adjacency structure. The constructed dynamic graphs are fed into both the GDC and GRU components to extract robust spatial-temporal representations.
Given the input node feature matrix
where C is the number of adjacency matrix channels and L is the diffusion order.
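A minimal sketch of multi-hop diffusion over several adjacency channels is given below. It assumes the output is the sum over channels c and hops l = 0, ..., L of (A_c)^l X W_{c,l}; this particular form, the inclusion of the zero-hop term, and the function names are assumptions for illustration and may differ from the exact formulation used in the model.

```python
def matmul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def diffusion_conv(X, adjs, weights, order):
    """Graph diffusion convolution over C adjacency channels.

    X: N x F_in node features.
    adjs: list of C adjacency matrices, each N x N.
    weights[c][l]: F_in x F_out matrix for channel c and hop l.
    order: diffusion order L (hops 0..L are accumulated).
    """
    n, f_out = len(X), len(weights[0][0][0])
    out = [[0.0] * f_out for _ in range(n)]
    for c, A in enumerate(adjs):
        Z = X                                  # zero-hop term: A^0 X
        for l in range(order + 1):
            out = add(out, matmul(Z, weights[c][l]))
            Z = matmul(A, Z)                   # one more hop: A^(l+1) X
    return out


# Directed chain: node 0 aggregates from node 1, node 1 from node 2.
A = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
X = [[1.0], [2.0], [4.0]]
W = [[[[1.0]], [[1.0]], [[1.0]]]]   # C = 1 channel, hops 0..2, identity weights
print(diffusion_conv(X, [A], W, order=2))
```

With identity weights, node 0 accumulates its own feature, its one-hop neighbor, and its two-hop neighbor, which shows how higher diffusion orders widen the receptive field.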
To incorporate node-specific adaptive transformations, element-wise modulation is applied to the weight pool.
where
The full parameter set of the spatial convolution module is
Given input
where
Both the total-traffic and per-service features are fed into GRUs to generate their respective hidden representations:
These outputs serve as spatial-temporal node embeddings, providing dynamic, service-aware representations for prediction tasks. Each GRU gate is associated with an independent GDC module. Therefore, the parameter set of the GRU-based spatial-temporal modeling is
where
The set of shared model parameters is defined as
which is jointly optimized across both macro-level and micro-level prediction branches to enable consistent spatial-temporal representation learning.
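To make the coupling between graph convolution and the recurrent update concrete, the sketch below implements one step of a GRU whose gates are computed by graph convolutions. A one-hop convolution A X W stands in for the full GDC module, biases are omitted, and the update convention h_new = z * h + (1 - z) * candidate is assumed; all function names are hypothetical.

```python
import math


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def gconv(A, X, W):
    """One-hop graph convolution A X W (simplified stand-in for GDC)."""
    return matmul(matmul(A, X), W)


def concat(X, H):
    """Concatenate input features and hidden state per node."""
    return [x + h for x, h in zip(X, H)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def graph_gru_step(A, X, H, Wz, Wr, Wc):
    """X: N x F input, H: N x D hidden state, each W: (F + D) x D."""
    XH = concat(X, H)
    Z = [[sigmoid(v) for v in row] for row in gconv(A, XH, Wz)]   # update gate
    R = [[sigmoid(v) for v in row] for row in gconv(A, XH, Wr)]   # reset gate
    RH = [[r * h for r, h in zip(rr, hr)] for rr, hr in zip(R, H)]
    C = [[math.tanh(v) for v in row]
         for row in gconv(A, concat(X, RH), Wc)]                  # candidate state
    return [[z * h + (1.0 - z) * c for z, h, c in zip(zr, hr, cr)]
            for zr, hr, cr in zip(Z, H, C)]


A = [[1.0, 0.0], [0.0, 1.0]]
X = [[1.0], [2.0]]
H = [[0.4], [-0.8]]
zero_W = [[0.0], [0.0]]   # with zero weights, gates equal 0.5 and the candidate is 0
print(graph_gru_step(A, X, H, zero_W, zero_W, zero_W))
```

Each gate uses its own weight matrix, which reflects the statement that every GRU gate is associated with an independent convolution module.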
3.4 Multi-Granularity Prediction with Consistency Regularization
The proposed macro-micro dual-branch architecture is designed to balance the trade-off between learning common spatial-temporal dynamics and preserving service-specific characteristics. A key challenge in multi-granularity prediction is the potential conflict between feature patterns of different service types. While services exhibit unique characteristics (e.g., burstiness of video, periodicity of IoT), they are fundamentally governed by common underlying dynamics, such as satellite mobility, topology evolution, and regional user distribution. These common dynamics are captured by a shared spatial-temporal feature backbone. This shared module learns universal representations and prevents redundant modeling of these common factors. Service-specific variations are then isolated and refined by lightweight, independent micro-branch heads. This separation prevents service-unique signals from interfering with the learning of universal spatial-temporal patterns, thereby striking a balance between efficient representation learning and the preservation of service characteristics. Based on the shared spatial-temporal feature extraction, the model generates macro-level predictions for global traffic and micro-level predictions for each service’s traffic. The total traffic at time
For each service type
The parameters of macro prediction branch are defined as
Similarly, the parameters of micro prediction branch are defined as
To capture proportional variations and directional trends in satellite traffic data, an angle loss function [24] is employed as the prediction metric. Given two flow vectors
The loss emphasizes the relative orientation between the predicted and true traffic vectors. By ignoring absolute magnitudes, it enables the model to capture proportional variations and directional trends in traffic patterns. The model is trained by minimizing a composite loss function that includes three components: the global traffic prediction loss, the service traffic prediction loss and a consistency regularization term. The overall loss is defined as
where
The first term
The second term
and the total service loss is computed by
The third term
This term penalizes deviations between the aggregated micro predictions and the total traffic ground truth. It encourages structural alignment between the two prediction levels and helps improve overall prediction accuracy. The entire training process involves sequentially processing historical time steps to extract spatial-temporal representations, generating macro and micro predictions, and optimizing the model parameters via gradient-based learning. At each step, multi-modal features are projected and embedded, similarity-based dynamic graphs are constructed, and spatial-temporal representations are updated recursively. The final predictions are compared with ground-truth data using the composite loss function described above, and the parameters are updated accordingly. The detailed procedure is summarized in Algorithm 1.
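The loss structure described above can be sketched in a few lines. The example assumes the angle loss is the arccosine of the cosine similarity between flattened traffic vectors, that the consistency term applies the same angle loss to the sum of the service predictions and the total-traffic ground truth, and that the two weights are free hyperparameters; these are illustrative assumptions, and the names w_service and w_consistency are hypothetical.

```python
import math


def angle_loss(pred, truth):
    """Angle in radians between two flattened traffic vectors."""
    dot = sum(p * t for p, t in zip(pred, truth))
    norm = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(t * t for t in truth))
    if norm == 0.0:
        return 0.0  # convention assumed here when either vector is all zeros
    cos = max(-1.0, min(1.0, dot / norm))   # clamp against rounding errors
    return math.acos(cos)


def composite_loss(total_pred, total_true, service_preds, service_trues,
                   w_service=1.0, w_consistency=1.0):
    """Total loss + weighted service losses + weighted consistency term."""
    loss_total = angle_loss(total_pred, total_true)
    loss_service = sum(angle_loss(p, t)
                       for p, t in zip(service_preds, service_trues))
    # Aggregate the service predictions link by link, then compare with the
    # ground-truth total traffic.
    aggregated = [sum(vals) for vals in zip(*service_preds)]
    loss_consistency = angle_loss(aggregated, total_true)
    return loss_total + w_service * loss_service + w_consistency * loss_consistency


# A prediction that is off by a constant scale still has (near) zero angle loss.
print(angle_loss([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
# Orthogonal vectors give the angle pi/2.
print(angle_loss([1.0, 0.0], [0.0, 1.0]))
```

Because the angle ignores vector magnitude, a prediction that preserves the proportions among links is not penalized for a uniform scaling error, which is the property the angle-based loss is chosen for.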

3.5 Computational Complexity Analysis
For the single-task single-time-step case, the model’s computational complexity primarily arises from graph convolution operations, Gated Recurrent Unit (GRU) operations, and the final traffic prediction step. Specifically, the complexity of the graph convolution is
Based on the proposed multi-granularity model, we assess its performance in a scenario comprising 66 satellite nodes deployed in low Earth orbit. Each satellite has four neighboring nodes. The simulation duration is set to one month. The detailed hyperparameter settings for model training are presented in Table 1. The proposed traffic prediction model supports two deployment options. First, the model can be deployed on ground stations. In this configuration, satellite traffic data is transmitted via satellite-ground communication links for prediction. Second, the model can be deployed on satellites using an offline training and online inference paradigm. Given the limited on-board resources of satellites, implementing traffic prediction directly in orbit may be challenging, so we surveyed current on-board computing capabilities. For example, each satellite in China’s Three-Body Computing Constellation has an onboard computing capability of up to 744 TOPS, and the constellation targets a total computing power of 1000 POPS, demonstrating significant on-orbit AI compute potential [30]. Additionally, Starcloud-1, launched on 02 November 2025, is an experimental orbiting computation satellite developed by Starcloud, Inc. It carries an NVIDIA H100 GPU and represents a significant advancement in space-based compute capability compared to previous in-orbit systems [31]. The proposed model requires 28.25 MFLOPs for one forward pass of inference. Hence, as the computing power of satellites continues to improve, the proposed algorithm is technically feasible for on-orbit implementation and operation.

A single inference of the proposed traffic prediction model takes only 37.439 ms and predicts the traffic state for the next time slot. Existing routing strategies are typically updated in fixed time slots, so route decisions are refreshed at regular intervals [32,33]. Since the predicted time slot aligns with the routing update slot and the inference time is short, the prediction meets the timing requirements for routing updates.
Due to the lack of publicly available real-world satellite traffic datasets, traffic generation is simulated based on representative characteristics of service distribution. The geographic partitioning strategy in [34] divides the Earth’s surface into
where
The set of ground stations covered by satellite i at time t is denoted as
To support multi-hop routing in the satellite network, a gravity-inspired traffic splitting mechanism is applied. The set
where
This modeling framework incorporates population distribution, temporal variability, spatial topology, and coverage constraints, and results in a synthetic satellite network dataset.
To provide a comprehensive view of the generated dataset’s characteristics, Table 2 presents summary statistics for both total traffic and individual service traffic after normalization. The statistics include the minimum (min), maximum (max), mean, standard deviation (std), 95th percentile (p95), and 99th percentile (p99) for each traffic type. These statistics reveal several key properties that define the inherent challenge of multi-granularity prediction. First, a significant scale difference exists between total and service-level traffic due to distinct normalization strategies. During preprocessing, total traffic undergoes max-normalization, where each element is divided by the maximum traffic value across the entire network at that time step. In contrast, service-level traffic undergoes ratio-normalization, where each service’s traffic is divided by the sum of all service traffic on its corresponding link. Consequently, service-level traffic exhibits mean values 3–5 times larger than those of total traffic. Furthermore, service-level data shows greater intrinsic volatility, with standard deviations 1.6–2.5 times higher than those of total traffic. Second, the statistics highlight the heterogeneous characteristics inherent in each service’s traffic pattern. As shown in Table 2, Service 2 exhibits the most challenging statistical profile. It has the highest mean (0.1412), the highest standard deviation (0.2769), and the most concentrated extreme-value distribution. These characteristics indicate that Service 2 has the most unstable traffic pattern, which makes it inherently the most difficult service to predict. In contrast, Services 1 and 3 show more moderate and similar statistical profiles. Together, these properties establish the fundamental complexity of jointly predicting total network load and fine-grained service-level traffic, which motivates the design of a multi-granularity modeling framework.
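The two normalization strategies described above can be sketched as follows. The data layout is an assumption made for illustration: total traffic is a list of time steps, each holding one value per link, and service traffic is given as one list of per-service values for a single link. The function names are hypothetical.

```python
def max_normalize(total_by_step):
    """Divide each link's total traffic by the network-wide maximum
    observed at the same time step."""
    normalized = []
    for step in total_by_step:
        peak = max(step)
        normalized.append([v / peak if peak > 0 else 0.0 for v in step])
    return normalized

def ratio_normalize(services_on_link):
    """Divide each service's traffic by the sum of all service traffic
    on the same link, so the results are shares that sum to one."""
    link_sum = sum(services_on_link)
    return [v / link_sum if link_sum > 0 else 0.0 for v in services_on_link]
```

For example, `max_normalize([[2.0, 4.0]])` maps the single time step to `[0.5, 1.0]`, while `ratio_normalize([1.0, 1.0, 2.0])` yields the shares `[0.25, 0.25, 0.5]`. The two outputs live on different scales, which is the scale difference the statistics in Table 2 reflect.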

To comprehensively assess model performance, we employ both error-based and correlation-based metrics: MAE (Mean Absolute Error) and RMSE (Root Mean Square Error) are used to measure the numerical deviation between predictions and ground truths. These metrics quantify the absolute and squared errors, respectively, and smaller values indicate better performance. CORR (Pearson Correlation Coefficient) and COS (Cosine Similarity) are used to evaluate the consistency of variation trends between predicted and true traffic patterns. Higher values of these metrics suggest stronger correlation and alignment in the overall distribution.
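For reference, the four metrics can be written in a few lines using their standard textbook definitions. This is a plain-Python sketch over flat vectors; how predictions are flattened or averaged across links and time steps in the experiments is not specified here and is left as an assumption.

```python
import math

def mae(pred, truth):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)

def rmse(pred, truth):
    """Root mean square error."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))

def pearson_corr(pred, truth):
    """Pearson correlation coefficient; returns 0.0 if either vector is constant."""
    mp = sum(pred) / len(pred)
    mt = sum(truth) / len(truth)
    cov = sum((p - mp) * (t - mt) for p, t in zip(pred, truth))
    var_p = sum((p - mp) ** 2 for p in pred)
    var_t = sum((t - mt) ** 2 for t in truth)
    denom = math.sqrt(var_p * var_t)
    return cov / denom if denom > 0 else 0.0

def cosine_sim(pred, truth):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(p * t for p, t in zip(pred, truth))
    denom = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(t * t for t in truth))
    return dot / denom if denom > 0 else 0.0
```

Note that CORR is invariant to shifting and scaling of the prediction, whereas COS is invariant only to scaling, so the two metrics can rank models differently when predictions carry a constant offset.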
4.4.1 Performance Evaluation of the Proposed Model
To evaluate the effectiveness of the proposed model in capturing the spatial-temporal correlations of traffic flow, we conducted comparative experiments against several representative baseline methods, including conventional sequence modeling approaches (e.g., LSTM [16] and GRU [36]), a convolution-based model (TCN [37]), and a graph-based spatial-temporal model (STGCN) [38]. To ensure a fair comparison, all methods were implemented under a unified multi-task prediction framework and trained using our proposed multi-granularity loss function, which enables consistent optimization objectives across different models. As shown in Table 3, the proposed model consistently outperforms all baselines across all evaluation metrics. The proposed model constructs dynamic and adaptive adjacency matrices from multi-source features and uses a parameter-sharing mechanism to extract spatial-temporal features. In particular, our model achieves a cosine similarity of 0.739 for total traffic prediction, which is significantly higher than those of LSTM, GRU, and STGCN. This result demonstrates its superior capability in modeling traffic distribution proportions. LSTM and GRU capture only temporal sequences and ignore spatial connectivity. Their total traffic MAEs exceed 1.0 and RMSEs exceed 2.0. The COS of LSTM is 0.02 lower than that of our model, and the COS of GRU is 0.05 lower than that of our model. These results demonstrate that relying solely on temporal dependencies is insufficient, and that integrating spatial connectivity is essential for accurate traffic prediction. TCN enhances temporal feature extraction but cannot leverage dynamic graph structures. It yields MAEs as high as 5.501 and RMSEs of 13.091. The COS of TCN is 0.13 lower than that of our model. STGCN introduces graph convolutions but relies on a static adjacency matrix and achieves only a CORR of 0.554 and a COS of 0.584. 
In summary, the experimental results demonstrate the effectiveness of the proposed model in learning complex spatial-temporal dependencies and enhancing multi-task traffic prediction performance.

4.4.2 Performance Evaluation of Different Features
This experiment evaluates the performance of various models for traffic prediction in a satellite network, focusing on the impact of population and distance features on the prediction accuracy. The following models are compared:
• TP-T: the traffic prediction model using only traffic data for prediction.
• TP-T-P: the traffic prediction model incorporating traffic data and population features.
• TP-T-D: the traffic prediction model incorporating traffic data and distance features.
• TP-T-P-D: the traffic prediction model using traffic data, population features, and distance features.
The results reveal that integrating population and distance features improves prediction accuracy, especially for tasks with more pronounced spatial-temporal dependencies. From Fig. 3, it is evident that TP-T-P-D consistently outperforms the other models across all tasks, with significant improvements in both CORR and COS metrics. Specifically, the inclusion of population data helps account for regional user distribution, while distance features provide critical geographical context, both of which lead to more accurate predictions. These findings highlight the critical role that population and distance features play in capturing the spatial-temporal dependencies of traffic patterns, especially in satellite networks where both geographical and demographic factors are crucial for accurate forecasting.
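The four ablation variants differ only in which feature groups are fed to the model. A minimal sketch of that selection is shown below. It assumes that the selected per-node feature vectors are simply concatenated; the model itself projects and embeds multi-modal features, so the concatenation step and the names `VARIANTS` and `build_node_features` are illustrative assumptions rather than the actual implementation.

```python
# Feature groups used by each ablation variant (names follow the list above).
VARIANTS = {
    "TP-T": ("traffic",),
    "TP-T-P": ("traffic", "population"),
    "TP-T-D": ("traffic", "distance"),
    "TP-T-P-D": ("traffic", "population", "distance"),
}

def build_node_features(variant, features):
    """Concatenate, node by node, the feature vectors selected by the variant.

    `features` maps a group name to a list with one feature vector per node.
    """
    selected = VARIANTS[variant]
    n_nodes = len(features["traffic"])
    rows = []
    for i in range(n_nodes):
        row = []
        for name in selected:
            row.extend(features[name][i])
        rows.append(row)
    return rows
```

Keeping the variants in a single table makes it straightforward to run all four configurations through the same training and evaluation loop, so that differences in CORR and COS can be attributed to the feature groups alone.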

Figure 3: The effect of population and distance features in traffic prediction.
4.4.3 Performance Evaluation of Multi-Granularity Models
Comparative experiments between single-granularity and multi-granularity models are conducted to assess the effectiveness of multi-granularity joint optimization in integrated satellite network traffic prediction. The single-granularity model is optimized solely based on the prediction loss for one service traffic type, whereas the multi-granularity model further integrates multiple task-specific losses and a consistency regularization term to capture both global traffic trends and fine-grained service characteristics. During testing, the cosine similarity between the predicted and true traffic vectors is calculated for each test-set sample. Fig. 4 presents the resulting distributions as histograms with the corresponding kernel density estimation (KDE) curves for total traffic (a) and for individual services (b–d). The horizontal axis denotes the cosine similarity, where higher values indicate better alignment, and the vertical axis represents the sample density. Superior models exhibit distributions shifted toward higher cosine similarity and reduced density in the lower-similarity region. Compared with the single-granularity model, the multi-granularity model’s distribution is notably shifted toward higher similarity values and shows reduced density in the lower-similarity region, which reflects more accurate traffic predictions. This enhancement is attributed to the multi-granularity framework’s ability to model macro-level temporal dependencies in its main branch, leverage auxiliary task branches for service-level details, and align macro-level and micro-level features via the consistency loss, thereby enabling more comprehensive spatial-temporal correlation modeling.

Figure 4: Comparison of cosine similarity distributions and KDE curves between single-granularity and multi-granularity models.
4.4.4 Generalization Verification of the Proposed Model
To evaluate the generalization capability of the proposed model under various satellite network scenarios, experiments are conducted on two key parameters: constellation size (with

Figure 5: Model performance comparison under different scenario parameters. (a) the CORR and COS values with different constellation sizes (
To validate the model’s adaptability to data uncertainty in real-world scenarios, Gaussian noise was applied to the population coverage features at levels ranging from 0% to 30%. As illustrated in Fig. 6, the model’s prediction performance is robust. The CORR consistently remains within a high range of 0.843–0.879, while COS stays stable between 0.896 and 0.919. Notably, at a moderate noise level of 15%, both CORR and COS reach their peak values of 0.879 and 0.919, respectively. This suggests that an appropriate level of noise may enhance the model’s generalization ability through implicit regularization. Even when noise increases to 30%, performance shows no significant degradation, with CORR at 0.843 and COS at 0.897. This indicates that the model’s decision-making does not overly rely on the precise values of specific features but instead captures more fundamental traffic evolution patterns through the shared spatial-temporal feature extraction and graph diffusion mechanisms. These results suggest that the proposed framework can tolerate inaccuracies in population data estimation in real satellite networks, which supports its engineering practicality.
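The perturbation used in this robustness test can be sketched as follows. How the noise level maps to a noise magnitude is not specified in the text, so this sketch assumes relative noise: each feature value receives zero-mean Gaussian noise whose standard deviation equals the noise level times the magnitude of that value. The function name and this scaling rule are assumptions.

```python
import random

def add_gaussian_noise(features, level, rng):
    """Perturb each value with zero-mean Gaussian noise.

    Assumed scaling: std = level * |value|, so level=0.15 means 15% relative noise.
    """
    return [v + rng.gauss(0.0, level * abs(v)) for v in features]

# Sweep the noise levels reported in the experiment with a fixed seed.
population = [120.0, 45.0, 300.0]
for level in (0.0, 0.15, 0.30):
    noisy = add_gaussian_noise(population, level, random.Random(42))
    print(level, noisy)
```

Passing an explicit `random.Random` instance keeps each run reproducible, so that performance differences across noise levels are not confounded by different random draws.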

Figure 6: Performance under different noise levels.
4.4.5 Sensitivity Analysis of Hyperparameters
To quantify the role of the consistency loss in balancing flow conservation and prediction accuracy, we fix all other hyperparameters and evaluate

Figure 7: The effect of consistency-loss weight
In addition to the analysis of the consistency loss weight

Figure 8: Comparison of model performance with different depths and heads. (a) Performance with different network depths; (b) Performance with different number of heads.
4.4.6 Evaluation of Multi-Service Traffic Composition Prediction
To evaluate the model’s ability to accurately estimate the distribution of different service types on each communication link, the predicted and ground-truth proportions for three services are visualized and compared. Ground Truth denotes the actual traffic distribution of Service 1, Service 2, and Service 3 on each link. Multi-Granularity denotes the multi-granularity model proposed in this paper. It employs a macro branch to predict total traffic and three micro branches to predict each service independently. The model is trained using a consistency loss. Single-Granularity denotes the model that uses the same spatio-temporal feature extractor but employs a single prediction branch to estimate total traffic or service-level traffic. The outputs are concatenated without any joint cross-granularity optimization. Fig. 9 shows the distribution of three service types on two communication links, where the y-axis represents each service’s proportion as the height of stacked bars. On both links, the Multi-Granularity bars virtually overlap the Ground Truth. Both the total bar heights and the individual service segment proportions align precisely with the actual distributions. This shows that the model captures global trends and service-level details simultaneously. In contrast, the Single-Granularity bars deviate markedly. In Fig. 9a, the model predicts only Service 2, with Services 1 and 3 almost entirely omitted. In Fig. 9b, Service 1 is roughly estimated but another service’s share is greatly overestimated. These results confirm that simply concatenating separate service predictors fails to integrate macro-level and micro-level information and leads to the omission of low-volume services. The multi-granularity model integrates macro and micro branches under consistency constraints to accurately reproduce each service’s true proportions across diverse scenarios.

Figure 9: Prediction of service composition over network links.
In this paper, a multi-granularity traffic prediction framework for satellite networks based on shared feature extraction is proposed to benefit from common representations and reduce redundant model parameters. Population coverage, inter-satellite distance, and historical traffic data are integrated to capture spatial-temporal dependencies. A multi-task loss function is proposed to jointly optimize predictions of total network traffic and service-level traffic in satellite networks. Experimental results demonstrate that integrating population and distance features improves prediction accuracy. The proposed method outperforms baseline methods and demonstrates its effectiveness in capturing the characteristics of satellite network traffic.
Acknowledgement: The authors would like to thank Nanjing University of Science and Technology.
Funding Statement: This paper was supported in part by National Natural Science Foundation of China under grants U21B2003 and 62103191, and in part by the Fundamental Research Funds for the Central Universities under grant 30924010928.
Author Contributions: Xu Chen and Guohao Qiu: Investigation, Data Curation, Writing—Original Draft, Review and Editing, Visualization. Li Yang: Writing—Review and Editing, Supervision, Funding Acquisition. All authors reviewed and approved the final version of the manuscript.
Availability of Data and Materials: The source code for the proposed method is publicly available on GitHub: https://github.com/cx5055/mg-trafficprediction.
Ethics Approval: Not applicable.
Conflicts of Interest: The authors declare no conflicts of interest.
References
1. Kato N, Fadlullah ZM, Tang F, Mao B, Tani S, Okamura A, et al. Optimizing space-air-ground integrated networks by artificial intelligence. IEEE Wirel Commun. 2019;26(4):140–7. doi:10.1109/mwc.2018.1800365.
2. Liu Z, Gui Y, Wang L, Jiang Y. Offload strategy for edge computing in satellite networks based on software defined network. Comput Mater Contin. 2025;82(1):863–79. doi:10.32604/cmc.2024.057353.
3. Zhou D, He Y, Sheng M, Fu S, Li J, Han Z. Dual-scale traffic management for differentiated services in satellite mega constellations. IEEE Internet Things J. 2026;13(1):228–42. doi:10.1109/jiot.2025.3601348.
4. Lyu Y, Hu H, Fan R, Liu Z, An J, Mao S. Dynamic routing for integrated satellite-terrestrial networks: a constrained multi-agent reinforcement learning approach. IEEE J Sel Area Commun. 2024;42(5):1204–18. doi:10.1109/jsac.2024.3365869.
5. SpaceX. Starlink. 2025 [cited 2026 Feb 15]. Available from: https://www.starlink.com/.
6. Eutelsat. Oneweb. 2025 [cited 2026 Feb 15]. Available from: https://oneweb.net/.
7. Telesat. Telesat. 2025 [cited 2026 Feb 15]. Available from: https://www.telesat.com/.
8. Hu M, Xiao M, Xu W, Deng T, Dong Y, Peng K. Traffic engineering for software-defined LEO constellations. IEEE Trans Netw Serv Manage. 2022;19(4):5090–103. doi:10.1109/tnsm.2022.3186716.
9. Du J, Jiang C, Qian Y, Han Z, Ren Y. Resource allocation with video traffic prediction in cloud-based space systems. IEEE Trans Multimed. 2016;18(5):820–30. doi:10.1109/tmm.2016.2537781.
10. Nie L, Ning Z, Obaidat MS, Sadoun B, Wang H, Li S, et al. A reinforcement learning-based network traffic prediction mechanism in intelligent Internet of things. IEEE Trans Ind Inform. 2021;17(3):2169–80. doi:10.1109/tii.2020.3004232.
11. Peng L, Yan J, Wei P, Wang X. Spatio-temporal correlation-based incomplete time-series traffic prediction for LEO satellite networks. Front Inf Technol Electron Eng. 2025;26(5):788–804. doi:10.1631/fitee.2300873.
12. Han Y, Li D, Guo Q, Wang Z, Kong D. Self-similar traffic prediction scheme based on wavelet transform for satellite internet services. In: 2017 International Conference on Machine Learning and Intelligent Communications. Cham, Switzerland: Springer; 2017. p. 189–97.
13. Katris C, Daskalaki S. Dynamic bandwidth allocation for video traffic using FARIMA-based forecasting models. J Netw Syst Manag. 2019;27(1):39–65. doi:10.1007/s10922-018-9456-1.
14. Nie L, Wang X, Wang S, Ning Z, Obaidat MS, Sadoun B, et al. Network traffic prediction in industrial Internet of things backbone networks: a multitask learning mechanism. IEEE Trans Ind Inform. 2021;17(10):7123–32. doi:10.1109/tii.2021.3050041.
15. Mohamed SAA, Kurnaz S. Classified VPN network traffic flow using time related to artificial neural network. Comput Mater Contin. 2024;80(1):819–41. doi:10.32604/cmc.2024.050474.
16. Tamada K, Kawamoto Y, Kato N. Bandwidth usage reduction by traffic prediction using transfer learning in satellite communication systems. IEEE Trans Veh Technol. 2024;73(5):7459–63. doi:10.1109/tvt.2023.3341442.
17. Ju Y, Song J, Li W, Zhang Y, He C, Dong F, et al. Dynamic load-balancing routing strategy for LEO satellite networks based on spatio-temporal traffic prediction. IEEE Trans Aerospace Electron Syst. 2025;61(5):11954–70. doi:10.1109/iceiec.2015.7284498.
18. Mokhtar H, Di X, Jiang Z, Chen J, Hassan A. Efficient spatiotemporal prediction transformer for cooperative satellite remote sensing. IEEE Trans Netw Service Manage. 2025;22(5):4732–46. doi:10.1109/tnsm.2025.3580444.
19. Zhou W, Qian Y, Zhao K, Li W, Chen F. Satellite traffic forecast based on multi-dimensional periodic features. In: Wireless and Satellite systems. Vol. 410. Cham, Switzerland: Springer; 2022. p. 267–77. doi:10.1007/978-3-030-93398-2_27.
20. Chen C, Sun C, Li H, Jin F, Pei Q, Wan S. ST-GAGCN-LEO: a spatiotemporal graph attention and gated convolutional network for LEO satellite traffic prediction. IEEE Trans Aerospace Electron Syst. 2025;61(4):9669–85.
21. Wang S, Nie L, Li G, Wu Y, Ning Z. A multitask learning-based network traffic prediction approach for SDN-enabled industrial Internet of things. IEEE Trans Ind Inform. 2022;18(11):7475–83. doi:10.1109/tii.2022.3141743.
22. Zhang C, Fiore M, Patras P. Multi-service mobile traffic forecasting via convolutional long short-term memories. In: 2019 IEEE International Symposium on Measurements & Networking (M&N). Piscataway, NJ, USA: IEEE; 2019. p. 1–6.
23. Xu L, Liu H, Song J, Li R, Hu Y, Zhou X, et al. TransMUSE: transferable traffic prediction in multi-service edge networks. Comput Netw. 2023;221:109518.
24. Zhang Y, Han N, Zhu T, Zhang J, Ye M, Dou S, et al. Prophet: traffic engineering-centric traffic matrix prediction. IEEE/ACM Trans Netw. 2024;32(1):822–32.
25. Ran Y, Ding Y, Chen S, Lei J, Luo J. Fully-distributed dynamic packet routing for LEO satellite networks: a GNN-enhanced multi-agent reinforcement learning approach. IEEE Trans Veh Technol. 2025;74(3):5229–34. doi:10.1109/tvt.2024.3499933.
26. Gong L, Chen Q, Yang L, Yin Z, Wang Y. Autonomous traffic prediction for LEO satellite-based IoT based on satellite spatiotemporal features mapping. IEEE Internet Things J. 2025;12(14):27021–32. doi:10.1109/jiot.2025.3562631.
27. Zhao Y, Wang N, Chen Q, Yu S, Chen X. Satellite coverage traffic volume prediction using a new surrogate model. Acta Astronaut. 2022;193:357–69. doi:10.1016/j.actaastro.2022.01.026.
28. Fan J, Weng W, Chen Q, Wu H, Wu J. PDG2Seq: periodic dynamic graph to sequence model for traffic flow prediction. Neural Netw. 2025;183:106941. doi:10.1016/j.neunet.2024.106941.
29. Gasteiger J, Weißenberger S, Günnemann S. Diffusion improves graph learning. In: Advances in neural information processing systems. Red Hook, NY, USA: Curran Associates, Inc.; 2019.
30. State Council of the People’s Republic of China. China launches three-body computing constellation for AI in space. 2025 [cited 2026 Jan 26]. Available from: http://english.www.gov.cn/news/202505/15/content_WS6825452ec6d0868f4e8f28e6.html.
31. Starcloud, Inc. Starcloud-1 satellite. 2025 [cited 2026 Jan 26]. Available from: https://www.starcloud.com/starcloud-1.
32. Tan H, Zhu L. A novel routing algorithm based on virtual topology snapshot in LEO satellite networks. In: 2014 IEEE 17th International Conference on Computational Science and Engineering. Piscataway, NJ, USA: IEEE; 2014. p. 357–61.
33. Werner M. A dynamic routing concept for ATM-based satellite personal communication networks. IEEE J Sel Areas Commun. 1997;15(8):1636–48.
34. Yang Y, Xu M, Wang D, Wang Y. Towards energy-efficient routing in satellite networks. IEEE J Sel Areas Commun. 2016;34(12):3869–86. doi:10.1109/jsac.2016.2611860.
35. Wang W, Wang C, Wang H, Xu P. Dynamic cache allocation routing strategy of internet of things satellite node based on traffic prediction. J Commun. 2020;41(2):25–35. doi:10.1109/iceib53692.2021.9686397.
36. Cong L, Shi B, Di X, Ding H, Chen C. Research on satellite network traffic prediction algorithm based on gray wolf algorithm optimizing GRU and spatiotemporal analysis. In: 2023 15th International Conference on Communication Software and Networks. Piscataway, NJ, USA: IEEE; 2023. p. 123–31.
37. Cao M, Liu J, Zhi J, Gong P, Wang J, Wu Z. TLS-net: a hybrid time series prediction model combining TCN and LSTM for ship-satellite network traffic. In: 2023 7th International Conference on Transportation Information and Safety. Piscataway, NJ, USA: IEEE; 2023. p. 1168–73.
38. Zhao L, Song Y, Zhang C, Liu Y, Wang P, Lin T, et al. T-GCN: a temporal graph convolutional network for traffic prediction. IEEE Trans Intell Transp Syst. 2020;21(9):3848–58.
Copyright © 2026 The Author(s). Published by Tech Science Press. This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

