Multi-Source Traffic Information Completion and Perception Method via Graph Convolutional Neural Networks in Intelligent Connected Transportation System

Pangwei Wang; Jie Wang; Zipeng Wang; Hangrui Dong; Li Wang

doi:10.32604/cmc.2026.080815

Open Access

ARTICLE

Pangwei Wang¹,*, Jie Wang¹, Zipeng Wang¹, Hangrui Dong², Li Wang¹

1 Beijing Key Lab of Urban Intelligent Traffic Control Technology, North China University of Technology, Beijing, China
2 Beijing Connected and Autonomous Vehicles Technology Co., Ltd., Beijing, China

* Corresponding Author: Pangwei Wang. Email: email

(This article belongs to the Special Issue: Intelligent Transportation System (ITS) Safety and Security)

Computers, Materials & Continua 2026, 88(2), 58 https://doi.org/10.32604/cmc.2026.080815

Received 15 February 2026; Accepted 27 April 2026; Issue published 15 June 2026

Abstract

Traffic holographic perception refers to the real-time, high-fidelity, and multi-dimensional sensing of traffic states through the fusion of heterogeneous sensors, including cameras, radars, and connected vehicle data. The multi-source perception data obtained thereby can provide a complete digital representation of the road network for the Intelligent Transportation System (ITS). However, sensors are vulnerable to environmental interference, which can result in data loss at specific points or along arterial highways for certain periods, potentially undermining system safety and decision-making reliability. To address these challenges, a deep learning method based on Graph Convolutional Networks (GCN) and Gated Recurrent Units (GRU) is proposed, leveraging Artificial Intelligence (AI) and intelligent connected technologies for real-time acquisition of multi-sensor perception data. A feature-level fusion integrates multi-source perception data. GCN captures spatial dependencies from the road network topology, while GRU extracts temporal features from time series, enabling accurate imputation of missing traffic data. The method is evaluated at intelligent connected intersections in the Beijing High-level Autonomous Driving Demonstration Area. Results show that the accuracy of long-term traffic state completion reaches 89.36%, and the Root Mean Square Error (RMSE) is reduced by 17.2% compared to the Long Short-Term Memory (LSTM) baseline. This framework provides a practical solution for deploying traffic holographic perception technology in secure and trustworthy ITS.

Keywords

Intelligent transportation; information security; traffic information completion; traffic holographic perception; AI-driven; edge computing; graph convolutional neural network

1 Introduction

Offloading traffic status recognition tasks to edge devices markedly enhances the efficiency of real-time data processing for multi-source traffic perception under challenges such as data loss and day-night domain shifts. Data-driven approaches and intelligent algorithms are designed to support multi-granularity and large-scale identification and evaluation of traffic conditions. Recent studies have also highlighted the rapid advancement of holographic sensing technologies for real-time traffic state monitoring, further reinforcing the momentum in this field. Reliable traffic data are essential for optimizing control strategies, alleviating congestion, and ensuring the safety of ITS. In modern ITS, multi-sensor data from cameras, radar, and connected vehicles enable comprehensive traffic perception [1]. However, adverse weather, accidents, or equipment failures often cause sensor instability and data loss, which threaten system reliability and decision-making. Therefore, robust real-time data completion methods at urban intersections are urgently needed to enhance data integrity and support trustworthy AI-driven traffic perception.

Mobile Edge Computing (MEC) brings computing resources closer to network edges, enabling low-latency processing of multi-sensor data [2]. Yet heterogeneous data from cameras, radars, and connected vehicles vary in format and frequency [3]. MEC can fuse such data while leveraging road network topology, providing a foundation for distributed data completion.

For short-term data completion, time series prediction models are often used to estimate missing values [4,5]. For long-term completion, deep learning models that capture spatiotemporal traffic patterns are preferred for their accuracy and stability [6]. Combining both strategies can address data loss across different time scales more effectively.

Three primary considerations are involved in addressing the issue of missing traffic data [7,8]: (1) Spatial correlation: traffic states are influenced by adjacent road segments due to network topology; (2) Temporal correlation: traffic patterns exhibit daily periodicity and short-term continuity; (3) Spatiotemporal correlation: spatial dependencies evolve over time, and temporal patterns vary across different locations, so their interaction is key to accurate data completion.

Early efforts to exploit spatial correlations in traffic data relied on tensor decomposition methods, which leverage the low-rank structure of network data to impute missing values [9,10]. However, these approaches struggle to capture the nonlinear and graph-structured dependencies inherent in road networks. To address this, recent studies have introduced graph-based models. For instance, graph proximity learning [11] and attention-enhanced stacked graphs [12] have been proposed to better model spatial relationships. Nevertheless, these methods either assume static graph structures or require complete training data, limiting their effectiveness in dynamic, real-world scenarios with prolonged data loss.

Traffic state variations typically exhibit temporal periodicity, and their dynamic flow characteristics differ under various temporal conditions. To further capture complex temporal patterns under varying data distributions, researchers have enhanced Recurrent Neural Networks (RNNs) with techniques such as Dynamic Time Warping (DTW), attention mechanisms, and transfer learning [13,14]. For instance, DTW has been integrated into LSTM to fine-tune temporal features under different distributions [15], and self-attention mechanisms combined with transfer learning have been used to correct interpolated values [13]. While these methods improve temporal modeling, they primarily focus on the time dimension alone, often overlooking the rich spatial structure of the road network.

Other deep learning approaches have enabled neural networks to capture the correlated characteristics of geographical regions and temporal patterns [16]. In addition, methods such as autoencoders and causal convolution with attention [17] have been explored for traffic data imputation under various conditions.

In summary, probabilistic inference and tensor-based statistical interpolation methods cannot fully capture the complex correlations between road topology and traffic network time series, primarily because traffic flow data are nonlinear. In large-scale road networks with missing data, issues such as vanishing gradients may significantly affect completion accuracy [18,19]. The integration of intelligent network technologies in transportation has introduced novel technical approaches for information interaction between devices, holographic sensing, and data completion. In addition, multi-modal traffic data offer a robust computational basis for effective data completion.

Therefore, to address the above challenges and to meet the pressing demands for safe and trustworthy intelligent transportation, an AI-driven traffic information completion and perception method based on deep neural networks is proposed. More specifically, a GCN is employed to capture the spatial features of the road network, while a GRU is used to extract long-term cyclical patterns in the temporal dimension. The proposed method, Hierarchical Dynamic Adaptive Graph Convolutional Neural Network (HD-AGCN), is further enhanced by incorporating the Window Adaptive Self-Attention Imputation (WA-SAI) algorithm for historical data correction. To mitigate the influence of low-confidence data on model performance, the HD-AGCN model is designed for robust long-term sequence completion to ensure data integrity and support reliable traffic holographic perception under various adverse conditions.

2 Problem Formulation

The urban road network traffic perception system is realized through intelligent networked equipment. To ensure reliable perception, an effective data completion method is essential to mitigate data losses caused by network interference or equipment failures. In this work, we propose a method that completes missing data for one or more intersections using surrounding intersection data and road network topology. A traffic subarea is defined as an urban region with similar macro traffic flow characteristics, where each intersection is equipped with an MEC node. MEC collects and fuses feature-level multi-source perception data, such as average speed, queue length, and travel time. Urban intersections are characterized as adjacent or non-adjacent nodes that share similar traffic flow patterns. While graph modeling and completion model training use data from all intersections, completion for a target intersection relies solely on its neighboring intersections. The real-time holographic traffic sensing system is illustrated in Fig. 1.

Figure 1: The diagram of the traffic perception system in an intelligent connected traffic scenario.

To model the urban regional road network, a weighted undirected graph is constructed in which each node represents an intersection. Through Vehicle-to-Everything (V2X) communications, the nodes collect a set of traffic features, such as vehicle heading angle and three-dimensional coordinates, derived from both vehicle-side and roadside intelligent traffic equipment. This graph-based representation captures the spatial topology and dynamic traffic characteristics essential for subsequent data completion tasks. The mathematical modeling process is described below:

At time t, the weighted undirected graph G represents the traffic state of the regional network:

$$\begin{cases} G=(V,E,X) \\ V=\{v_1,v_2,\ldots,v_n\} \\ E=\{e_{ij}\} \end{cases} \tag{1}$$

where $V$ is the set of urban regional road network intersections, and $E$ represents the set of edges $e_{ij}$ connecting the vertices $v_i$ and $v_j$. $X$ represents the feature matrix, which is defined as the set of different traffic states perceived by the roadside of the intersection.

When performing data completion, the observed value X of p historical time steps and the complete value Y of q time steps are defined as follows:

$$X=(x_{t_1},x_{t_2},\ldots,x_{t_p})\in\mathbb{R}^{p\times n\times c} \tag{2}$$

$$Y=(y_{t_{p+1}},y_{t_{p+2}},\ldots,y_{t_{p+q}})\in\mathbb{R}^{q\times n\times c} \tag{3}$$

where $x$ and $y$ represent the observed value in the perception dataset and the complete value in the complete dataset, respectively. $n$ is the number of nodes, and $c$ is the number of traffic characteristics, such as traffic flow, traffic speed, and queue length.
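The shapes in Eqs. (2) and (3) can be illustrated with a minimal sketch. It assumes the fused data are stored as an array of shape (T, n, c) and that training pairs are cut with a sliding window. The function name `make_windows` and the sizes used are hypothetical and are not taken from the paper.

```python
import numpy as np

def make_windows(series, p, q):
    """Cut a (T, n, c) series into pairs X: (p, n, c) and Y: (q, n, c).

    X holds p historical steps (Eq. 2); Y holds the following q steps (Eq. 3).
    """
    total_steps = series.shape[0]
    xs, ys = [], []
    for start in range(total_steps - p - q + 1):
        xs.append(series[start:start + p])
        ys.append(series[start + p:start + p + q])
    return np.stack(xs), np.stack(ys)

# Hypothetical sizes: 20 time steps, n = 10 intersections, c = 2 features.
series = np.arange(20 * 10 * 2, dtype=float).reshape(20, 10, 2)
X, Y = make_windows(series, p=6, q=3)
print(X.shape, Y.shape)
```

Each pair keeps the node and feature axes intact, so the same adjacency structure applies to every window.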

3 Methodology

In this section, a short-term data correction method and a network-level long-term missing data completion method are proposed to construct a holographic traffic state perception dataset. The technical architecture of the data completion method is illustrated in Fig. 2.

Figure 2: The structure of the data completion method.

This structure presents a comprehensive methodology for constructing a real-time holographic traffic state perception dataset through multi-source sensing and AI-driven information completion techniques. Initially, individual vehicle data are collected through roadside sensors and transmitted to the MEC platform, which processes and analyzes the traffic flow status. Then, feature-level fusion of the sensing data is carried out to ensure timely, accurate, and complete traffic detection at urban intersections. To address short-term data gaps caused by network fluctuation or sensor disturbance, the WA-SAI algorithm is used for data correction. For long-term missing data, the GCN-based HD-AGCN model is used to complete traffic state data within long time series graphs. This integrated approach addresses both long-term and short-term deficiencies, constructing a real-time holographic traffic state perception dataset. The modeling process of this method is presented in Fig. 3.

Figure 3: The modeling process of the data completion method.

3.1 Multi-Source Information Correction Method Based on WA-SAI

The segmented quartile method is used to identify erroneous or missing short-period time series data caused by network data loss. The WA-SAI model is then used for continuity interpolation.

(1) Processing data outliers based on the segmented quartile method

The segmented quartile method arranges the traffic state data in ascending order and divides them into four equal parts. The interquartile range $Q_{IQR}$ is expressed as:

$$Q_{IQR}=Q_3-Q_1 \tag{4}$$

where the lower quartile $Q_1$, the median $Q_2$, and the upper quartile $Q_3$ represent the breakpoint values, corresponding to the value $x$ returned by the inverse cumulative distribution function at $y=0.25$, $y=0.5$, and $y=0.75$, respectively, after linear regression of the dataset. The criteria for determining outliers are based on the quartiles and $Q_{IQR}$: in this paper, values below $Q_1-0.5Q_{IQR}$ or above $Q_3+0.5Q_{IQR}$ are identified as outliers, since traffic flow data are highly dynamic and real-time in nature. Outliers may be caused by instantaneous sensor interference, communication delays, or similar factors. Adopting a threshold stricter than the conventional $1.5Q_{IQR}$ helps filter out such noise early, enabling more sensitive detection of these transient, minor anomalies. This prevents them from affecting subsequent data fusion and completion, thereby improving the robustness of the processing.

Taking the average speed of connected vehicles (hereinafter referred to as speed) obtained by V2X as an example, the dynamic identification process of abnormal speed data is described as follows:

All historical data are divided into N intervals according to time series, and the length of each interval is determined as Eq. (5):

$$k=\frac{P}{Q_{max}-Q_{min}}\quad(P\in[p_1,p_2]) \tag{5}$$

where $k$ represents the time step of the interval, $Q_{max}$ and $Q_{min}$ represent the maximum and minimum values of the data in the interval, and $p_1$ and $p_2$ represent the minimum and maximum values of the area $P$ of the interval, which are determined manually according to the field situation.

The two-dimensional data matrix $U_l=[v_{l,k},t_{l,k}]^T$ $(l=1,2,\ldots,n;\ k=1,2,\ldots,q)$ is composed of the vehicle speeds $v$ and their corresponding sampling times $t$ within the $l$-th time window, where $k$ is the index of the data pairs in the time series. The vector $v_l=[v_{l,1},v_{l,2},\ldots,v_{l,q}]$ is derived by sorting the vehicle speed data from $U_l$ in ascending order, so that $v_{l,1}\le v_{l,2}\le\ldots\le v_{l,q}$, where $q$ is the largest index after sorting.

The values $Q_{l,1}$ and $Q_{l,3}$ at the 25th and 75th percentiles are taken as the quartiles. If $q$ is odd, each quartile is calculated as the average of the data points on either side of the corresponding position.

For the $l$-th time window, data values below $V_{l,L}$ or above $V_{l,U}$ are considered outliers and removed. These outliers are denoted as $v_B$, with $B$ representing the corresponding timestamps of the removed values. The lower limit $V_{l,L}$ and upper limit $V_{l,U}$ of outlier identification are expressed as:

$$\begin{cases} V_{l,L}=Q_{l,1}-0.5Q_{l,IQR} \\ V_{l,U}=Q_{l,3}+0.5Q_{l,IQR} \end{cases} \tag{6}$$
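The outlier rule of Eq. (6) can be sketched as follows. This is an illustrative version under stated assumptions: windows have a fixed length rather than the adaptive length of Eq. (5), and quartiles come from NumPy's default linear-interpolation percentile rather than the paper's exact quartile rule. The names `quartile_fences` and `flag_outliers` and the sample speeds are hypothetical.

```python
import numpy as np

def quartile_fences(values, factor=0.5):
    """Return (V_L, V_U) = (Q1 - factor*IQR, Q3 + factor*IQR) for one window."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return q1 - factor * iqr, q3 + factor * iqr

def flag_outliers(speeds, window, factor=0.5):
    """Flag speeds outside the per-window fences; True marks an outlier."""
    speeds = np.asarray(speeds, dtype=float)
    flags = np.zeros(speeds.shape, dtype=bool)
    for start in range(0, len(speeds), window):
        segment = speeds[start:start + window]
        lower, upper = quartile_fences(segment, factor)
        flags[start:start + window] = (segment < lower) | (segment > upper)
    return flags

# Two hypothetical windows of eight speed samples, each with one gross error.
speeds = [29, 30, 32, 34, 36, 38, 40, 90,
          10, 12, 14, 16, 18, 20, 21, 1]
outliers = flag_outliers(speeds, window=8)
print(np.flatnonzero(outliers))
```

Because the fences are computed per window, the same absolute speed can be normal in one window and an outlier in another, which is the point of segmenting before applying the quartile test.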

(2) Short-term interpolation correction based on WA-SAI

If the identified abnormal data is directly eliminated, the continuity of the dataset in the time series will be destroyed. Therefore, a self-attention imputation for time series with a data correction mechanism is proposed to correct the identified abnormal data in the collection process, as shown in Fig. 4.

Figure 4: Model structure of window adaptive self-attention imputation (WA-SAI).

For the divided V2X multi-source data, the input data $X_t$ and its missing mask $M_t$ are concatenated and fed into the WA-SAI model, where relative window-normalized positional encoding and diagonally-masked self-attention are employed to capture local intra-window dependencies, while cross-window attention aggregates global inter-window patterns. The interpolation unit of WA-SAI can thereby achieve dynamic correction of the traffic data for each window, producing a continuous and consistent time series. The missing mask is given by Eq. (7):

$$M_t^d=\begin{cases} 1 & \text{if } X_t^d \text{ is observed} \\ 0 & \text{if } X_t^d \text{ is missing} \end{cases} \tag{7}$$

where $X_t^d$ denotes the $d$-th element of $X_t$, and $M_t^d$ is the missing mask of $X_t^d$.

For the $l$-th time window, the absolute positional encoding is replaced with relative window-normalized positional encoding, which allows the model to retain consistent temporal perception across time windows of different lengths $k$. The relative position $pos_{rel}$ is defined as:

$$pos_{norm}=\frac{pos}{k}\in[0,1] \tag{8}$$

$$\alpha=\mathrm{sigmoid}\left[\frac{\mathrm{Var}(X_l)}{\mathrm{Var}_{max}}\right]\in[0,1] \tag{9}$$

$$pos_{rel}=\alpha\cdot pos_{norm}+(1-\alpha)\cdot pos \tag{10}$$

where $pos$ denotes the time-step position within the time window, $pos_{norm}$ is the normalized position, and $k$ is the length of the current window. $\alpha$ measures the variability of the data within the time window, $\mathrm{Var}(X_l)$ is the variance of the data in the time window, and $\mathrm{Var}_{max}$ is the maximum variance across all time windows.

The sine and cosine functions of the relative positional encoding are formulated as:

$$PE(pos_{rel},2i)=\sin\left(\frac{pos_{rel}}{10000^{2i/d_{model}}}\right),\quad PE(pos_{rel},2i+1)=\cos\left(\frac{pos_{rel}}{10000^{2i/d_{model}}}\right) \tag{11}$$

where $d_{model}$ denotes the embedding dimension, representing the length of the relative positional encoding vector, i.e., the dimensionality of the vector space into which the positional information of each time step is mapped. This dimension determines the frequency division scale of the sine/cosine functions, enabling different dimensions to encode positional patterns ranging from short-period to long-period cycles.
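Eqs. (8)-(11) can be traced step by step in a short sketch. It assumes positions are counted from 0 inside each window and that $d_{model}$ is even. The function name `relative_positional_encoding` and the sample values are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relative_positional_encoding(window, var_max, d_model):
    """Build the (k, d_model) encoding of Eqs. (8)-(11) for one window."""
    k = len(window)
    pos = np.arange(k, dtype=float)                  # assumed to start at 0
    pos_norm = pos / k                               # Eq. (8)
    alpha = sigmoid(np.var(window) / var_max)        # Eq. (9)
    pos_rel = alpha * pos_norm + (1 - alpha) * pos   # Eq. (10)
    i = np.arange(d_model // 2)
    angle = pos_rel[:, None] / (10000.0 ** (2 * i / d_model))[None, :]
    pe = np.zeros((k, d_model))
    pe[:, 0::2] = np.sin(angle)                      # even dimensions, Eq. (11)
    pe[:, 1::2] = np.cos(angle)                      # odd dimensions, Eq. (11)
    return pe

# Hypothetical window of 12 speed samples; var_max is an assumed value.
window = np.linspace(20.0, 40.0, 12)
pe = relative_positional_encoding(window, var_max=50.0, d_model=8)
print(pe.shape)
```

A larger window variance raises $\alpha$ and shifts the encoding toward the normalized position, so windows of different lengths map onto a comparable positional scale.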

The input data $X_l$ of the current time window and its missing mask $M_l$ are concatenated to form the model input. This concatenated representation is then summed with the relative positional encoding $p_{rel}$, yielding the embedded representation $E_l$. Subsequently, the diagonally masked self-attention mechanism is used to derive the local feature sequence representation $F_l^{local}$ within the window. This process can be formulated as:

$$E_l=[\mathrm{Concat}(X_l,M_l)W_e+b_e]+p_{rel} \tag{12}$$

$$[\mathrm{DiagMask}(x)](i,j)=\begin{cases} -\infty & i=j \\ x(i,j) & i\neq j \end{cases} \tag{13}$$

$$\mathrm{DiagMaskedSelfAttention}(x)=\mathrm{softmax}\left[\mathrm{DiagMask}\left(\frac{(xW_Q)(xW_K)^T}{\sqrt{d_k}}\right)\right](xW_V) \tag{14}$$

$$F_l^{local}=\mathrm{LayerNorm}[E_l+\mathrm{DiagMaskedSelfAttention}(E_l)] \tag{15}$$

where $W_e$ is the weight matrix, $b_e$ is the bias vector, and $\mathrm{DiagMask}(x)$ is the diagonal mask that sets the diagonal entries of the attention map to negative infinity, ensuring that the estimation at each time step does not depend on itself. $W_Q$, $W_K$, and $W_V$ are three independent learnable parameter matrices, and $d_k$ is the dimension of the Query and Key vectors in the attention mechanism.
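The effect of the diagonal mask in Eqs. (13) and (14) is easiest to see numerically. The sketch below is a single-head NumPy version with randomly drawn weights; the shapes and names are assumptions for illustration.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - np.max(z, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def diag_masked_self_attention(x, w_q, w_k, w_v):
    """Single-head attention in which a time step cannot attend to itself."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d_k = q.shape[-1]
    scores = (q @ k.T) / np.sqrt(d_k)
    np.fill_diagonal(scores, -np.inf)      # DiagMask, Eq. (13)
    weights = softmax(scores, axis=-1)     # each row sums to 1 over other steps
    return weights @ v, weights            # Eq. (14)

rng = np.random.default_rng(0)
steps, d_model, d_k = 5, 4, 3              # hypothetical sizes
x = rng.normal(size=(steps, d_model))
w_q = rng.normal(size=(d_model, d_k))
w_k = rng.normal(size=(d_model, d_k))
w_v = rng.normal(size=(d_model, d_k))
out, weights = diag_masked_self_attention(x, w_q, w_k, w_v)
print(np.round(np.diag(weights), 3))
```

Since the diagonal weights are zero, the value estimated for a time step is built only from the other steps in the window, which is what allows a missing entry to be reconstructed from its context.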

To aggregate features across different time windows, the local features are pooled, resulting in the window-level representation $h_l=\mathrm{Pool}(F_l^{local})$. This representation is used to calculate the global context vector $c_l$ by attending to a set of historical window representations $H_{hist}=[h_{l-1},h_{l-2},\ldots]$. Then $c_l$ is broadcast to align its dimension with the local feature sequence representation $F_l^{local}$ and combined with $F_l^{local}$ through a residual connection and layer normalization, generating the global feature sequence representation $F_l^{global}$. It can be expressed as follows:

$$c_l=\mathrm{Attention}(h_l,H_{hist})=\mathrm{softmax}\left(\frac{(h_lW_q)(H_{hist}W_k)^T}{\sqrt{d}}\right)(H_{hist}W_v) \tag{16}$$

$$F_l^{global}=\mathrm{LayerNorm}[F_l^{local}+\mathrm{Broadcast}(c_l)] \tag{17}$$

where $W_q$, $W_k$, and $W_v$ are also three independent learnable parameter matrices, and $d$ is the dimension of the Query and Key vectors in the attention mechanism.
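A compact sketch of Eqs. (16) and (17) follows. Several details are assumptions, since the text does not fix them: mean pooling stands in for Pool, layer normalization is applied without learnable scale and shift, and $W_v$ maps back to $d_{model}$ so that the broadcast sum is well defined. All names and sizes are hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - np.max(z, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    # Normalization over the feature axis, without learnable parameters.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def cross_window_fusion(f_local, h_hist, w_q, w_k, w_v):
    """Attend from the current window to historical windows, then fuse."""
    h_l = f_local.mean(axis=0)                   # Pool (mean pooling assumed)
    q = h_l @ w_q                                # (d,)
    k = h_hist @ w_k                             # (m, d)
    v = h_hist @ w_v                             # (m, d_model)
    weights = softmax(k @ q / np.sqrt(q.shape[-1]))
    c_l = weights @ v                            # Eq. (16)
    f_global = layer_norm(f_local + c_l[None, :])  # Eq. (17), broadcast over steps
    return f_global, weights

rng = np.random.default_rng(0)
d_model, d, steps, m = 8, 4, 6, 3                # hypothetical sizes
f_local = rng.normal(size=(steps, d_model))
h_hist = rng.normal(size=(m, d_model))
w_q = rng.normal(size=(d_model, d))
w_k = rng.normal(size=(d_model, d))
w_v = rng.normal(size=(d_model, d_model))
f_global, weights = cross_window_fusion(f_local, h_hist, w_q, w_k, w_v)
print(f_global.shape, np.round(weights, 3))
```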

Finally, the global feature sequence representation $F_l^{global}$ is passed through a feedforward neural network (FFN) layer, and the output feature $F_{out}$ is derived via the same residual connection and normalization process. $F_{out}$ is then linearly transformed and mapped back to the original feature space to obtain the corrected window data $Y_l$:

$$F_{out}=\mathrm{LayerNorm}[F_l^{global}+\mathrm{FFN}(F_l^{global})] \tag{18}$$

$$\tilde{X}_l=F_{out}W_o+b_o \tag{19}$$

$$Y_l=M_l\odot X_l+(1-M_l)\odot\tilde{X}_l \tag{20}$$

where $\tilde{X}_l$ is the completed data, $W_o$ is the weight matrix, and $b_o$ is the bias vector.
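The merge step of Eq. (20) keeps every observed value and takes the model output only where the mask is zero. A minimal sketch with hypothetical numbers, in which missing entries are stored as NaN and `x_tilde` stands in for the model output:

```python
import numpy as np

# Observed window with two missing samples (stored as NaN).
x = np.array([31.0, np.nan, 29.5, np.nan])
m = (~np.isnan(x)).astype(float)             # missing mask, Eq. (7)
x_tilde = np.array([30.8, 30.2, 29.9, 29.1])  # stand-in for the model output

# NaN placeholders are zeroed first, because NaN * 0 is still NaN.
y = m * np.nan_to_num(x) + (1.0 - m) * x_tilde   # Eq. (20)
print(y)
```

Observed entries pass through unchanged, so the correction never overwrites valid measurements.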

Using the above methods, missing values across time windows are continuously inferred via the local-global attention mechanism, enabling rapid interpolation of random gaps in large-scale multi-sensor data. This improves dataset quality and supports long-term traffic state completion.

3.2 Multi-Source Information Completion Based on HD-AGCN

Spatial and temporal correlations are fundamental in studying the relationship between traffic data and road networks. Leveraging deep learning, the HD-AGCN model integrates graph convolutional networks and gated recurrent units to learn these spatiotemporal dependencies from traffic data. It is applied to the completion of long-term missing traffic data at urban road network intersections. The structure of the completion algorithm is shown in Fig. 5.

Figure 5: The structure of the completion algorithm. Specifically, “Missing” refers to labeling outlier or missing data entries, and “Valid” refers to labeling data validated as free of errors and gaps. Both are necessary preprocessing steps before the data are passed to the WA-SAI unit or the GRU/GCN-based completion unit for imputation.

Based on the filter constructed in the GCN model, the spatial features between the network nodes are captured, and the topology of the network is encoded to obtain the spatial relationship. According to the road network graph model $G$, each intersection is regarded as a node, where $V$ is the set of road nodes and $E$ is the set of edges between vertices. The adjacency matrix $A\in\mathbb{R}^{N\times N}$ represents the adjacency relationship between intersections. The feature matrix $X\in\mathbb{R}^{N\times P}$ represents the attributes of the nodes in the road network, that is, the intersection traffic status data, where $N$ is the number of nodes and $P$ is the number of node features.

The goal of the model is to complete long-term traffic data lost through equipment power failure or network interruption within a certain period, based on the corrected traffic data of the road, including traffic speed, traffic flow, and traffic density. The average traffic speed and the average queue length are considered in the tests of traffic data completion scenarios. The feature matrix of the input layer is defined as $X$ according to the road network graph $G$, where $X_t\in\mathbb{R}^{N\times i}$ represents the average traffic speed of the road nodes at time $i$.

The average traffic speed at the road network intersection during the time period k is:

$$\bar{V}=\frac{\sum_{i}^{k}\bar{v}_t}{k} \tag{21}$$

where $\bar{v}_t$ is the average speed of all connected vehicles within the intersection perception range at time $t$.

To capture spatial dependencies, we employ a two-layer GCN as proposed by Kipf & Welling [20]. The first GCN module aggregates features from each node and its immediate neighbors, while the second module further propagates information to second-order neighbors. The propagation rule follows the standard form:

$$H_{\Gamma,t}^{(l+1)}=\sigma\left(\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}X^{(l)}H^{(l)}\right) \tag{22}$$

where $H_{\Gamma,t}^{(l+1)}$ is the feature representation of node $\Gamma$, and $\Gamma$ includes the target node A and the first-order neighbor B. $\tilde{A}=A+I$ is the adjacency matrix with added self-loops, $I$ is the identity matrix, and $\tilde{D}$ is the degree matrix of $\tilde{A}$, so that $\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}$ is the normalized adjacency matrix. $X^{(l)}$ is the input feature matrix. In the first GCN module, which extracts the spatial relationship between each node and its directly adjacent neighbors, $X^{(0)}$ includes the features of the target node A and its first-order neighbor B, and $H^{(0)}$ is the weight matrix of the first GCN layer. The second GCN module periodically extracts features of data with low temporal variation. Here $X^{(1)}$ includes either the features of the first-order neighbor B and the other neighbor C, or the features of the target node A and its first-order neighbor B, and $H^{(1)}$ is the weight matrix of the second GCN layer. $\sigma$ is the activation function (such as ReLU).
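How stacking two layers of Eq. (22) reaches second-order neighbors can be checked on a toy graph. The sketch assumes an unweighted four-node path graph (the paper uses a weighted graph), a single feature, and a weight matrix of ones. All names are hypothetical.

```python
import numpy as np

def normalized_adjacency(a):
    """Compute D~^{-1/2} (A + I) D~^{-1/2} for a symmetric adjacency matrix."""
    a_tilde = a + np.eye(a.shape[0])           # add self-loops
    degree = a_tilde.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    return d_inv_sqrt[:, None] * a_tilde * d_inv_sqrt[None, :]

def gcn_layer(a_hat, x, w):
    """One propagation step of Eq. (22) with ReLU activation."""
    return np.maximum(a_hat @ x @ w, 0.0)

# Path graph of four intersections: 0 - 1 - 2 - 3.
a = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
a_hat = normalized_adjacency(a)

x = np.array([[1.0], [0.0], [0.0], [0.0]])     # only node 0 carries a signal
w = np.ones((1, 1))                            # assumed weights for illustration

h1 = gcn_layer(a_hat, x, w)    # reaches node 1 (first-order neighbor)
h2 = gcn_layer(a_hat, h1, w)   # reaches node 2 (second-order neighbor)
print(h1.ravel(), h2.ravel())
```

After one layer the signal from node 0 is visible only at nodes 0 and 1. After two layers it also reaches node 2, while node 3, three hops away, still receives nothing.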

The feature matrix obtained by the second GCN module remains unchanged within the set period, which is manually determined based on the completion time. In contrast, the first GCN module dynamically extracts evolving spatial relationship features in real-time, adapting to changes in the input data. The final process of splicing the spatial feature matrix at time t is expressed as follows:

$$H_t=\mathrm{Concat}\left(H_{B,t}^{(1)},H_{A,t}^{(2)}\right) \tag{23}$$

To model temporal dynamics, a GRU is integrated into the framework according to the method proposed by Cho et al. [21], as shown in Fig. 6. The GRU updates the hidden state as:

$$U_t=\sigma(W_u[f(X_t,H_{B,t},H_{A,t}),O_{t-1}]+b_u) \tag{24}$$

$$R_t=\sigma(W_r[f(X_t,H_{B,t},H_{A,t}),O_{t-1}]+b_r) \tag{25}$$

$$C_t=\tanh(W_c[f(X_t,H_{B,t},H_{A,t}),(R_t\ast O_{t-1})]+b_c) \tag{26}$$

$$O_t=U_t\ast O_{t-1}+(1-U_t)\ast C_t \tag{27}$$

where $f(X_t,H_{B,t},H_{A,t})=\mathrm{Concat}(X_t,H_t)$ represents the concatenation of the input with the graph convolution output, and $W$ and $b$ denote the weight matrices and bias vectors learned during training. The model generates the completed road network holographic traffic state dataset $Y_t$, which contains the average traffic speed, average traffic flow, and other traffic data.
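One GRU update following Eqs. (24)-(27) can be written in a few lines. The sketch assumes the graph-convolution features have already been concatenated into a single vector `f_t`. The parameter shapes, random initialization, and names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(f_t, o_prev, params):
    """One update of the hidden state O_t from f_t and O_{t-1}."""
    z = np.concatenate([f_t, o_prev])
    u = sigmoid(params["W_u"] @ z + params["b_u"])        # update gate, Eq. (24)
    r = sigmoid(params["W_r"] @ z + params["b_r"])        # reset gate, Eq. (25)
    z_c = np.concatenate([f_t, r * o_prev])
    c = np.tanh(params["W_c"] @ z_c + params["b_c"])      # stored content, Eq. (26)
    return u * o_prev + (1.0 - u) * c                     # Eq. (27)

rng = np.random.default_rng(0)
feat, hidden = 6, 4                                       # hypothetical sizes
params = {name: rng.normal(scale=0.1, size=(hidden, feat + hidden))
          for name in ("W_u", "W_r", "W_c")}
params.update({name: np.zeros(hidden) for name in ("b_u", "b_r", "b_c")})

o = np.zeros(hidden)                                      # initial state O_0
for _ in range(5):
    f_t = rng.normal(size=feat)                           # stand-in for Concat(X_t, H_t)
    o = gru_step(f_t, o, params)
print(o)
```

In Eq. (27) the update gate weights the previous state directly, so values of $U_t$ near 1 carry $O_{t-1}$ forward almost unchanged.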

Figure 6: The model structure of the long-term completion. $R_t$ is the reset gate, which controls how much of the state from the previous time step is ignored. $U_t$ is the update gate, which controls how much of the prior state is retained in the current output. $C_t$ is the content stored at time $t$. $O_{t-1}$ represents the output value at time $t-1$. $O_t$ is the output value at time $t$. tanh is the activation function. The completed result is $Y_t$.

The model training aims to minimize the error between the observed values and the completed values of the traffic data. $X_t$ stores the observed data, denoted by $x$, and $Y_t$ stores the completed data, denoted by $\hat{x}$. The completed data computed by the fully connected layer and the loss function of the HD-AGCN model are given in Eqs. (28) and (29), respectively. The second term $L_{2REG}$ applies L2 regularization to mitigate overfitting, and $\lambda$ is a hyperparameter controlling the strength of the regularization.

$$Y_t=W_y\ast O_t+b_y \tag{28}$$

$$Loss=\lVert x_i-\hat{x}_i\rVert+\lambda L_{2REG} \tag{29}$$
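A worked instance of Eq. (29) makes the two terms concrete. The sketch assumes the norm is Euclidean and that $L_{2REG}$ is the sum of squared weights, neither of which is specified in the text. The function name and the numbers are hypothetical.

```python
import numpy as np

def completion_loss(x_obs, x_hat, weights, lam):
    """Data term plus L2 penalty, following the form of Eq. (29)."""
    data_term = np.linalg.norm(x_obs - x_hat)           # assumed Euclidean norm
    l2_reg = sum(np.sum(w ** 2) for w in weights)       # assumed sum of squares
    return data_term + lam * l2_reg

x_obs = np.array([3.0, 4.0])
x_hat = np.array([0.0, 0.0])
weights = [np.array([1.0, 2.0])]
loss = completion_loss(x_obs, x_hat, weights, lam=0.1)
print(loss)
```

Here the data term is 5 and the penalty is 0.1 × 5, so the loss is 5.5. Raising $\lambda$ shifts the optimum toward smaller weights at the cost of a larger data term.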

By learning spatial correlations with the GCN from the complex topological structure of urban network nodes, and temporal correlations with the gated recurrent unit from the dynamic changes of traffic data, the constructed HD-AGCN model realizes the real-time completion of large-scale, long-term, continuous missing data and builds a complete traffic state holographic sensing dataset.

4 Experiments and Results Analysis

4.1 Field Test Environment and Parameter Setting

To validate the proposed method, a testbed was deployed at multiple intersections in the Beijing High-level Autonomous Driving Demonstration Area, as shown in Fig. 7. Ten intersections equipped with roadside sensors and Road Side Units (RSUs) were selected, with the intersection at Kechuang 1st Street and Jinghai 2nd Road designated as the missing data node. Firstly, V2X communication collected connected vehicle data over five working days at 10 frames per second (fps). Secondly, multi-source perception data were temporally aligned to the MEC system clock using linear interpolation. Spatial alignment transformed all detections into a unified Cartesian coordinate system via camera-radar calibration parameters. Finally, within each fusion window, the average traffic characteristics were computed as the arithmetic mean of all valid vehicle data from all sensors after outlier removal. Average traffic speed and queue length were selected as fusion targets at the feature level. Key test parameters are summarized in Tables 1 and 2.

Figure 7: Field-tested scenario of Beijing high-level autonomous driving demonstration area.

The dataset was split into 80% training and 20% testing. Training used PyTorch 2.1 on CUDA 12.1 with an exponential learning rate decay from 0.1 to 0.001, gradient clipping (max norm 40), and 3000 epochs. Performance was evaluated in three scenarios: (1) short-term interpolation of individual vehicle data using WA-SAI; (2) single-intersection completion for 5-min, 30-min, and 2-h gaps using HD-AGCN; (3) multi-intersection completion under simultaneous data loss.
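The per-epoch decay factor implied by an exponential schedule from 0.1 to 0.001 over 3000 epochs can be computed directly. The sketch assumes the rate is multiplied by a constant factor once per epoch and reaches 0.001 at the final epoch. The exact schedule used in the experiments is not stated, and the function name is hypothetical.

```python
def exponential_lr(epoch, lr_start=0.1, lr_end=0.001, epochs=3000):
    """Learning rate at a given epoch under a constant per-epoch decay factor."""
    gamma = (lr_end / lr_start) ** (1.0 / (epochs - 1))
    return lr_start * gamma ** epoch

gamma = (0.001 / 0.1) ** (1.0 / 2999)
print(gamma)
print(exponential_lr(0), exponential_lr(1500), exponential_lr(2999))
```

Under these assumptions the factor is slightly below 1 (about 0.9985), and it is the value that would be supplied as `gamma` to an exponential scheduler such as PyTorch's `ExponentialLR`.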

4.2 Analysis of Abnormal Data Identification and Correction

To verify that WA-SAI’s reliability is independent of the chosen time window, a random period was selected. Using V2X-connected vehicle speed data, 2000 observed data sets were collected and fused. The segmented quartile method identified 234 abnormal speed data sets. The process of identifying and eliminating abnormal speed data is illustrated in Figs. 8–10.

Figure 8: Initial V2X network data.

Figure 9: Pre-processing results of abnormal data.

Figure 10: Results of data correction.

Directly eliminating abnormal data disrupts the continuity of the dataset in the time series. The proposed data correction algorithm (WA-SAI) is therefore applied after the identified abnormal values are removed, as illustrated in Fig. 10. Measured as the weighted average of the difference between the observed values and the corresponding interpolated values, the overall error is approximately 2.875%.

4.3 Analysis of the Results of the Missing Data Completion Method

Based on the above steps, HD-AGCN is used to perform missing data completion tests on the average traffic speed and average queue length of the urban road network. The test focused on traffic data completion scenarios with missing intervals of 5 min, 30 min, and 2 h. Fig. 11 shows the average traffic speed and average queue length data at different time granularities and durations. The completion effect diagrams obtained from the trained model illustrate the fitting effect between the observed values and the completion values.

Figure 11: Completion results of average traffic speed.

The acquisition period of the first test group lasts 5 h, with data fusion time granularities of 5 s/frame and 20 s/frame. Periods demonstrating optimal perception and fusion performance at non-target intersections are considered as valid periods for data completion. During non-completion periods, short-term random deletions are introduced at the target intersection, totaling 82 frames, which equates to a short-term deletion rate of approximately 4.1%, with a maximum continuous deletion of 4 frames.

To create long-term missing scenarios, we manually removed data segments from the target period. For the 5-s granularity, two segments of 60 frames and 358 frames were removed, representing 5-min and 30-min completion intervals, respectively, as shown in Fig. 11a. For the 20-s granularity, 357 frames were removed, creating a 2-h completion interval, as shown in Fig. 11b. The resulting long-term missing rates were approximately 1.6%, 9.9%, and 12.89%, respectively.
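The correspondence between removed frames and gap durations follows from the fusion granularity, as this short arithmetic check shows (the helper name is hypothetical):

```python
def gap_minutes(frames, seconds_per_frame):
    """Duration in minutes of a gap of `frames` frames at a given granularity."""
    return frames * seconds_per_frame / 60.0

five_min = gap_minutes(60, 5)      # 5-s granularity, 60 frames
thirty_min = gap_minutes(358, 5)   # 5-s granularity, 358 frames
two_hours = gap_minutes(357, 20)   # 20-s granularity, 357 frames
print(five_min, thirty_min, two_hours)
```

The three gaps come to 5 min, about 29.8 min, and 119 min, matching the nominal 5-min, 30-min, and 2-h completion intervals.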

Test results show that under the same time granularity, short-term completion covers one congestion wave period, while long-term completion covers three periods. Despite different fluctuation amplitudes, HD-AGCN effectively tracked the actual observations, achieving comprehensive errors of about 8.2% for the 5-min completion, 9.8% for the 30-min completion, and 10.6% for the 2-h completion. The model reliably handles abrupt changes and continuity in real traffic conditions, further proving its spatiotemporal tracking capability.

In Fig. 12, Node 266, using only one GCN layer as an ablation experiment, yields a completion error of 37.2%. In contrast, Nodes 269, 273, and 274, which employ two GCN layers, achieve errors of 12.6%, 9.3%, and 3.72%. The higher accuracy for the latter three nodes is attributed to their better connectivity: each connects to the network via a single secondary trunk or branch road, whereas Node 266 connects via two secondary trunk roads and has relatively larger distances to its first- and second-order neighbors. These results indicate that completion accuracy improves with more adjacent nodes and closer proximity. Overall, the algorithm meets practical requirements, with lower errors for well-connected intersections.

Figure 12: Completion results of average traffic volume at four typical intersections (target nodes 266, 269, 273, 274). The geographic locations and basic configurations of these intersections are marked in Fig. 7. A 30-s time window is used for feature-level fusion, and 783 consecutive sampling points (approx. 6.5 h) are artificially removed to create missing data scenarios. Solid lines denote observed ground-truth traffic flow, and dashed lines denote values completed by the HD-AGCN model. The one-layer GCN module integrates only the features of first-order neighbors, whereas the two-layer GCN module integrates the features of both first-order and second-order neighbors.
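The difference between the one-layer and two-layer configurations can be illustrated with the standard GCN propagation rule of Kipf and Welling [20]. The sketch below is not the HD-AGCN implementation: the four-node chain graph, the random weights, and the function names are assumptions chosen only to show that one layer reaches first-order neighbors while two stacked layers also reach second-order neighbors.

```python
import numpy as np

def normalized_adjacency(adj: np.ndarray) -> np.ndarray:
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} with self-loops."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_forward(adj_norm: np.ndarray, x: np.ndarray, weights: list) -> np.ndarray:
    """Stack len(weights) GCN layers: ReLU between layers, linear output."""
    h = x
    for i, w in enumerate(weights):
        h = adj_norm @ h @ w
        if i < len(weights) - 1:
            h = np.maximum(h, 0.0)
    return h

# Hypothetical chain of four intersections: 0 - 1 - 2 - 3.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
a_norm = normalized_adjacency(adj)

# Which nodes can influence node 0 after one and after two propagation steps.
one_hop = a_norm != 0
two_hop = (a_norm @ a_norm) != 0
print(one_hop[0].tolist())  # [True, True, False, False]
print(two_hop[0].tolist())  # [True, True, True, False]

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))            # 3 placeholder features per node
w_single = rng.normal(size=(3, 1))
w0, w1 = rng.normal(size=(3, 8)), rng.normal(size=(8, 1))
out_one_layer = gcn_forward(a_norm, x, [w_single])
out_two_layer = gcn_forward(a_norm, x, [w0, w1])
```

In this toy graph, node 2 is a second-order neighbor of node 0, so its features can only affect node 0's output when two layers are stacked.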

Compared with average traffic speed completion (Fig. 11), average traffic volume completion (Fig. 12) involves longer durations, higher volatility, and non-periodic behavior. As shown in Fig. 12, HD-AGCN captures dynamic intersection correlations via its hierarchical adaptive GCN-GRU mechanism, markedly improving accuracy in complex traffic settings. The speed completion results indicate that at the fine 5 s/frame and 20 s/frame granularities, imputed data preserve high detail fidelity, with errors notably reduced under high-frequency sampling, which ensures continuity in speed and flow. For traffic volume completion, HD-AGCN's accuracy advantage depends on node connectivity, proximity, and the hierarchical GCN layer configuration, making it especially suited to network-level imputation. Relative to conventional methods, HD-AGCN provides enhanced adaptability and stability across large-scale networks. In sum, HD-AGCN exhibits strong robustness in synchronous multi-intersection processing and offers a practical solution for multi-node road network completion.

4.4 Reliability Analysis of the Complete Model

Based on the above tests, reliability comparisons are conducted between the data correction algorithm with the self-attention mechanism (WA-SAI) and five methods: Mean Imputation (MI) [22], Random Imputation (RI) [23], k-Nearest Neighbor (KNN) [24], Generative Adversarial Networks (GAN) [25], and Long-Short Term Transformer-based Network (LSTTN) [26]. The results are shown in Table 3.

These test results indicate that in the interpolation test of 234 sets of speed data, WA-SAI achieves the smallest population stability index (PSI), mean square error (MSE), and relative accuracy (RA) values, and the largest correlation coefficient (CC) value. Specifically, the PSI, MSE, and RA values are reduced by 8.3%, 17.7%, and 4.4%, respectively, compared to the best-performing baseline, and the CC value is increased by 1.6% compared to the state-of-the-art method, LSTTN. These comparative results demonstrate that the revised algorithm offers higher data reliability.
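For readers who want to reproduce this kind of comparison, the sketch below computes three of the reported indicators using their common textbook definitions. The equal-width binning used for PSI, the clipping constant `eps`, and the synthetic speed data are assumptions, and RA is omitted because its definition is not given in this section.

```python
import numpy as np

def mse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean square error between observed and imputed values."""
    return float(np.mean((y_true - y_pred) ** 2))

def correlation_coefficient(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Pearson correlation coefficient between observed and imputed values."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10, eps: float = 1e-6) -> float:
    """Population stability index over shared equal-width bins (assumed binning)."""
    lo = min(expected.min(), actual.min())
    hi = max(expected.max(), actual.max())
    edges = np.linspace(lo, hi, bins + 1)
    e_frac = np.histogram(expected, bins=edges)[0] / expected.size
    a_frac = np.histogram(actual, bins=edges)[0] / actual.size
    # Clip empty bins so the logarithm stays finite.
    e_frac = np.clip(e_frac, eps, None)
    a_frac = np.clip(a_frac, eps, None)
    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))

# Synthetic stand-in for 234 speed samples (hypothetical data, not the paper's).
rng = np.random.default_rng(42)
observed = 40.0 + 5.0 * np.sin(np.linspace(0.0, 6.0, 234))
imputed = observed + rng.normal(scale=0.5, size=observed.size)

print(mse(observed, imputed))
print(correlation_coefficient(observed, imputed))
print(psi(observed, imputed))
```

Lower PSI and MSE and a higher CC indicate that the imputed distribution and values stay closer to the observations, which is the direction of improvement reported for WA-SAI.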

Furthermore, comparative tests on completion reliability are conducted between HD-AGCN and the methods LSTTN [26], GCN [27], LSTM [28], GRU [29], and Support Vector Regression (SVR) [30]. The test focuses on average traffic speed completion over three periods: 5 min, 30 min, and 2 h. The results obtained from testing different methods are presented in Table 4.

As shown in Table 4, for completion periods of 5 min, 30 min, and 2 h, the root mean square error (RMSE) and mean absolute error (MAE) decrease while accuracy and the coefficient of determination (R²) increase with longer periods. HD-AGCN consistently achieves the best performance across all metrics. Compared to the best-performing baseline, HD-AGCN reduces RMSE by 3.4% and MAE by 1.1%, and increases accuracy by 4.3% and R² by 0.4%.

In the 5-min test, HD-AGCN reduces RMSE by 56.4% relative to the spatial-only GCN model; at 30 min, the reduction is 55.1%, confirming the benefit of incorporating temporal information. Against the temporal-only GRU and LSTM models, HD-AGCN reduces RMSE by 3.7% and 2.8% at 5 min, and by 3.1% and 4.3% at 30 min, demonstrating the advantage of integrating spatial information. In the 2-h test, HD-AGCN again achieves the smallest error and highest accuracy. These results validate that HD-AGCN effectively leverages both spatial and temporal dependencies, offering high reliability for long-term traffic data completion in large-scale urban networks.
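The evaluation metrics and the relative reductions quoted above can be sketched as follows. RMSE, MAE, and R² use their standard definitions; the accuracy formula (one minus the ratio of the error norm to the ground-truth norm) is an assumption borrowed from common practice in traffic forecasting and may differ from the definition used in the paper. The toy arrays are hypothetical.

```python
import numpy as np

def completion_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """RMSE, MAE, accuracy (assumed definition), and R^2 for completed values."""
    err = y_true - y_pred
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mae = float(np.mean(np.abs(err)))
    # Assumption: accuracy = 1 - ||Y - Y_hat|| / ||Y||.
    accuracy = float(1.0 - np.linalg.norm(err) / np.linalg.norm(y_true))
    r2 = float(1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2))
    return {"rmse": rmse, "mae": mae, "accuracy": accuracy, "r2": r2}

def relative_reduction(baseline: float, proposed: float) -> float:
    """Percentage by which `proposed` is lower than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline

# Toy example with hypothetical speeds in km/h.
y_true = np.array([40.0, 42.0, 38.0, 45.0])
y_pred = np.array([41.0, 41.0, 39.0, 44.0])
metrics = completion_metrics(y_true, y_pred)
print(metrics["rmse"], metrics["mae"])  # 1.0 1.0
print(relative_reduction(2.0, 1.5))     # 25.0
```

A statement such as "reduces RMSE by 56.4% relative to GCN" corresponds to `relative_reduction(rmse_gcn, rmse_hd_agcn)` evaluated on the same test period.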

5 Conclusion

In this paper, a real-time traffic state completion method is presented for urban road networks using edge computing and AI-driven deep learning. The proposed HD-AGCN model integrates GCN and GRU to capture spatiotemporal correlations of traffic flow, enabling high-fidelity holographic perception. The model is trained to generate long-term completed data for missing traffic states, enhancing data completeness and continuity, thereby improving the reliability of ITS.

Experimental results show that the completed data closely matches real-time traffic conditions without significant oscillation or overfitting. The method effectively addresses long-term continuous data gaps by leveraging data from adjacent intersections or those with similar traffic flow characteristics. Moreover, the integration of a short-term self-attention interpolation model corrects sensor errors, ensuring completion accuracy. Future work will extend this method to develop a traffic efficiency evaluation index for urban trunk roads and explore its application in security-critical scenarios.

Acknowledgement: The authors would like to express their gratitude to North China University of Technology and Beijing Connected and Autonomous Vehicles Technology Co., Ltd. for their strong support, which provided an important guarantee for the smooth development of this study.

Funding Statement: This study was supported in part by Beijing Natural Science Foundation under Grant L251058 and in part by Project of State Key Lab of Intelligent Transportation System under Grant 2024-A001.

Author Contributions: Conceptualization, Pangwei Wang and Jie Wang; Data curation, Hangrui Dong and Li Wang; Formal analysis, Jie Wang; Investigation, Zipeng Wang; Methodology, Pangwei Wang, Jie Wang and Zipeng Wang; Validation, Pangwei Wang and Li Wang; Visualization, Hangrui Dong; Writing—original draft, Jie Wang and Zipeng Wang. All authors reviewed and approved the final version of the manuscript.

Availability of Data and Materials: Data available on request from the authors.

Ethics Approval: Not applicable.

Conflicts of Interest: The authors declare no conflicts of interest.

References

1. Yan X, Chu D, Liu J, Jiang Z, He Y. The current state, challenges, and prospects of intelligent transportation development. Transp Res. 2021;7(6):2. (In Chinese). doi:10.16503/j.cnki.2095-9931.2021.06.001.

2. Jafari M, Kavousi-Fard A, Chen T, Karimi M. A review on digital twin technology in smart grid, transportation system and smart city: challenges and future. IEEE Access. 2023;11:17471–84. doi:10.1109/ACCESS.2023.3241588.

3. Liang Z, Shi B, Dong B, Wei H. Management actions enhanced traffic prediction under sparse data. IEEE Trans Intell Transp Syst. 2025;26(11):20503–18. doi:10.1109/TITS.2025.3588274.

4. Weerakody PB, Wong KW, Wang G, Ela W. A review of irregular time series data handling with gated recurrent neural networks. Neurocomputing. 2021;441(7):161–78. doi:10.1016/j.neucom.2021.02.046.

5. Du W, Côté D, Liu Y. SAITS: self-attention-based imputation for time series. Expert Syst Appl. 2023;219(10):119619. doi:10.1016/j.eswa.2023.119619.

6. Chen J, Yang L, Yang Y, Peng L, Ge X. Spatio-temporal graph neural networks for missing data completion in traffic prediction. Int J Geogr Inf Sci. 2025;39(5):1057–75. doi:10.1080/13658816.2024.2381221.

7. Qi X, Mei G, Tu J, Xi N, Piccialli F. A deep learning approach for long-term traffic flow prediction with multifactor fusion using spatiotemporal graph convolutional network. IEEE Trans Intell Transp Syst. 2023;24(8):8687–700. doi:10.1109/TITS.2022.3201879.

8. Chen H, Huang J, Lu Y, Huang J. Urban traffic flow prediction based on multi-spatio-temporal feature fusion. Neurocomputing. 2025;638(4):130117. doi:10.1016/j.neucom.2025.130117.

9. Chen X, Liang S, Zhang Z, Zhao F. A novel spatiotemporal data low-rank imputation approach for traffic sensor network. IEEE Internet Things J. 2022;9(20):20122–35. doi:10.1109/JIOT.2022.3172447.

10. Dong H, Ding F, Tan H, Zhang H. Laplacian integration of graph convolutional network with tensor completion for traffic prediction with missing data in inter-city highway network. Phys A Stat Mech Appl. 2022;586(2):126474. doi:10.1016/j.physa.2021.126474.

11. Chen Y, Chen X. A novel reinforced dynamic graph convolutional network model with data imputation for network-wide traffic flow prediction. Transp Res Part C Emerg Technol. 2022;143(3):103820. doi:10.1016/j.trc.2022.103820.

12. Jiang J, Han C, Zhao WX, Wang J. PDFormer: propagation delay-aware dynamic long-range transformer for traffic flow prediction. Proc AAAI Conf Artif Intell. 2023;37(4):4365–73. doi:10.1609/aaai.v37i4.25556.

13. Zhang Z, Yang H, Yang X. A transfer learning–based LSTM for traffic flow prediction with missing data. J Transp Eng Part A Syst. 2023;149(10):04023095. doi:10.1061/jtepbs.teeng-7638.

14. Huang X, Jiang Y, Tang J. MAPredRNN: multi-attention predictive RNN for traffic flow prediction by dynamic spatio-temporal data fusion. Appl Intell. 2023;53(16):19372–83. doi:10.1007/s10489-023-04494-8.

15. Meng X, Fu H, Peng L, Liu G, Yu Y, Wang Z, et al. D-LSTM: short-term road traffic speed prediction model based on GPS positioning data. IEEE Trans Intell Transp Syst. 2022;23(3):2021–30. doi:10.1109/TITS.2020.3030546.

16. Zhang X, Zhao Y, Wang S, Sun Y, Yin B. A tensorial weighted Schatten-p norm model with neighbor regularization for traffic data completion and traffic system correlation exploration. Neurocomputing. 2023;559(7):126765. doi:10.1016/j.neucom.2023.126765.

17. Liu S, Dai S, Sun J, Mao T, Zhao J, Zhang H. Multicomponent spatial-temporal graph attention convolution networks for traffic prediction with spatially sparse data. Comput Intell Neurosci. 2021;2021(1):9134942. doi:10.1155/2021/9134942.

18. Nie T, Qin G, Wang Y, Sun J. Correlating sparse sensing for large-scale traffic speed estimation: a Laplacian-enhanced low-rank tensor Kriging approach. Transp Res Part C Emerg Technol. 2023;152(7):104190. doi:10.1016/j.trc.2023.104190.

19. Wu X, Xu M, Fang J, Wu X. A multi-attention tensor completion network for spatiotemporal traffic data imputation. IEEE Internet Things J. 2022;9(20):20203–13. doi:10.1109/JIOT.2022.3171780.

20. Kipf TN, Welling M. Semi-supervised classification with graph convolutional networks. In: Proceedings of the 5th International Conference on Learning Representations (ICLR2017); 2017 Apr 24–26; Toulon, France.

21. Cho K, van Merrienboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP2014); 2014 Oct 25–29; Doha, Qatar. doi:10.3115/v1/d14-1179.

22. Shi H, Wang P, Yang X, Yu H. An improved mean imputation clustering algorithm for incomplete data. Neural Process Lett. 2022;54(5):3537–50. doi:10.1007/s11063-020-10298-5.

23. Xu Y, Zhuang F, Wang E, Li C, Wu J. Learning without missing-at-random prior propensity-a generative approach for recommender systems. IEEE Trans Knowl Data Eng. 2025;37(2):754–65. doi:10.1109/TKDE.2024.3490593.

24. Zhang L, Liu Q, Yang W, Wei N, Dong D. An improved K-nearest neighbor model for short-term traffic flow prediction. In: Proceedings of the 13th COTA International Conference of Transportation Professionals (CICTP2013); 2013 Aug 13–16; Shenzhen, China.

25. Zhang W, Zhang P, Yu Y, Li X, Biancardo SA, Zhang J. Missing data repairs for traffic flow with self-attention generative adversarial imputation net. IEEE Trans Intell Transp Syst. 2022;23(7):7919–30. doi:10.1109/TITS.2021.3074564.

26. Luo Q, He S, Han X, Wang Y, Li H. LSTTN: a Long-Short Term Transformer-based spatiotemporal neural network for traffic flow forecasting. Knowl Based Syst. 2024;293(2):111637. doi:10.1016/j.knosys.2024.111637.

27. Liang Y, Zhao Z, Sun L. Memory-augmented dynamic graph convolution networks for traffic data imputation with diverse missing patterns. Transp Res Part C Emerg Technol. 2022;143(7):103826. doi:10.1016/j.trc.2022.103826.

28. Shen G, Zhou W, Zhang W, Liu N, Liu Z, Kong X. Bidirectional spatial-temporal traffic data imputation via graph attention recurrent neural network. Neurocomputing. 2023;531(21):151–62. doi:10.1016/j.neucom.2023.02.017.

29. Hu N, Zhang D, Xie K, Liang W, Diao C, Li KC. Multi-range bidirectional mask graph convolution based GRU networks for traffic prediction. J Syst Archit. 2022;133(5):102775. doi:10.1016/j.sysarc.2022.102775.

30. Li MW, Hong WC, Kang HG. Urban traffic flow forecasting using Gauss-SVR with cat mapping, cloud model and PSO hybrid algorithm. Neurocomputing. 2013;99(36):230–40. doi:10.1016/j.neucom.2012.08.002.

Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
