YOLOv10-HQGNN: A Hybrid Quantum Graph Learning Framework for Real-Time Faulty Insulator Detection
1 AI Lab, Faculty of Information Technology, Ho Chi Minh City Open University, 35-37 Ho Hao Hon Street, Co Giang Ward, District 1, Ho Chi Minh City, 700000, Vietnam
2 Department of Applied Science, Faculty of Science and Technology, Suan Sunandha Rajabhat University, 1 U Thong Nok Rd, Dusit, Dusit District, Bangkok, 10300, Thailand
* Corresponding Authors: Vinh Truong Hoang; Kittikhun Meethongjan
(This article belongs to the Special Issue: Emerging Machine Learning Methods and Applications)
Computers, Materials & Continua 2026, 86(3), 75 https://doi.org/10.32604/cmc.2025.069587
Received 26 June 2025; Accepted 30 October 2025; Issue published 12 January 2026
Abstract
Ensuring the reliability of power transmission networks depends heavily on the early detection of faults in key components such as insulators, which serve both mechanical and electrical functions. Even a single defective insulator can lead to equipment breakdown, costly service interruptions, and increased maintenance demands. While unmanned aerial vehicles (UAVs) enable rapid and cost-effective collection of high-resolution imagery, accurate defect identification remains challenging due to cluttered backgrounds, variable lighting, and the diverse appearance of faults. To address these issues, we introduce a real-time inspection framework that integrates an enhanced YOLOv10 detector with a Hybrid Quantum-Enhanced Graph Neural Network (HQGNN). The YOLOv10 module, fine-tuned on domain-specific UAV datasets, improves detection precision, while the HQGNN ensures multi-object tracking and temporal consistency across video frames. This synergy enables reliable and efficient identification of faulty insulators under complex environmental conditions. Experimental results show that the proposed YOLOv10-HQGNN model surpasses existing methods across all metrics, achieving Recall of 0.85 and Average Precision (AP) of 0.83, with clear gains in both accuracy and throughput. These advancements support automated, proactive maintenance strategies that minimize downtime and contribute to a safer, smarter energy infrastructure.
1 Introduction
Insulators are indispensable components of overhead power transmission systems, serving dual functions of electrical insulation and mechanical support. Modern insulators have evolved into disc-shaped designs, typically made of glass or ceramics, suspended from transmission towers. These designs enable long-distance electricity transmission while reducing line capacitance. However, prolonged exposure to intense electric fields and adverse environmental conditions leads to issues such as self-explosion, physical damage, and leakage current.
Physical damage and surface defects are the most common issues, often resulting from pollution flashovers. Accumulated pollutants on the insulator surface can form a conductive layer when exposed to moisture, compromising electrical properties. These defects contribute to 40%–60% of power grid failures, emphasizing the importance of timely detection to prevent outages and financial losses.
As shown in Fig. 1, Unmanned Aerial Vehicles (UAVs) have revolutionized power line inspections by replacing manual patrols with an efficient, systematic approach [1]. Equipped with high-resolution cameras, UAVs capture detailed imagery of insulators, enabling streamlined maintenance operations. However, accurate identification of insulators in UAV imagery is complicated by variations in insulator types, viewing angles, lighting conditions, occlusions, and background complexity.

Figure 1: Insulator inspection with UAVs
Recent advancements in deep learning have led to flexible object detection algorithms, significantly improving image classification and recognition tasks. Quantum Machine Learning (QML) [2] combines quantum computing with classical ML techniques, enabling models to operate in high-dimensional Hilbert spaces. This enhances feature extraction capabilities, improving accuracy and generalization. Quantum systems excel at optimization through quantum parallelism, enabling simultaneous evaluation of multiple solutions and achieving faster convergence during training.
This study presents an innovative approach for detecting faulty insulators by integrating YOLOv10 [3] with Graph Neural Networks (GNNs) [4], optimized using Quantum Machine Learning, forming a Hybrid Quantum-Enhanced Graph Neural Network (HQGNN). The proposed system leverages transfer learning to enhance feature extraction from UAV-captured images, effectively handling variations in lighting, occlusions, and complex backgrounds. It incorporates graph-based analysis to model spatial relationships between insulators and integrates quantum computing techniques to enhance computational efficiency and representation learning.
The system employs transfer learning with YOLOv10, fine-tuned on aerial datasets under diverse environmental conditions. GNNs model spatial and temporal relationships between objects, extracting features like proximity and interaction patterns. Quantum algorithms optimize the system by accelerating training and improving performance in handling high-dimensional UAV imagery. Experimental validation shows the system significantly enhances efficiency by automating inspections and improving accuracy in defect detection.
The remainder of this paper is organized as follows: Section 2 reviews related work, Section 3 details the methodology, Section 4 presents the experimental results, Section 5 discusses the findings, and Section 6 concludes the study.
2 Related Work
Maintaining reliable electricity supply depends on timely detection and repair of insulator defects. Traditional inspection methods, such as helicopter patrols and field assessments, are resource intensive and pose safety risks, making frequent inspections impractical.
UAVs have emerged as a revolutionary tool for power line inspections, offering enhanced efficiency and safety. However, UAV-based inspections introduce challenges in analyzing acquired imagery, including complex backgrounds, varying lighting, and accurately identifying small defects.
Deep learning algorithms (Table 1) have become the cornerstone of modern object detection systems. YOLOv7 [11] set new benchmarks in performance by excelling in both speed and accuracy. However, its application to insulator defect detection remains underexplored, particularly with intricate backgrounds and small-scale targets.
Neural network-based insulator-defect detection systems can be categorized into:
• Two-Stage Models: These separate region proposal and classification. Examples include R-CNNs [12], Fast R-CNNs [13], and Faster R-CNNs [14]. Lu et al. [15] optimized anchor box selection in Faster R-CNN, achieving high accuracy but slow processing speed.
• One-Stage Models: These integrate detection and classification, prioritizing speed. Examples include YOLO and SSD. Wu et al. [16] introduced CenterNet with attention mechanisms [17]. Feng et al. [18] achieved 86.8% accuracy using YOLOv5. Liu et al. [19] developed MTI-YOLO with multi-scale feature detection, focusing only on normal insulators.
Emerging paradigms such as Variational Quantum Algorithms (VQAs) [2] and Graph Neural Networks (GNNs) [4] present opportunities for revolutionizing power line inspections. GNNs [20–22] have demonstrated effectiveness in Multi-Object Tracking (MOT) [23], capturing spatial and temporal relationships between insulators.
To address these challenges, we propose a framework that integrates advanced object detection with GNNs enhanced by quantum neural networks (QNNs), referred to as HQGNNs:
1. Augmented Fine-Tuning: YOLOv10 is fine-tuned using aerial insulator datasets featuring diverse backgrounds and defect types.
2. GNN Integration: GNNs model relationships between insulators, capturing contextual relationships for superior accuracy.
3. Quantum Optimization: QML optimizes GNN performance, enabling efficient learning in scenarios involving small targets and complex factors.
3 Methodology
As illustrated in Fig. 2, the proposed deep learning-based architecture consists of two primary phases: model training and real-time faulty insulator detection. This framework integrates high-resolution aerial imagery captured by Unmanned Aerial Vehicles (UAVs) during power line inspections with state-of-the-art object detection models and Hybrid Quantum-Enhanced Graph Neural Networks (HQGNNs) to achieve superior detection accuracy, scalability, and robustness. The training process begins with the collection of high-resolution aerial images from UAV inspections. These raw images undergo preprocessing to ensure they are optimized for training, which includes resizing for uniformity, cropping to focus on insulator regions, and labeling for supervised learning, where insulators are classified as either normal or faulty. The goal of preprocessing is to transform UAV imagery into a clean, structured dataset that improves model consistency and accuracy. Once preprocessing is complete, the training phase fine-tunes a YOLOv10-based object detection model [3] to accurately detect and classify insulators in challenging real-world conditions. The model initialization starts with pre-trained weights from the COCO dataset, which enables it to generalize well across various backgrounds. The model is then fine-tuned using a custom dataset containing diverse backgrounds, lighting variations, and defect types to enhance robustness and precision.

Figure 2: Proposed method
Once trained, the model is deployed for real-time faulty insulator detection. UAVs equipped with high-resolution cameras continuously capture aerial imagery and real-time video streams. These data are first transmitted via Radio Frequency (RF) signals to a ground-based receiver (the UAV remote controller) and then relayed via the Real-Time Messaging Protocol (RTMP) over a Wi-Fi network to the main computer, where the trained deep learning model processes them for detection and analysis. The model automatically detects, classifies, and tracks insulators across diverse environmental conditions. Through frame-by-frame analysis, the system not only identifies the presence and type of insulators but also monitors their condition, detects potential faults, and continuously tracks their position as the UAV navigates along power lines. This integrated capability enables proactive inspection, reduces reliance on labor-intensive manual surveys, and supports timely, data-driven maintenance decisions, ultimately minimizing operational risks while lowering inspection costs.
The proposed system begins with an object detection module based on an augmented YOLOv10 architecture. Raw UAV imagery is first preprocessed to create a clean, structured dataset by reducing noise and improving feature visibility. The fine-tuned YOLOv10 model is then used to accurately classify insulators as either normal or faulty, even in visually complex scenes. A Re-Identification (Re-ID) module is integrated to assign consistent IDs across frames, maintaining tracking continuity despite occlusions, motion blur, or changes in perspective. To enhance tracking and relational reasoning, the system incorporates Hybrid Quantum-Enhanced Graph Neural Networks (HQGNNs). In this framework, insulators are modeled as graph nodes, while their spatial and temporal relationships are represented as edges. HQGNNs dynamically update node and edge features to capture contextual dependencies, enabling robust object association and improved fault localization. This graph-based approach surpasses conventional Multi-Object Tracking (MOT) by leveraging spatial proximity, motion patterns, and structural cues for more accurate and resilient performance.
The integration of YOLOv10, GNNs, and quantum computing offers substantial advantages over traditional methods. It enables real-time detection and tracking with high accuracy, even under challenging environmental conditions. The modular architecture supports easy adaptation to new datasets and scenarios, while the hybrid learning approach improves generalization and robustness. By automating UAV-based inspections, the system enhances efficiency, reduces operational costs, and strengthens the reliability and sustainability of modern power grid monitoring.
To address dataset scarcity, the framework adopts a transfer learning strategy by leveraging models pre-trained on MS COCO [24]. Although COCO does not include imagery from the electricity domain, it provides rich representations of shapes, textures, and spatial patterns that are valuable for the extraction of generic visual features. Nonetheless, two key challenges arise:
1. Domain Gap: Generic visual features are not directly optimized for insulator-specific contexts.
2. Data Size Gap: Relatively small insulator datasets increase the risk of overfitting when compared to large-scale corpora such as COCO.
To overcome these challenges, a two-phase fine-tuning strategy is employed, as illustrated in Fig. 3:

Figure 3: Two-phase transfer learning and fine-tuning strategy
1. Basic Dataset Training: Adapts the model to identify insulators in various terrains and configurations of the power grid.
2. Specific Dataset Fine-Tuning: Refines the model’s ability to detect faulty insulators under challenging operational conditions, supported by data augmentation to improve robustness.
The methodology employs three hierarchically structured datasets (the basic insulator dataset, the specific insulator dataset, and the faulty insulator dataset) to progressively train and fine-tune the deep learning framework for UAV-based inspection. Each dataset is systematically curated and preprocessed to address the unique challenges of insulator detection and defect diagnosis in aerial imagery.
The basic dataset forms the foundation of the first fine-tuning stage. It comprises UAV-captured images from geographically diverse regions, spanning multiple voltage levels and installation scenarios. This dataset emphasizes broad variability through:
• Insulator Types: Porcelain, composite, and glass insulators deployed in different power systems.
• Background Complexity: Diverse environments such as forests, deserts, urban areas, and industrial sites.
• Contextual Richness: Co-occurrence of towers, conductors, and grounding wires, which helps differentiate insulators from visually similar structures.
The specific dataset supports the second fine-tuning stage and focuses on fault-related scenarios, such as surface cracks, contamination, and structural damage. It also accounts for difficult operating conditions, including dense vegetation and industrial interference. To avoid overfitting, augmentation strategies, such as horizontal and vertical flipping, random cropping, and rotation, are applied to increase orientation variability and highlight defect regions.
Unlike conventional pipelines that require complete retraining for each new defect type or inspection region, the proposed approach supports modular updates through targeted datasets. This not only reduces computational costs but also improves scalability. By combining large-scale pretraining with structured domain adaptation, the framework achieves accurate, robust, and efficient detection of insulators and their defects, contributing to safer and more reliable operation of modern power transmission systems.
Training incorporates multiple regularization techniques to enhance robustness. Early stopping prevents overtraining when validation accuracy plateaus, while L1/L2 regularization constrains parameter growth. Dropout mitigates over-reliance on narrow feature subsets, and batch normalization stabilizes feature distributions, thereby accelerating convergence. These measures collectively improve generalization across UAV inspection tasks.
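The early-stopping rule mentioned above can be sketched in a few lines. This is a minimal illustration, not the training code used in this study: the class name `EarlyStopping`, the helper `run_until_stop`, and the `patience` and `min_delta` parameters are hypothetical, and the monitored score is assumed to be a validation accuracy where higher is better.

```python
class EarlyStopping:
    """Signal a stop once the validation score has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience      # epochs tolerated without improvement
        self.min_delta = min_delta    # smallest increase that counts as improvement
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, score):
        """Record one validation score; return True if training should stop."""
        if score > self.best + self.min_delta:
            self.best = score
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def run_until_stop(scores, patience):
    """Return the epoch index at which training would stop, or None if it never does."""
    stopper = EarlyStopping(patience=patience)
    for epoch, score in enumerate(scores):
        if stopper.step(score):
            return epoch
    return None
```

In a real training loop, `step` would be called once per epoch with the validation metric, and the weights from the best epoch would typically be restored after stopping.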
Given the limited availability of domain-specific UAV datasets, augmentation is central to dataset expansion. Morphological transformations (rotations, translations, and brightness adjustments) replicate diverse flight paths, orientations, and lighting conditions. More advanced augmentation strategies further increase variability: Mosaic augmentation [25] integrates four images into one to improve contextual diversity, Mix-Up blends pixel-level interpolations to promote smoother decision boundaries, and color space transformations (hue, saturation, exposure) simulate dynamic environmental conditions. Collectively, these augmentations reduce overfitting and enhance model resilience.
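As a rough illustration of the augmentations listed above, the following NumPy sketch implements a horizontal flip with box mirroring, Mix-Up blending, and a simplified mosaic. It is a sketch under stated assumptions rather than the pipeline used in this study: the function names are hypothetical, boxes are assumed to be `[x1, y1, x2, y2]` in pixels, and the mosaic is a fixed 2x2 tiling of equally sized images without the random centre and box remapping that full Mosaic augmentation [25] applies.

```python
import numpy as np


def horizontal_flip(image, boxes):
    """Flip an HxWxC image left-right and mirror its [x1, y1, x2, y2] boxes."""
    width = image.shape[1]
    flipped = image[:, ::-1].copy()
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    new_boxes = boxes.copy()
    new_boxes[:, 0] = width - boxes[:, 2]   # new left edge from old right edge
    new_boxes[:, 2] = width - boxes[:, 0]   # new right edge from old left edge
    return flipped, new_boxes


def mixup(image_a, image_b, lam):
    """Blend two same-sized images pixel-wise: lam * a + (1 - lam) * b."""
    return lam * image_a.astype(float) + (1.0 - lam) * image_b.astype(float)


def mosaic_2x2(images):
    """Tile four same-sized images into a single 2x2 mosaic (simplified)."""
    if len(images) != 4:
        raise ValueError("mosaic_2x2 expects exactly four images")
    top = np.concatenate([images[0], images[1]], axis=1)
    bottom = np.concatenate([images[2], images[3]], axis=1)
    return np.concatenate([top, bottom], axis=0)
```

In Mix-Up, the mixing weight `lam` is commonly drawn from a Beta distribution and the labels are blended with the same weight; both details are left out here for brevity.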
Annotation quality plays a decisive role. As shown in Fig. 4, the LabelImg tool [26] is used to assign unique IDs, bounding boxes, and class labels (normal or faulty). Accurate annotations provide consistent supervision during training, directly impacting detection precision and evaluation reliability.

Figure 4: Annotation of insulator imagery
As shown in Fig. 2, the real-time detection phase consists of two main components: the object detection model, which is fine-tuned on the datasets described above, and the Hybrid Quantum-Enhanced Graph Neural Network (HQGNN), which handles data association, aggregation, and classification.
YOLOv10 (You Only Look Once), proposed by Wang et al. (2024), has become a hallmark of innovation in the field of object detection. This method uses a unified framework that performs both object detection and classification in a single pass through the network. To achieve this, the input image is divided into a grid, with each grid cell responsible for predicting bounding boxes and object class probabilities. This approach minimizes computational overhead, significantly boosting the potential for real-time detection, which is crucial for applications such as insulator detection where speed and accuracy are paramount.
The YOLO architecture has undergone multiple iterations, beginning with version v1 in 2016 and progressing to v5 in 2020, each version bringing substantial improvements in both speed and accuracy. YOLOv10 introduces several novel mechanisms designed to further enhance its efficiency and precision in object detection. Key innovations in YOLOv10 include Non-Maximum Suppression (NMS)-free training, dual-label assignments, and consistent matching metrics. The matching process in YOLOv10 is governed by a consistent metric that enables the system to make prediction-based assignments during the detection phase. The uniform metric applied to the matching strategy is expressed as follows:
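$$m(\alpha, \beta) = s \cdot p^{\alpha} \cdot \mathrm{IoU}(\hat{b}, b)^{\beta}$$
where $p$ is the classification score, $\hat{b}$ and $b$ denote the predicted and ground-truth bounding boxes, $s$ is the spatial prior indicating whether the anchor point of the prediction lies within the instance, and $\alpha$ and $\beta$ are hyperparameters that balance the classification and localization terms [3].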
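To make the matching metric concrete, the sketch below evaluates the form $m = s \cdot p^{\alpha} \cdot \mathrm{IoU}^{\beta}$ given in the YOLOv10 paper [3] for a single prediction. The function names and the default values of `alpha` and `beta` are illustrative assumptions, not settings reported in this study, and boxes are assumed to be `[x1, y1, x2, y2]`.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matching_metric(cls_score, pred_box, gt_box, anchor_point, alpha=0.5, beta=6.0):
    """Compute m = s * p**alpha * IoU**beta for one prediction and one instance.

    s is 1 when the anchor point lies inside the ground-truth box, else 0.
    alpha and beta are illustrative defaults (assumed, not from this paper).
    """
    x, y = anchor_point
    inside = gt_box[0] <= x <= gt_box[2] and gt_box[1] <= y <= gt_box[3]
    s = 1.0 if inside else 0.0
    return s * (cls_score ** alpha) * (box_iou(pred_box, gt_box) ** beta)
```

A large `beta` relative to `alpha` makes the metric favour well-localized boxes over confidently classified but poorly aligned ones, which is the balance the two hyperparameters control.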
Furthermore, YOLOv10 adopts a composite loss function that integrates three main components: (i) the classification loss (
where
YOLOv10 represents a significant leap forward in object detection technology, with its innovative approach to handling label assignments and eliminating traditional computational bottlenecks. The suitability of the model for real-time applications, such as insulator detection, underscores its practical utility in industrial settings where both speed and precision are critical. As YOLO continues to evolve, it is poised to remain at the forefront of advances in computer vision, driving further improvements in detection accuracy and operational efficiency across a variety of domains.
3.2.2 Hybrid Quantum-Enhanced Graph Neural Network (HQGNN)
The proposed Hybrid Quantum-Enhanced Graph Neural Network (HQGNN) merges the strengths of quantum computing with the established capabilities of graph neural networks (GNNs) [4] to enhance graph-based learning tasks by leveraging quantum mechanics. This hybrid architecture incorporates quantum circuits to capture complex, high-dimensional relationships between nodes in a graph, while classical GNNs are responsible for feature extraction and aggregation. By combining these quantum and classical components, a unified loss function optimizes both parts of the model, ultimately improving overall performance.
In this hybrid model, the GNN plays an essential role in modeling the relationships between objects of varying scales across both spatial and temporal domains. The GNN constructs a graph, where nodes represent objects or entities, and edges denote the relationships between them. It then iteratively updates the node features via a process known as node feature aggregation. This process is crucial for the network to capture complex interactions between objects, enabling accurate spatial-temporal modeling. Such capabilities are particularly beneficial in scenarios involving dynamic or time-evolving graph structures, such as tracking objects over time or modeling interactions in a network of nodes with evolving relationships.
The integration of quantum components in HQGNNs allows the network to model more complex interactions than classical GNNs alone. Quantum circuits can represent superposition and entanglement, allowing them to explore the relationships between nodes at a higher level of complexity. By applying quantum gates to encode and process information in superposition states, the quantum layer of HQGNNs can discover intricate patterns that classical GNNs might struggle to identify. This capability is especially useful in high-dimensional or non-linear data structures where quantum mechanics can uncover deeper relationships between objects. Typically, the quantum layer is tasked with capturing intricate, high-order interactions between nodes and can also take part in the final classification process, enhancing the model’s ability to make more accurate predictions.
As depicted in Fig. 5, the Hybrid Quantum-Enhanced Graph Neural Network (HQGNN) architecture is composed of multiple stages, each tailored to leverage the complementary strengths of classical graph neural networks (GNNs) and quantum computing. By embedding quantum circuits into the GNN pipeline for feature extraction and aggregation, HQGNN significantly improves the modeling of complex dependencies in graph-structured data. The overall workflow is summarized in Algorithm 1, with the key stages of the architecture outlined as follows:

Figure 5: HQGNN network architecture
Feature Extraction. At each time frame
To effectively capture temporal dependencies between consecutive frames, we employ the ROIAlign [27] operation on both the reference region
Node Feature Association. After feature extraction, the data is transformed into a graph structure, where nodes represent individual objects (tracklets and new detections), and edges encode relationships or interactions, ensuring a structured and efficient representation of dependencies. This transformation is particularly crucial for multi-frame object tracking, as it allows the model to leverage spatial-temporal relationships that traditional deep learning methods might overlook. Nodes are initialized with features extracted from raw data, while edges are established based on similarity metrics, spatial proximity, or predefined connectivity rules, enabling the model to capture complex interdependencies and maintain temporal coherence across frames.
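The graph construction described above can be illustrated with a small NumPy sketch that connects tracklets to new detections when their appearance features are similar and their centres are close. The function name, the cosine-similarity measure, and both thresholds are hypothetical choices made for illustration; the connectivity rules actually used in this study may differ.

```python
import numpy as np


def build_association_graph(track_feats, det_feats, track_centers, det_centers,
                            sim_threshold=0.5, max_distance=100.0):
    """Build tracklet-to-detection edges from feature similarity and proximity.

    Returns a boolean adjacency matrix of shape (num_tracks, num_dets) and the
    cosine-similarity matrix that can serve as initial edge features.
    """
    t = np.asarray(track_feats, dtype=float)
    d = np.asarray(det_feats, dtype=float)
    # Normalise rows so that the dot product equals cosine similarity.
    t_unit = t / np.maximum(np.linalg.norm(t, axis=1, keepdims=True), 1e-12)
    d_unit = d / np.maximum(np.linalg.norm(d, axis=1, keepdims=True), 1e-12)
    similarity = t_unit @ d_unit.T

    # Pairwise Euclidean distance between box centres (in pixels).
    tc = np.asarray(track_centers, dtype=float)[:, None, :]
    dc = np.asarray(det_centers, dtype=float)[None, :, :]
    distance = np.linalg.norm(tc - dc, axis=2)

    adjacency = (similarity >= sim_threshold) & (distance <= max_distance)
    return adjacency, similarity
```

Restricting edges to nearby, similar pairs keeps the graph sparse, which limits the number of candidate associations the GNN has to reason about in each frame.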

At this stage, the initial dataset consists of tracklets from the previous frame and new detections from the current frame, represented as:
•
•
where
Once the data is mapped into a graph, nodes represent tracklets and new detections from
To ensure precise object association, the system employs Image Alignment Using Homography [28] to enable the alignment of images across frames by mapping corresponding points between frames, compensating for changes in viewpoint, camera motion, and perspective distortion. By aligning aerial images at
Homography-based image alignment involves estimating a transformation matrix H that maps points from one image to another, ensuring spatial consistency across frames. The loss function for this alignment is crucial for minimizing errors in transformation and ensuring accurate object matching. The objective is to refine H by minimizing the discrepancy between corresponding points in the aligned images.
A common loss function for homography estimation is Reprojection Loss, which measures the error between the projected points (transformed using H) and their actual positions in the target image:
where
where
This loss ensures that texture, lighting, and shading are preserved after alignment. Instead of directly comparing pixel intensities, Structural Similarity Index Measure (SSIM) measures the perceptual difference between images, accounting for contrast and structural similarity:
where
where
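As an illustration of the alignment losses discussed above, the sketch below computes a reprojection loss for a homography H and a single-window SSIM score using their standard textbook definitions. The exact formulations and weights used in this study are not reproduced here; the function names are hypothetical, and SSIM is usually evaluated over local windows rather than over the whole image as done in this simplified version.

```python
import numpy as np


def apply_homography(H, points):
    """Map Nx2 points through a 3x3 homography using homogeneous coordinates."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ np.asarray(H, dtype=float).T
    return mapped[:, :2] / mapped[:, 2:3]


def reprojection_loss(H, src_points, dst_points):
    """Mean squared distance between H-projected source points and their targets."""
    residual = apply_homography(H, src_points) - np.asarray(dst_points, dtype=float)
    return float(np.mean(np.sum(residual ** 2, axis=1)))


def global_ssim(img_a, img_b, data_range=255.0):
    """SSIM computed over the whole image as one window (simplified)."""
    a = np.asarray(img_a, dtype=float)
    b = np.asarray(img_b, dtype=float)
    c1 = (0.01 * data_range) ** 2   # stabilising constants from the SSIM definition
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return float(numerator / denominator)
```

A structural loss is commonly taken as `1 - SSIM`, so that perfectly aligned, identical patches contribute zero.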
Node Feature Aggregation and Classification. Once the graph representation is constructed, it is processed by the Graph Neural Network (GNN) to refine and enhance node features. The GNN applies iterative message passing, where each node updates its representation by aggregating features from its neighbors. This mechanism enables the model to capture both local and global dependencies, effectively learning complex spatial and temporal interactions between objects.
Through three layers of graph convolutions (Fig. 5), the GNN progressively enriches the node embeddings, making them increasingly discriminative for subsequent tasks. This module updates node features using information propagated from neighboring nodes within the graph. By leveraging this neighborhood information, spatial-temporal relationships between objects are effectively modeled. The process of updating node features is computed using the following equation:
where
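The neighbourhood aggregation described above can be sketched with a generic graph-convolution layer. This is an assumed, commonly used form (mean aggregation over neighbours plus a self-loop, a linear map, and ReLU), not necessarily the update rule of this study; the function names and weight shapes are hypothetical.

```python
import numpy as np


def gcn_layer(node_feats, adjacency, weight):
    """One message-passing step: average each node with its neighbours, then transform."""
    a_hat = adjacency + np.eye(adjacency.shape[0])   # add self-loops
    degree = a_hat.sum(axis=1, keepdims=True)        # neighbours + self
    aggregated = (a_hat / degree) @ node_feats       # mean aggregation
    return np.maximum(aggregated @ weight, 0.0)      # linear map + ReLU


def run_gnn(node_feats, adjacency, weights):
    """Apply one gcn_layer per weight matrix (three matrices give three layers)."""
    h = np.asarray(node_feats, dtype=float)
    a = np.asarray(adjacency, dtype=float)
    for w in weights:
        h = gcn_layer(h, a, np.asarray(w, dtype=float))
    return h
```

Stacking three such layers lets each node embedding depend on nodes up to three hops away, which is how repeated local aggregation captures wider context.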
Following this stage, the refined embeddings are passed into a quantum-inspired classification module, where quantum feature mappings project the embeddings into a higher-dimensional Hilbert space. Using 5 qubits, the quantum circuit handles feature classification, leveraging entanglement and superposition to enhance pattern recognition, while the classical GNN handles message passing and node feature aggregation, ensuring effective learning across spatial and temporal dimensions.
The quantum circuit [29] in HQGNN begins by applying a Hadamard gate to each qubit, placing them in a superposition state to enable parallel computation and enhance feature representation. It then applies RX rotations, where each qubit undergoes a controlled rotation around the X-axis, encoding classical information into the quantum state while preserving quantum coherence. Finally, the circuit performs Pauli-Z measurements, extracting meaningful quantum features that contribute to classification or learning tasks. By integrating these quantum operations with classical GNNs, HQGNN enhances computational efficiency and learning performance, offering a powerful approach to complex graph-based problems. The hybrid loss function is defined as:
where
Finally, a softmax activation layer is applied to the classifier outputs, converting the scores into normalized probabilities across target categories. This ensures that the model not only identifies the most likely class but also provides interpretable confidence levels, making the pipeline suitable for robust decision making in complex environments.
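The gate sequence described for the quantum circuit (Hadamard, RX rotation, Pauli-Z measurement) and the final softmax can be mimicked with a small NumPy statevector simulation. This is only a sketch of the gate mechanics under stated assumptions: it does not use PennyLane or Amazon Braket, all function names are hypothetical, and it contains no entangling or trainable gates because the text does not specify them.

```python
import numpy as np

H_GATE = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)


def rx_gate(theta):
    """Matrix of a rotation by angle theta around the X axis."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]], dtype=complex)


def apply_single_qubit_gate(state, gate, qubit, n_qubits):
    """Apply a 2x2 gate to one qubit of an n-qubit statevector."""
    psi = state.reshape([2] * n_qubits)
    psi = np.tensordot(gate, psi, axes=([1], [qubit]))
    psi = np.moveaxis(psi, 0, qubit)   # restore the original qubit ordering
    return psi.reshape(-1)


def expectation_z(state, qubit, n_qubits):
    """Expectation value of Pauli-Z on one qubit: P(0) - P(1)."""
    probs = np.abs(state.reshape([2] * n_qubits)) ** 2
    other_axes = tuple(ax for ax in range(n_qubits) if ax != qubit)
    marginal = probs.sum(axis=other_axes)
    return float(marginal[0] - marginal[1])


def run_circuit(thetas, use_hadamard=True):
    """Prepare |0...0>, optionally apply H, then RX(theta_q); return <Z> per qubit."""
    n = len(thetas)
    state = np.zeros(2 ** n, dtype=complex)
    state[0] = 1.0
    for q in range(n):
        if use_hadamard:
            state = apply_single_qubit_gate(state, H_GATE, q, n)
        state = apply_single_qubit_gate(state, rx_gate(thetas[q]), q, n)
    return [expectation_z(state, q, n) for q in range(n)]


def softmax(scores):
    """Numerically stable softmax over a 1-D score vector."""
    z = np.asarray(scores, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()
```

With only these three steps, each qubit remains in an eigenstate of X after the Hadamard gate, so the RX angle adds just a global phase and every Z expectation is zero. A practical circuit therefore needs further gates (for example entangling gates or rotations about another axis), which are not detailed in the text and which this sketch does not attempt to guess. The `use_hadamard=False` path is included only to show that RX encoding on its own yields an expectation of cos(theta).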
Training and testing were conducted under the system specifications listed in Table 2. As shown in Fig. 6, the DJI Matrice 300 RTK [30] transmitted aerial imagery to the main computer via an RTMP server running MonaServer2 [31]. The hardware consisted of an AMD EPYC 7401P CPU, 128 GB of RAM, and an NVIDIA RTX 3060 GPU. The software implementation used Python v3.10, CUDA v11.7, PyTorch v2, and PennyLane v1.30 with Amazon Braket [32] for quantum computation. Training was conducted with batch size 128, learning rate 0.01, and 512


Figure 6: Experimental architecture
All insulator images were sourced from the Insulator Defect Image Dataset (IDID) [33], captured by UAVs during routine inspections. The dataset includes four defect subclasses: mechanical, electrical, environmental, and deterioration defects, as shown in Table 3.

Performance evaluation used precision, recall, F1 score, mean average precision (mAP), and multi-object tracking (MOT) metrics. Five-fold cross-validation was performed to ensure reliability and validate generalizability.
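Object detection metrics include precision, recall, and F1 score, which in their standard form are:
$$\text{Precision} = \frac{TP}{TP + FP}, \quad \text{Recall} = \frac{TP}{TP + FN}, \quad F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$
where $TP$, $FP$, and $FN$ denote the numbers of true positives, false positives, and false negatives.
Average Precision (AP) summarizes the precision-recall curve. Multi-Object Tracking uses CLEAR MOT metrics [34]:
$$\text{MOTA} = 1 - \frac{\sum_t (FN_t + FP_t + IDSW_t)}{\sum_t GT_t}, \quad \text{MOTP} = \frac{\sum_{t,i} d_{t,i}}{\sum_t c_t}$$
where $GT_t$ is the number of ground-truth objects in frame $t$, $IDSW_t$ is the number of identity switches, $d_{t,i}$ is the localization score of matched pair $i$ in frame $t$, and $c_t$ is the number of matches in frame $t$.
Identity association metrics measure tracking consistency:
$$\text{IDP} = \frac{IDTP}{IDTP + IDFP}, \quad \text{IDR} = \frac{IDTP}{IDTP + IDFN}, \quad \text{IDF1} = \frac{2\,IDTP}{2\,IDTP + IDFP + IDFN}$$
where $IDTP$, $IDFP$, and $IDFN$ are the identity-level true positives, false positives, and false negatives.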
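For reference, the detection and tracking metrics named in this section can be computed from raw counts as sketched below, using their standard definitions. The function names are hypothetical and the counts in any call are illustrative, not results from this study.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def mota(false_negatives, false_positives, id_switches, num_gt):
    """CLEAR MOT accuracy: 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def identity_metrics(idtp, idfp, idfn):
    """Identity precision, identity recall, and IDF1 from identity-level counts."""
    idp = idtp / (idtp + idfp)
    idr = idtp / (idtp + idfn)
    idf1 = 2 * idtp / (2 * idtp + idfp + idfn)
    return idp, idr, idf1
```

Because false positives enter precision but not recall, a detector that fires on cluttered background regions lowers its precision while leaving recall unchanged.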
As shown in Table 4, the YOLOv10-HQGNN model surpasses all compared approaches on almost all evaluation metrics, with notable gains in Recall (0.85) and Average Precision (0.83). The integration of the Hybrid Quantum Graph Neural Network (HQGNN) provides clear improvements over existing models [16,35–37], demonstrating its effectiveness in enhancing feature extraction and classification. This performance advantage is due to the combined strengths of YOLOv10’s detection capabilities and the ability of HQGNN to capture cross-object relationships. Consequently, instances that YOLOv10 alone cannot identify as faulty insulators can still be correctly recognized through the relational linking mechanism of the proposed model.
A notable observation is that the proposed model exhibits higher recall than precision. This discrepancy arises primarily from the complex and cluttered backgrounds in the dataset, which challenge the model’s components. This visual complexity increases the likelihood of false positives (FP), where background elements are mistakenly classified as insulators. An increase in FP directly reduces precision, whereas recall remains unaffected, as it measures only the proportion of actual positives correctly identified.
The proposed YOLOv10-HQGNN model, as illustrated in Fig. 7, exhibits strong training and detection performance, surpassing the other evaluated models in accuracy and robustness on almost all measures. However, as shown in Fig. 7b, due to the integration of quantum-enhanced training on Amazon Braket [32], the YOLOv10-HQGNN model requires more training time than purely classical approaches. This additional overhead reflects a trade-off: longer training durations are exchanged for superior detection performance, improved robustness, and the ability to exploit hybrid quantum-classical processing. Fig. 8 presents the experimental results, highlighting the model’s ability to identify and classify insulator defects within complex environmental conditions. These results demonstrate the robustness and versatility of the YOLOv10-HQGNN model in diverse real-world environments. The model’s ability to perform well in both detection and tracking tasks, even amidst challenging backgrounds, is particularly important for applications such as monitoring power transmission lines, which are often surrounded by dense vegetation and other obstructions.

Figure 7: Model performance comparison

Figure 8: Detection of insulators in complex backgrounds
4.4.2 Multi-Object Tracking Performance
The tracking evaluation in Table 5 demonstrates the effectiveness of the proposed YOLOv10-HQGNN model across multiple MOT metrics. The model achieves strong alignment with ground truth, reflected in a MOTP of 0.81 and a MOTA of 0.82, indicating accurate localization and reduced overall tracking errors. Identity consistency is also maintained, as shown by IDF1 = 0.82, supported by balanced identity precision (IDP = 0.81) and identity recall (IDR = 0.83). Importantly, the system exhibits very few identity switches (IDSW = 0.02) and low fragmentation (Frag = 0.06), suggesting robust track continuity even in challenging UAV inspection sequences. Furthermore, track coverage analysis shows that 74% of ground-truth trajectories are mostly tracked (MT), while only 5% are mostly lost (ML), underscoring the stability of the framework under cluttered and dynamic backgrounds.

Table 6 compares the inference speeds of YOLOv10, the methods in [16,35–37], and the proposed model on an RTX 3060 GPU using FP32 (single precision) and FP16 (half precision) formats. All models were evaluated on the same hardware and with a unified codebase to ensure consistency and eliminate implementation-related discrepancies.

The results demonstrate that the proposed model substantially outperforms methods [16,35–37] in inference speed, thereby offering improved efficiency. Although it is slower than YOLOv10, which has been explicitly optimized for real-time object detection, the proposed framework achieves a more favorable balance between inference throughput and detection precision.
This performance profile underscores an important trade-off for real-world deployment. Although ultrafast detectors such as YOLOv10 are advantageous in latency-critical contexts such as mobile robotics, autonomous navigation, or real-time surveillance, these gains often come at the expense of model expressiveness and detection reliability in complex environments. In contrast, the proposed YOLOv10-HQGNN roughly doubles the FP16 inference time of YOLOv10 (42.9 ms vs. 21.5 ms) but provides gains in detection accuracy and robustness and adds tracking capability. Such characteristics are particularly valuable in safety- and reliability-critical domains, including UAV-based infrastructure inspection and industrial monitoring, where erroneous predictions can translate into substantial operational risks.
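A like-for-like speed comparison of the kind described above depends on timing every model with the same procedure. The sketch below shows one common pattern (warm-up runs followed by a median over repeated calls) using only the Python standard library. It is an illustration under stated assumptions rather than the benchmarking code used in the study: `median_latency_ms`, `fps`, and the dummy workload are hypothetical names, and on a GPU the timed callable would need to block until the result is ready (for example by synchronizing the device) for the measurement to be meaningful.

```python
import statistics
import time


def median_latency_ms(run_once, warmup=10, iters=50):
    """Median wall-clock latency of run_once() in milliseconds."""
    for _ in range(warmup):
        run_once()  # warm-up calls are executed but not timed
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def fps(latency_ms):
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    # Dummy CPU workload standing in for a model forward pass (hypothetical).
    def dummy_forward():
        return sum(i * i for i in range(10_000))

    latency = median_latency_ms(dummy_forward)
    print(f"median latency: {latency:.3f} ms, throughput: {fps(latency):.1f} FPS")
```

Reporting the median rather than the mean reduces the influence of occasional slow iterations caused by scheduling or memory allocation.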
4.4.4 Transmission Performance
Beyond assessing the predictive performance of the proposed model, it is equally important to account for transmission delays occurring along the image and video streaming pipeline, which span both the UAV-controller link and the subsequent controller-computer connection (Fig. 6). In this study, the DJI Matrice 300 RTK [30], operating with the OcuSync Enterprise protocol, served as the UAV platform. This RF-based communication channel exhibits variable latency influenced by flight distance, interference, and signal strength, as summarized in Table 7. In contrast, the controller-computer stage demonstrates greater stability. Transmission through RTMP over LAN Wi-Fi (5 GHz) typically introduces only 50–80 ms of additional delay within a 10 m range, provided sufficient bandwidth and minimal packet loss. This predictable latency ensures that although UAV-controller communication may fluctuate under operational conditions, the subsequent streaming stage remains consistent, supporting reliable real-time data processing.

Taken together, the end-to-end latency of approximately 300 ms (0.3 s) falls within thresholds widely regarded as sufficient for real-time UAV applications such as monitoring, surveillance, and inspection, where situational awareness rather than instantaneous control feedback is the primary operational requirement. Consequently, the proposed YOLOv10-HQGNN framework achieves a pragmatic balance, delivering both the computational speed required for operational feasibility and the accuracy and robustness essential for dependable, domain-specific fault detection.
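The latency budget discussed above can be expressed as a sum of per-stage ranges. The following sketch assumes that the roughly 300 ms figure covers the UAV-controller link, the RTMP streaming stage, and model inference, which the text does not state explicitly. The RTMP range (50–80 ms) and the FP16 inference time (42.9 ms) are taken from the article; the UAV-controller range is a hypothetical placeholder, since the values of Table 7 are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    low_ms: float
    high_ms: float


def total_range(stages):
    """Best-case and worst-case end-to-end latency in milliseconds."""
    return sum(s.low_ms for s in stages), sum(s.high_ms for s in stages)


def within_budget(stages, budget_ms):
    """True if even the worst-case total stays within the budget."""
    return total_range(stages)[1] <= budget_ms


PIPELINE = [
    # Hypothetical placeholder range; the measured values are in Table 7.
    Stage("UAV to controller (OcuSync Enterprise)", 120.0, 170.0),
    # Range reported in the text for RTMP over 5 GHz LAN Wi-Fi within 10 m.
    Stage("controller to computer (RTMP)", 50.0, 80.0),
    # FP16 inference time reported for YOLOv10-HQGNN in the ablation analysis.
    Stage("YOLOv10-HQGNN inference (FP16)", 42.9, 42.9),
]

if __name__ == "__main__":
    low, high = total_range(PIPELINE)
    print(f"end-to-end latency: {low:.1f} to {high:.1f} ms")
    print("within 300 ms budget:", within_budget(PIPELINE, 300.0))
```

Keeping each stage as a range makes it easy to see which link dominates the budget when flight distance or interference pushes the RF latency upward.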
As shown in Table 8, the ablation analysis highlights the incremental contributions of the GNN and quantum integration to the YOLOv10 framework. The baseline YOLOv10 achieves solid detection performance (AP = 0.81, AR = 0.82, F1 = 0.81) with the lowest inference time (21.5 ms in FP16), but it lacks multi-object tracking metrics since it operates as a standalone detector. Incorporating the GNN module enhances both detection and tracking capabilities, yielding higher accuracy (AP = 0.82, AR = 0.83, F1 = 0.82) and enabling stable tracking performance (MOTP = 0.81, MOTA = 0.81, IDF1 = 0.81). Building upon this, the proposed YOLOv10-HQGNN, which integrates a quantum-enhanced GNN, achieves the best overall performance (AP = 0.83, AP50 = 0.87, AP75 = 0.83, AR = 0.85, F1 = 0.84), along with improved tracking metrics (MOTP = 0.81, MOTA = 0.82, IDF1 = 0.82). Although the integration of the quantum module introduces additional computational overhead (115.4 GFLOPs and 60.3/42.9 ms in FP32/FP16), it delivers consistent improvements across both detection and tracking tasks, highlighting a clear trade-off between inference speed and overall robustness.
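The ablation figures quoted above can be checked for internal consistency and turned into an explicit cost-benefit summary. The sketch below assumes that the reported F1 is the harmonic mean of the reported AP and AR, which the text does not state; under that assumption all three variants reproduce their reported F1 after rounding. The dictionary layout and function names are illustrative only.

```python
# Values quoted in the ablation paragraph (AP, AR, F1 per variant).
ABLATION = {
    "YOLOv10": {"ap": 0.81, "ar": 0.82, "f1": 0.81},
    "YOLOv10-GNN": {"ap": 0.82, "ar": 0.83, "f1": 0.82},
    "YOLOv10-HQGNN": {"ap": 0.83, "ar": 0.85, "f1": 0.84},
}

# FP16 inference times in milliseconds, as quoted in the same paragraph.
FP16_MS = {"YOLOv10": 21.5, "YOLOv10-HQGNN": 42.9}


def harmonic_mean(p, r):
    """Harmonic mean of two scores, the usual F1 form."""
    return 2.0 * p * r / (p + r)


def f1_consistent(row):
    """Check whether the reported F1 matches harmonic_mean(AP, AR) to 2 decimals."""
    return round(harmonic_mean(row["ap"], row["ar"]), 2) == row["f1"]


def slowdown(baseline_ms, model_ms):
    """Latency ratio of a model relative to a baseline."""
    return model_ms / baseline_ms


if __name__ == "__main__":
    for name, row in ABLATION.items():
        print(name, "F1 consistent:", f1_consistent(row))
    ratio = slowdown(FP16_MS["YOLOv10"], FP16_MS["YOLOv10-HQGNN"])
    ap_gain = ABLATION["YOLOv10-HQGNN"]["ap"] - ABLATION["YOLOv10"]["ap"]
    print(f"FP16 slowdown: {ratio:.2f}x for an AP gain of {ap_gain:.2f}")
```

Read this way, the hybrid model trades roughly a twofold increase in FP16 latency for a 0.02 gain in AP plus the tracking capability that the standalone detector lacks.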

Fig. 9 compares the training behavior and convergence trends of the three evaluated models, YOLOv10, YOLOv10-GNN, and YOLOv10-HQGNN, over 500 epochs. Fig. 9a presents the progression of the training time, while Fig. 9b illustrates the corresponding loss convergence curves. The YOLOv10-HQGNN model exhibits the longest training time due to the additional computational complexity introduced by the quantum-enhanced GNN layer, followed by YOLOv10-GNN and the baseline YOLOv10. Despite the increased computational overhead, all models demonstrate stable and smooth convergence, with loss values steadily decreasing as training progresses. In particular, YOLOv10-HQGNN achieves a slightly faster loss reduction during the early epochs, which may reflect improved optimization efficiency and representation learning contributed by the quantum-assisted embedding. These results suggest that, while the hybrid quantum model incurs a higher computational cost, it offers somewhat better convergence dynamics than its classical counterparts.

Figure 9: Ablation performance
Table 9 presents the precision of the three ablation models, YOLOv10, YOLOv10-GNN, and YOLOv10-HQGNN, across four defect types and three environmental conditions. In general, precision improves consistently as additional modules are integrated. The baseline YOLOv10 achieves precision values ranging from 0.77 to 0.82 across all environments, demonstrating reliable detection capability under varying conditions. Incorporating the GNN module further improves precision by approximately 0.02 to 0.03, reflecting better contextual reasoning among the insulator components. The quantum-enhanced YOLOv10-HQGNN achieves the highest precision, reaching up to 0.85 in both desert and urban environments. These results indicate that quantum integration strengthens feature representation and decision confidence, particularly in complex backgrounds such as urban areas, while maintaining robust performance across all defect categories. Overall, while YOLOv10 ensures efficiency, the inclusion of the GNN and quantum modules enhances contextual reasoning and tracking reliability, making the proposed model particularly suitable for real-time industrial inspection tasks where both accuracy and stability are critical.

The integration of YOLOv10, GNNs, and QML marks a transformative step in faulty insulator detection. This hybrid framework leverages each technology’s strengths: YOLOv10 for real-time detection, GNNs for spatial-contextual relationships, and QML for improved learning efficiency.
Challenges remain, including high computational demands, the need for extensive training data, and limited GNN interpretability. Variability in environmental conditions underscores the importance of domain adaptation. UAV-based inspection also poses regulatory and ethical challenges, requiring compliance with aviation authorities and data protection regulations. Anonymization techniques such as blurring background structures and encrypting feeds should be incorporated to ensure adherence to privacy mandates.
Scalability demands efficient data partitioning, distributed computing, and model compression strategies. GNN architectures such as GraphSAGE, GAT, or LGNN [38] are particularly suited to distributed computation. Future research should prioritize optimized GNN architectures, hardware acceleration, and distributed edge learning frameworks. Advanced augmentation and transfer learning [39] can strengthen robustness, explainable AI (XAI) methods [40] can improve transparency and trust, and federated learning can enable collaborative model improvement without transmitting sensitive data.
The integration of YOLOv10, GNNs, and QML offers a transformative framework for faulty insulator detection. YOLOv10 ensures accurate real-time detection, GNNs provide contextual reasoning, and QML improves learning efficiency at the cost of additional training and inference time. Together, these technologies enable automated inspections that reduce operational downtime.
Challenges remain regarding computational demands, generalization, and interpretability. Future research should focus on scalable architectures, hardware acceleration, domain adaptation, and explainable AI integration. This hybrid YOLOv10-HQGNN framework establishes a strong foundation for next-generation UAV-assisted power grid inspection, advancing safer, smarter energy infrastructure.
Acknowledgement: This work is jointly supported by Ho Chi Minh City Open University, Vietnam and Suan Sunandha Rajabhat University, Thailand.
Funding Statement: Not applicable.
Author Contributions: The authors confirm their contributions to this work as follows: Conceptualization, methodology, data collection, analysis, and original draft preparation: Nghia Dinh. Critical review and editing by Viet-Tuan Le, Kiet Tran-Trung, Bay Nguyen Van, Vinh Truong Hoang and Kittikhun Meethongjan. Additional support for data processing and validation: Ha Duong Thi Hong, Hau Nguyen Trung and Thien Ho Huong. All authors reviewed the results and approved the final version of the manuscript.
Availability of Data and Materials: The data that support the findings of this study are available from the corresponding authors, upon reasonable request.
Ethics Approval: Not applicable.
Conflicts of Interest: The authors declare no conflicts of interest to report regarding the present study.
References
1. Ahmed F, Mohanta J, Keshari A, Yadav PS. Recent advances in unmanned aerial vehicles: a review. Arab J Sci Eng. 2022;47(7):7963–84. doi:10.1007/s13369-022-06738-0.
2. Schetakis N, Aghamalyan D, Griffin P, Boguslavsky M. Review of some existing QML frameworks and novel hybrid classical-quantum neural networks realizing binary classification for noisy datasets. Sci Rep. 2022;12(1):11927. doi:10.1038/s41598-022-14876-6.
3. Wang A, Chen H, Liu L, Chen K, Lin Z, Han J, et al. YOLOv10: real-time end-to-end object detection. arXiv:2405.14458. 2024.
4. Scarselli F, Gori M, Tsoi AC, Hagenbuchner M, Monfardini G. The graph neural network model. IEEE Trans Neural Netw. 2009;20(1):61–80. doi:10.1109/tnn.2008.2005605.
5. Belagoune S, Bali N, Bakdi A, Baadji B, Atif K. Deep learning through LSTM classification and regression for transmission line fault detection, diagnosis, and location in large-scale multi-machine power systems. Measurement. 2021;177(3):109330. doi:10.1016/j.measurement.2021.109330.
6. Zhang X, Zhang Y, Liu J, Zhang C, Xue X, Zhang H, et al. InsuDet: a fault detection method for insulators of overhead transmission lines using convolutional neural networks. IEEE Trans Instrum Meas. 2021;70:5018512–12. doi:10.1109/tim.2021.3120796.
7. Yeh CT, Thanh PN, Cho MY. Real-time leakage current classification of 15 kV and 25 kV distribution insulators based on bidirectional long short-term memory networks with deep learning. IEEE Access. 2022;10(3):7128–40. doi:10.1109/access.2022.3140479.
8. Hao Y, Liang W, Wang X, Zhang W, Huang L, Yang L, et al. Automatic calculation of graphic area change rate for icing overhead power line insulators based on bounding box automatic matching and GrabCut contour automatic segmentation. IEEE Trans Power Deliv. 2023;38(4):2821–30. doi:10.1109/tpwrd.2023.3262807.
9. Wong SY, Choe CWC, Goh HH, Low YW, Cheah DYS, Pang C. Power transmission line fault detection and diagnosis based on artificial intelligence approach and its development in UAV: a review. Arab J Sci Eng. 2021;46(10):9305–31. doi:10.1007/s13369-021-05522-w.
10. Wang Y, Gao Q, Li D, Liu J, Wang H, Yu X, et al. Insulator anomaly detection method based on few-shot learning. IEEE Access. 2021;9:194870–980. doi:10.1109/access.2021.3071305.
11. Wang Y, Bochkovskiy A, Liao Y. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv:2207.02696. 2022.
12. Girshick R, Donahue J, Darrell T, Malik J. Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2014 Jun 23–28; Columbus, OH, USA. p. 580–7.
13. Girshick R. Fast R-CNN. In: Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV); 2015 Dec 7–13; Santiago, Chile. p. 1440–8.
14. Ren S, He K, Girshick R. Faster R-CNN: towards real-time object detection with region proposal networks. arXiv:1506.01497. 2015.
15. Lu W, Zhou Z, Ruan X, Yan Z, Cui G. Insulator detection method based on improved faster R-CNN with aerial images. In: 2021 2nd International Symposium on Computer Engineering and Intelligent Communications (ISCEIC); 2021 Aug 6–8; Nanjing, China. p. 417–20.
16. Wu C, Ma X, Kong X, Zhu H. Research on insulator defect detection algorithm of transmission line based on CenterNet. PLoS One. 2021;16(7):e0255135. doi:10.1371/journal.pone.0255135.
17. Nuanmeesri S. Enhanced hybrid attention deep learning for avocado ripeness classification on resource constrained devices. Sci Rep. 2025;15(1):3719. doi:10.1038/s41598-025-87173-7.
18. Feng Z, Guo L, Huang D, Li R. Electrical insulator defects detection method based on YOLOv5. In: 2021 IEEE 10th Data Driven Control and Learning Systems Conference (DDCLS); 2021 May 14–16; Suzhou, China. p. 979–84.
19. Liu C, Wu Y, Liu J, Han J. MTI-YOLO: a light-weight and real-time deep neural network for insulator detection in complex aerial images. Energies. 2021;14(5):1426. doi:10.3390/en14051426.
20. Morris C, Ritzert M, Fey M, Hamilton WL, Lenssen JE, Rattan G, et al. Weisfeiler and Leman go neural: higher-order graph neural networks. arXiv:1810.02244. 2019.
21. Kipf TN, Welling M. Semi-supervised classification with graph convolutional networks. arXiv:1609.02907. 2017.
22. Velickovic P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y. Graph attention networks. arXiv:1710.10903. 2018.
23. Weng X, Yuan Y, Kitani K. PTP: parallelized tracking and prediction with graph neural networks and diversity sampling. IEEE Robot Autom Lett. 2021;6(3):4640–7. doi:10.1109/lra.2021.3068925.
24. MS COCO Dataset. MS COCO Dataset [Online]. [cited 2025 Aug 23]. Available from: https://paperswithcode.com/dataset/coco.
25. Tong Y, Luo X, Ma L, Xie S, Yang W, Guo Y. Saliency information and mosaic-based data augmentation method for densely occluded object recognition. Pattern Anal Applic. 2024;27(2):34. doi:10.1007/s10044-024-01258-z.
26. LabelImg. LabelImg. [cited 2025 Aug 23]. Available from: https://github.com/tzutalin/labelImg.
27. He K, Gkioxari G, Dollár P, Girshick R. Mask R-CNN. In: Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV); 2017 Oct 22–29; Venice, Italy. p. 2961–9.
28. Nguyen T, Chen SW, Shivakumar SS, Taylor CJ, Kumar V. Unsupervised deep homography: a fast and robust homography estimation model. arXiv:1709.03966. 2017.
29. Karuppasamy K, Puram V, Johnson S, Thomas JP. A comprehensive review of quantum circuit optimization: current trends and future directions. Quantum Rep. 2025;7(1):2. doi:10.3390/quantum7010002.
30. Knisely T. Top 9 features of the Matrice 300 RTK [Internet]. 2020 [cited 2025 Feb 13]. Available from: https://enterprise-insights.dji.com/blog/matrice-300-top-9-features.
31. MonaSolutions. MonaServer2. Open-source lightweight web and media server supporting RTMP(E), RTMFP, SRT, WebSocket, WebRTC, HLS, and more. 2025 [cited 2025 Feb 13]. Available from: https://github.com/MonaSolutions/MonaServer2.
32. Amazon Web Services. Amazon Braket [Internet]. 2025 [cited 2025 Mar 24]. Available from: https://aws.amazon.com/braket/.
33. Lewis D, Kulkarni P. Insulator defect detection. IEEE Dataport. 2021. doi:10.21227/vkdw-x769.
34. Bernardin K, Stiefelhagen R. Evaluating multiple object tracking performance: the CLEAR MOT metrics. J Image Video Process. 2008;2008:246309.
35. Carion N, Massa F, Synnaeve G, Usunier N, Kirillov A, Zagoruyko S. End-to-end object detection with transformers (DETR). arXiv:2005.12872. 2020.
36. Liu Z, Lin Y, Cao Y, Hu H, Wei Y, Zhang Z, et al. Swin Transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV); 2021 Oct 10–17; Montreal, QC, Canada. p. 9992–10002.
37. Zhang X, Wang Z, Xu H, Zhao L, Wei Y, Zhao J, et al. HSPAN-GNN-based fault detection for power transmission lines. EURASIP J Adv Signal Process. 2025;2025:43. doi:10.1186/s13634-025-01251-6.
38. Khemani B, Patil S, Kotecha K, Tanwar S. A review of graph neural networks: concepts, architectures, techniques, challenges, datasets, applications, and future directions. J Big Data. 2024;11(1):18. doi:10.1186/s40537-023-00876-4.
39. Iman M, Arabnia HR, Rasheed K. A review of deep transfer learning and recent advancements. Technologies. 2023;11(2):40. doi:10.3390/technologies11020040.
40. Yang W, Wei Y, Wei H, Chen Y, Huang G, Li X, et al. Survey on explainable AI: from approaches, limitations, and applications aspects. Hum-Cent Intell Syst. 2023;3(3):161–88. doi:10.1007/s44230-023-00038-y.
Copyright © 2026 The Author(s). Published by Tech Science Press. This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

