Open Access
ARTICLE
TopoEKF: From State-Space Estimation to Topological Signatures for Enhanced Multi-Object Tracking and Anomaly Detection in UAVs
1 Department of Computer Engineering, Necmettin Erbakan University, Konya, Türkiye
2 Department of Mathematics and Computer Science, Necmettin Erbakan University, Konya, Türkiye
* Corresponding Author: Alperen Eroğlu. Email:
(This article belongs to the Special Issue: Innovative Applications of Fractional Modeling and AI for Real-World Problems)
Computer Modeling in Engineering & Sciences 2026, 147(3), 31 https://doi.org/10.32604/cmes.2026.081411
Received 02 March 2026; Accepted 27 May 2026; Issue published 30 June 2026
Abstract
Reliable multi-object detection and tracking play a critical role in Unmanned Aerial Vehicle (UAV)-based aerial surveillance applications operating under challenging real-world conditions. This study presents a mathematically grounded, model-driven tracking framework named TopoEKF, which integrates an enhanced Adaptive Extended Kalman Filter (EKF) with Topological Data Analysis to improve both tracking robustness and anomaly detection performance. Unlike prior approaches that primarily focus on refining object detection architectures, this work emphasizes the predictive power of iterative Bayesian filtering, optimal state estimation, and adaptive error minimization within a unified mathematical framework. The proposed system employs a carefully optimized YOLOv12 detector to provide accurate object location priors, followed by a formally defined discrete-time linear Gaussian tracking model. The Adaptive EKF handles, through local linearization, the nonlinearities arising from the projection of three-dimensional object motion onto the two-dimensional image plane. To further enhance robustness under low resolution, large object-to-image distances, frequent occlusions, and environmental noise, TopoEKF introduces adaptive noise covariance modeling driven by measurement confidence, occlusion status, and topological feedback. Persistent homology is applied to EKF-filtered trajectories to extract topological signatures that characterize the global structure of object motion. These features are transformed into fixed-dimensional representations and processed by an unsupervised Isolation Forest classifier for trajectory-level anomaly detection. Experimental evaluations are conducted on a challenging hybrid dataset combining scenarios from the COCO, VisDrone, UAVDT, Road_Anomaly_Dataset, and DoTA benchmarks. Quantitative results demonstrate that TopoEKF improves Multi-Object Tracking Accuracy from
Keywords
1 Introduction
Unmanned Aerial Vehicles (UAVs) have fundamentally changed airborne surveillance and monitoring capabilities in a wide range of applications such as traffic control, precision agriculture, infrastructure inspection, search and rescue, and military reconnaissance. Multi-Object Tracking (MOT) from an airborne platform presents a series of significantly more severe challenges than those encountered with ground-based and fixed camera systems [1,2]. Platform instability and vibrations, which introduce unpredictable noise into measurements, are among the primary challenges UAV surveillance systems must overcome. Moreover, dynamic imaging geometry caused by rapid changes in altitude, pitch, and yaw leads to significant scale shifts and perspective distortion. Other problems, such as low resolution at high operating altitudes, the presence of small objects, frequent and prolonged occlusions from structures or terrain, environmental noise, and adverse weather conditions, directly affect and sometimes even degrade image quality and detection reliability.
End-to-end deep learning (DL) architectures are widely used as contemporary approaches for multiple object detection and tracking [3]. While these methods have achieved remarkable accuracy on standard benchmark datasets, their use in real-world, resource-constrained UAV environments is often limited. These limitations stem from high computational demands that are incompatible with the typical power and thermal envelopes of embedded systems; limited interpretability, which is unacceptable for safety-critical applications; difficulty in parameter tuning, which requires extensive retraining for even small changes in operational scenarios; and vulnerability to changes in environmental conditions after deployment.
When dynamic and complex data are collected by unmanned aerial vehicles in multi-object tracking applications, state estimation has traditionally relied on filtering algorithms such as the Extended Kalman Filter (EKF) [4]. EKF is a widely used, computationally efficient method for estimating the state vector in nonlinear systems [5]. Although EKF is effective for simple motion models, it faces limitations under complex motion patterns, rapid viewpoint changes, and the strict constraints of UAV scenarios. As a basic nonlinear filter, EKF can suffer from drift and estimation errors due to subjectively adjusted damping factors. Furthermore, traditional methods such as EKF often focus on individual pixels or local state parameters. This makes EKF-based approaches inadequate for higher-level challenges such as identity switches in multiple object tracking or complex anomaly detection based on the overall structural integrity of trajectories. These challenges cause subtle and critical topological gaps in object motion models, which traditional filtering approaches may overlook [6,7]. Therefore, despite the robustness of classical EKF in state estimation, it lacks the ability to capture topological signatures for trajectory-based anomaly detection and complex multi-object management [8].
Topological Data Analysis (TDA) and its core tool, Persistent Homology (PH), have emerged as powerful methodologies to overcome these shortcomings. TDA enables analysis of the internal shape and structure of complex, high-dimensional datasets [9]. It is a rapidly growing field that uses topology and geometry to extract robust, qualitative, and sometimes quantitative information about the structure of data [10]. PH is gaining increasing attention for detecting structural anomalies in synthetic and real-world datasets because it captures topological properties, such as loops and holes, at multiple scales of the dataset [11]. Applying TDA to time series analysis and anomaly detection is a current research trend. TDA provides global information complementary to the information captured by traditional approaches such as spectral detectors. This increases the reliability and explainability of decision-making processes, especially in critical applications such as intelligent transportation systems, cybersecurity, and biomedical fields [11]. The ability of persistent homology to reveal complex patterns and relationships that traditional methods often miss makes it an important resource in modern data science.
This research is motivated by the following fundamental research questions:
• How can we leverage the mathematical robustness and efficiency of classical state estimation techniques to achieve deep learning-comparable performance while simultaneously ensuring superior interpretability and efficiency on embedded UAV hardware?
• How can persistent homology-based anomaly features extracted from state estimation residuals be integrated into an extended Kalman filter framework to enhance multi-object tracking performance under sensor uncertainty?
• Does the persistence image of EKF residuals provide a discriminative topological signature for real-time anomaly detection in dynamic UAV environments?
• What mathematical relationship exists between EKF residuals and their topological invariants?
• How can EKF be strengthened using topological data analysis, and how can the results of learning algorithms be improved using the results of topological data analysis?
Thus, our work deliberately shifts the focus from purely data-driven model refinement to the rigorous application of probabilistic state estimation and mathematical modeling. To address these research questions, we propose a novel framework named TopoEKF (from State-Space to Homology-based Adaptive Persistent Estimator), which integrates topological data analysis into the Extended Kalman Filter to enhance multi-object tracking and anomaly detection in UAV systems.
The proposed framework integrates persistent homology into the Extended Kalman Filter pipeline. This enables the extraction of topological invariants from residual dynamics to identify and mitigate anomalies in real-time UAV tracking scenarios. This approach bridges the gap between statistical state-space modeling and geometric–topological data characterization, offering a mathematically grounded mechanism for anomaly-aware multi-object tracking. The main contributions of this work are as follows:
• We propose a new multi-object tracking pipeline that brings together YOLO, EKF, and TDA. The pipeline comprises, in order, trajectory tracking, persistence diagram computation, vectorization via persistence images, and anomaly detection.
• We propose an improved EKF framework called TopoEKF that integrates topological awareness through persistent homology–based feedback. Unlike conventional EKFs with static noise covariances, TopoEKF dynamically adjusts its process and measurement uncertainties according to measurement confidence, occlusion status, and the evolving topological structure of the trajectory. Unlike existing TDA-based approaches that primarily utilize persistent homology as a post-hoc analysis or feature extraction tool, the proposed TopoEKF framework integrates topological information directly into the state estimation loop. In particular, the extracted topological descriptors are not only used for anomaly detection but also actively regulate the EKF covariance matrices through a closed-loop feedback mechanism. This design enables capabilities that are not achievable with conventional EKF or standalone TDA-based methods. Specifically, without TopoEKF, the tracking system cannot adapt its uncertainty model based on trajectory-level geometric complexity, anomaly detection remains decoupled from the tracking process, and the filter becomes prone to drift and identity switches under complex motion and occlusion scenarios. The key novelty of TopoEKF lies not in the use of TDA itself, but in how topological information is embedded into the state estimation process as an active control signal rather than a passive descriptor.
• We develop an enhanced EKF formulation that incorporates an adaptive noise covariance mechanism, namely Adaptive EKF, specifically engineered to maximize robustness against the dynamic noise and occlusion inherent in UAV multi-object tracking scenarios.
• We explicitly distinguish three hierarchical models within our framework: the Standard EKF with fixed noise parameters; the Adaptive EKF with confidence- and occlusion-driven covariance adaptation (Tier 1–2); and the proposed TopoEKF, which further incorporates topological feedback via persistent homology (Tier 3). Unlike the Adaptive EKF, TopoEKF captures global trajectory structure and enables topology-aware covariance updates, leading to improved identity consistency, robustness under occlusion, and anomaly-aware tracking. This formulation clearly isolates the unique contribution of TDA, demonstrating capabilities that cannot be achieved without topological integration.
• We establish an optimal hybrid architecture through the successful integration of the high-performance, real-time object detector YOLOv12 with classical filtering theory, achieving an optimal trade-off between perception accuracy and computational efficiency.
• We conduct a comprehensive and realistic evaluation, benchmarking our enhanced EKF against the dominant DeepSORT baseline and a TDA-based ByteTrack algorithm using multiple industry-standard performance metrics on a complex hybrid dataset encompassing 35,700 frames of real-world UAV footage.
• We validate practical embedded deployment by demonstrating true real-time operational capability
The paper’s structure is as follows. Section 2 reviews the related literature, covering existing multi-object tracking paradigms, Kalman filtering approaches, and recent efforts that incorporate topological data analysis and anomaly detection. Section 3 presents the mathematical framework of the proposed system, beginning with the discrete-time state-space formulation and continuing with the enhanced Extended Kalman Filter, adaptive noise covariance modeling, data association strategy, and the formal definition of the proposed TopoEKF framework, including its state-to-topology and topology-to-covariance mappings. Section 4 describes the end-to-end system design and implementation, detailing the detection stage based on YOLOv12, the enhanced EKF tracking workflow, the hybrid integration strategy, and the topology-aware anomaly detection pipeline. The experimental setup is introduced in Section 5, where the datasets, evaluation metrics, hardware configuration, baseline methods, and the detailed TDA-based anomaly detection configuration are described. Section 6 reports and analyzes the experimental results, including quantitative performance comparisons, robustness under occlusion and noise, trajectory-level anomaly detection outcomes, and computational efficiency considerations. Section 7 discusses the broader implications of the proposed approach, highlighting the advantages of mathematically grounded state estimation, improvements in trajectory fidelity, and current limitations along with directions for future research. Finally, Section 8 concludes the paper by summarizing the main findings and contributions. NoteBookLM is used to analyze studies in the literature review. The authors have carefully reviewed and revised the output and accept full responsibility for all content.
This section presents the state of the art in multi-object tracking techniques, the development and use of the extended Kalman filter, the combination of Kalman filtering with TDA, and TDA-based anomaly detection. Our work addresses the limitations of existing state estimation and anomaly detection approaches in the literature by presenting a hybrid approach that combines YOLOv12 detection, improved EKF-based tracking with adaptive noise covariance, and a novel topological data analysis pipeline comprising trajectory extraction, persistence diagrams, vectorization, and anomaly detection, in that order. Compared to previous TDA-based works, this integrated architecture provides significantly higher robustness to dynamic noise and occlusion, which are frequently encountered in UAV multi-object tracking scenarios. Furthermore, we comprehensively and realistically evaluate our improved EKF system, which balances detection accuracy and computational efficiency, on a hybrid dataset of complex real-world UAV imagery against the industry-standard DeepSORT baseline. To demonstrate real-time operational capability, we validate our solution on an embedded platform, the NVIDIA Jetson AGX Xavier. This implementation yields a quantifiable reduction in the computational complexity of tracking and demonstrates a critical advantage for field deployability.
Table 1 denotes support and

2.1 Multi-Object Tracking Paradigms
Multiple Object Tracking algorithms are classically divided into two main categories: tracking-by-detection and joint detection-tracking. The traditional tracking-by-detection paradigm, which forms the basis of our approach, separates the data stream into a detection phase followed by a tracking phase that performs data association and state estimation. Early methods used techniques such as the Hungarian assignment algorithm for bipartite matching, the Joint Probabilistic Data Association Filter (JPDAF) [12,13] to incorporate detection uncertainty, or Multiple Hypothesis Tracking (MHT) for deferred decision-making. Joint detection-tracking methods, on the other hand, combine the detection and tracking phases. Studies have shown that tracking-by-detection methods, including those based on the extended Kalman filter, provide more robust results, especially in extreme and complex tracking scenarios.
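The association step that follows detection can be illustrated with a small sketch. It is not the authors' implementation: the box format `(x1, y1, x2, y2)`, the `1 - IoU` cost, the `min_iou` gate, and the function names are all assumptions made for illustration. The Hungarian algorithm is replaced here by an exhaustive search over permutations, which returns the same minimum-cost assignment but is only practical for a handful of objects.

```python
from itertools import permutations


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, min_iou=0.3):
    """Return (track_idx, det_idx) pairs minimising the total (1 - IoU) cost.

    Exhaustive stand-in for the Hungarian algorithm; min_iou is a
    hypothetical gate that discards weak matches.
    """
    if not tracks or not detections:
        return []
    cost = [[1.0 - iou(t, d) for d in detections] for t in tracks]
    n_t, n_d = len(tracks), len(detections)
    best, best_cost = [], float("inf")
    if n_t <= n_d:
        # Assign every track to a distinct detection.
        for perm in permutations(range(n_d), n_t):
            total = sum(cost[i][j] for i, j in enumerate(perm))
            if total < best_cost:
                best_cost, best = total, list(enumerate(perm))
    else:
        # Assign every detection to a distinct track.
        for perm in permutations(range(n_t), n_d):
            total = sum(cost[i][j] for j, i in enumerate(perm))
            if total < best_cost:
                best_cost, best = total, [(i, j) for j, i in enumerate(perm)]
    return [(i, j) for i, j in best if 1.0 - cost[i][j] >= min_iou]


tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(21, 21, 31, 31), (1, 1, 11, 11), (100, 100, 110, 110)]
matches = associate(tracks, detections)
print(sorted(matches))
```

In this example the far-away third detection stays unmatched and would typically seed a new track. A production system would use a polynomial-time solver instead of the permutation search.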
The recent surge in deep learning has popularized methods like Simple Online and Realtime Tracking (SORT) [14] and DeepSORT [14], which leverage powerful appearance features extracted via Deep Neural Networks (DNNs) to improve data association. ByteTrack [15], one of the most recent approaches, has further refined the association logic. While delivering state-of-the-art accuracy, these deep-learning-centric methods are resource-intensive, making them suboptimal for edge computing on resource-constrained UAVs.
Real-time tracking on unmanned aerial vehicles faces a significant technological bottleneck due to limited battery capacity and computational resources. While traditional Discriminative Correlation Filter (DCF)-based methods offer high throughput, they lack the robustness offered by DL-based trackers in complex scenarios [16]. However, the high computational costs of current DL-based trackers make their direct use on resource-constrained UAV platforms difficult [17]. To overcome this bottleneck, new energy-efficient paradigms and model compression techniques, such as Spiking Neural Networks (SNN) for RGB videos, have begun to be proposed [18]. These studies aim to meet the low power consumption requirements of the UAV while maintaining tracking accuracy.
In recent years, adaptive structures specifically designed for UAV tracking have been developed to improve the efficiency of Visual Transformers (ViTs). For example, Aba-ViTrack significantly reduces extraction time by detecting background tokens, eliminating those that are unnecessary, and dynamically stopping token calculation based on the input [16]. Similarly, the AVTrack framework [19] offers an adaptive paradigm that reduces computational load by selectively activating transformer blocks via an Activation Module (AM). AVTrack also increases tracking stability by developing representations resistant to appearance changes through mutual information maximization. Developed to further enhance efficiency, AVTrack-MD can create more compact tracking models without performance loss using multi-teacher knowledge distillation. These adaptive visual transformer approaches achieve real-time speed by optimizing computational resources according to the dynamic tracking needs of the UAV. Among other modern paradigms seeking solutions to computational bottlenecks, asynchronous feature extraction and layer pruning techniques stand out. In this context, LiteTrack [20] targets lightweight and efficient visual tracking on resource-constrained platforms: it achieves high throughput on edge devices by lightening the network architecture through layer pruning and by reducing latency through asynchronous feature extraction. Furthermore, to overcome challenges such as occlusion, ORTrack [17] learns robust yet lightweight representations using spatial Cox processes and information distillation methods. All these lightweight tracking paradigms enable resource-constrained UAVs to perform both precise and high-speed tracking in challenging real-world conditions.
2.2 Kalman Filtering in Object Tracking
The Kalman Filter (KF), introduced in 1960 [21], provides a recursive, optimal solution for linear systems when both the process and measurement noise are Gaussian. The Extended Kalman Filter is an extension that addresses nonlinear system dynamics by using a first-order Taylor series expansion around the current state estimate to locally linearize the system. Although KF and its derivatives are long-established methods, they remain important because they are mathematically well-founded, dependable, and highly computationally efficient.
The widespread use of KF in modern tracking systems is illustrated by pioneering work such as SORT [14], which showed how well Kalman filtering and the Hungarian algorithm for data association work together, and its successor, DeepSORT [14], which adds deep learning while maintaining the computational efficiency of KF-based motion estimation. Recent approaches like ByteTrack [15] and Bot-SORT [22] continue to use Kalman filtering as the primary motion model. Bot-SORT also uses camera motion compensation to account for the dynamics of the UAV platform. The latest StrongSORT [23] builds on DeepSORT by improving both appearance and motion modeling. This demonstrates that, when well-designed, traditional Kalman-based methods can compete with end-to-end deep learning methods.
Previous studies have examined EKF and its variants within the context of UAVs [24,25], yet these investigations typically concentrate on single-object scenarios or fail to deliver a comprehensive, real-time comparison against modern deep learning baselines on challenging datasets such as VisDrone or UAVDT. While lightweight tracking methods such as kernelized correlation filters (KCF) [26] hold promise for resource-constrained UAV platforms, they lack the robustness required for intensive multi-object scenarios. Our study addresses this gap by demonstrating that a carefully designed EKF system can provide a strong balance of performance and efficiency for practical MOT on UAV platforms.
Recent developments have significantly improved EKF-based tracking with several key innovations. Reference [27] demonstrates indoor UAV localization without Global Positioning System (GPS) by integrating AprilTag and inertial measurement units (IMU) data with EKF for sensor fusion applications. In autonomous vehicle tracking, reference [28] presents an adaptable measurement noise model (EKF-RF) that considers the distance-dependent error characteristics of Lidar and radar sensors. For robust tracking under inconsistent target motion models, reference [29] proposes the Schmidt-EKF with robot-centered target representation for visual-inertial SLAMMOT (simultaneous localization, mapping, and moving object tracking) systems. Reference [30] develops a differentiable EKF framework that integrates movement models learned by a neural network for object tracking based on tactile sensors.
Several studies have addressed the limitations of traditional KF in handling nonlinearity and measurement uncertainties. The theoretical foundation for handling nonlinear systems is established by the Unscented Kalman Filter (UKF) [31], which propagates uncertainty through nonlinear transformations without requiring Jacobian calculations. Building on this, reference [32] proposes an Adaptive Factored UKF (UKF-AF) that incorporates adaptive factors to adjust observation noise under outlier and occlusion conditions, achieving a 4.75% FPS improvement and a 2.30% accuracy gain on the MOT16 dataset as an enhancement to DeepSORT. Adaptive noise covariance estimation, pioneered in GPS/INS systems [33], has been successfully applied to visual tracking by [34], who developed an Adaptive KF based on Autocovariance Least Squares (ALS) estimation to address performance degradation caused by incorrect noise statistics. Reference [35] introduces ConfTrack, which employs confidence score-weighted updates with Noise Scale Adaptive KF (NSAK) to penalize low-confidence detection boxes, achieving state-of-the-art HOTA and IDF1 metrics on the MOT20 dataset. Reference [36] demonstrates practical real-time single-target tracking by combining Cam-Shift with an improved KF for adaptive tracking window adjustment.
These advancements demonstrate that EKF, when augmented with variants such as Unscented Kalman Filter [31,32,37] and enhanced through sensor fusion [27,28] and adaptive noise covariance modeling [32–34], can achieve performance comparable to or exceeding deep learning-based systems [22,23,32] while maintaining computational efficiency. This provides strong justification for our solution that a meticulously designed EKF system can offer a superior performance-efficiency trade-off for practical MOT on UAV platforms.
The PlayNet study [38], presented in the field of sports analytics, is an innovative approach for real-time game classification, integrating EKF-based tracking with topological data analysis methods. This system offers a robust mathematical modeling approach based on Bayesian estimation principles, modeling agent movement patterns as a state vector and estimating position/velocity in multi-object tracking scenarios. PlayNet utilizes fuzzy topological data structure analysis techniques to transform game state representations into low-dimensional Kalman embeddings, which help capture structural signatures closely related to persistent homology. This low-latency architecture (under 55 ms) demonstrates real-time operational capability on embedded systems, and this efficiency, combined with high-performance sensing architectures such as YOLO, lays a significant foundation for demanding applications such as UAV tracking. The study, through the topological processing of complex motion data and the reliable state estimation capabilities of EKF, has laid the foundation for a model with the potential to detect structural anomalies beyond traditional tracking systems.
2.4 Anomaly Detection with TDA
Anomaly detection with TDA, unlike traditional methods, focuses on discovering global structural anomalies and inherent topological signatures in complex datasets. Topological Data Analysis and its core tool, persistent homology, provide a robust mathematical modeling framework that models the temporal structure of data like motion trajectories as dynamic graphs or delayed embeddings, particularly in multivariate time series and dynamic graph scenarios. This modeling is used to extract topological features such as connected components and loops/cycles and vectorize these features via persistence diagrams to generate unsupervised anomaly detection scores that detect deviations from normal behavior.
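To make the notion of connected-component features concrete, the sketch below computes the finite death times of the 0-dimensional persistent homology classes of a point cloud under a Vietoris-Rips filtration, using the edge length as the filtration value. It is an illustrative sketch only and does not reproduce the pipeline of this paper or of any cited library: the function name and the toy trajectory are assumptions. It relies on the standard fact that every point is born at scale 0 and that components merge exactly along the edges selected by Kruskal's minimum spanning tree algorithm.

```python
import math


def h0_persistence(points):
    """Finite death times of H0 classes for a Vietoris-Rips filtration.

    Each point is born at scale 0. A component dies at the length of the
    edge that first merges it into another component (Kruskal / union-find).
    One component never dies and is not reported.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(length)
    return deaths


# Toy trajectory: three evenly spaced positions followed by a sudden jump.
trajectory = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
deaths = h0_persistence(trajectory)
print(deaths)  # [1.0, 1.0, 8.0]
```

The single long-lived component (death at 8.0) reflects the jump in the toy trajectory. Loops and cycles correspond to 1-dimensional classes, which require a full persistent homology library and are not covered by this sketch.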
TDA-based algorithms provide complementary global structural information compared to classical Extended Kalman Filter approaches based on Bayesian estimation principles, while also being designed to be computationally efficient for post-processing deployment on embedded systems. In network traffic behaviors where cyber attackers exploit training data to launch backdoor attacks, the problem of classical ML methods being unable to distinguish between clean and poisoned data can be addressed with a TDA-based pre-filtering approach. Topological features extracted from network traffic using the gtda library are provided as input to unsupervised learning algorithms such as DBSCAN, HDBSCAN, and OPTICS, enabling the isolation of poisoned data into special clusters called Red [39]. Experiments have shown that the 72-feature model, in particular, can separate poisoned data with a 48.60% success rate, exhibiting superior performance compared to the 126-feature model. The results demonstrate that TDA significantly improves the security of ML-based intrusion detection systems by capturing micro-scale structural corruptions that are undetectable in raw data.
In the field of dynamic engineering systems and time series analysis, TDA features, especially persistent homology, produce more stable and noise-resistant results in real-time estimation of physical states compared to traditional methods such as the short-time Fourier transform (STFT). Razmarashooli et al. [40] successfully predicted the physical state of a system with a high correlation of
In object trajectory and transportation management scenarios, TDA-based approaches such as the tramoTDA framework [8], which focus not only on the coordinates but also on the shape of the data, substantially increase the discriminative power of ML models. Esteve and Falcó [43] also integrate TDA with a CNN, increasing accuracy by 38.49% and precision by 39.24% in hurricane intensity and marine navigation classification and surpassing traditional metrics. Indah et al., on the other hand, detected safe and aggressive driver behavior from highway trajectories with 96.8% overall accuracy using Persistence Images (PI) and the XGBoost classifier [44]. These results support the idea that topological tools such as PI and the Wasserstein barycenter capture rare but critical risky driving patterns while suppressing noise.
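Persistence images turn a persistence diagram of arbitrary size into a fixed-length vector that a standard classifier can consume. The following sketch shows the idea under simplifying assumptions: each (birth, death) pair is mapped to (birth, persistence), weighted linearly by its persistence, and smoothed with an unnormalised Gaussian evaluated at pixel centres rather than integrated over each pixel. The grid size, `sigma`, `max_val`, and the function name are hypothetical choices, not values taken from the cited works.

```python
import math


def persistence_image(diagram, resolution=4, sigma=0.5, max_val=4.0):
    """Simplified persistence image on a resolution x resolution grid.

    diagram: list of (birth, death) pairs.
    Rows index persistence (death - birth), columns index birth.
    Each point contributes persistence * Gaussian(pixel centre).
    """
    step = max_val / resolution
    centres = [(k + 0.5) * step for k in range(resolution)]
    image = [[0.0] * resolution for _ in range(resolution)]
    for birth, death in diagram:
        pers = death - birth
        for r, p_centre in enumerate(centres):
            for c, b_centre in enumerate(centres):
                sq_dist = (b_centre - birth) ** 2 + (p_centre - pers) ** 2
                image[r][c] += pers * math.exp(-sq_dist / (2.0 * sigma ** 2))
    return image


img = persistence_image([(0.5, 2.0)])
feature_vector = [value for row in img for value in row]
print(len(feature_vector))  # 16, independent of the number of diagram points
```

Because the weight is proportional to persistence, points near the diagonal (short-lived, typically noise) contribute little, which is the property that makes this vectorization robust to small perturbations.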
In financial systems and high-dimensional network analysis, algorithms like Mapper and TADA (Topological Analysis for Detecting Anomalies) offer robust mathematical foundations for detecting global changes in complex dependency structures. Barberi and De Cave successfully isolated suspicious activities such as money muling from five statistically significant customer groups using AutoMATo clustering and Mapper on a massive dataset containing 1.4 million bank customers [45]. Chazal et al. present a scalable TADA framework that captures cross-channel correlation changes in multivariate time series using ATOL (Measure Vectorization for Automatic Topologically-Oriented Learning) vectorization and demonstrate superior performance in capturing complex correlation changes on the TimeEval benchmark set [7]. These studies indicate that TDA provides effective segmentation and anomaly scoring even in unlabeled data environments where classical statistical assumptions are insufficient.
Review articles in the literature emphasize that TDA adds a new depth to ML processes, an area known as topological machine learning. Tools like persistent homology, Betti numbers, and Mapper are reported to improve model accuracy and interpretability by up to 40% thanks to their noise-resistant and multi-scale analysis capabilities in high-dimensional data [46]. Results obtained by Du et al. using models like StrongSORT++ in multi-object tracking further reinforce the importance of trajectory consistency and global correlation in ML success, supporting the structural perspective offered by TDA [23]. In conclusion, TDA provides a more reliable and scalable feature set for ML models by extracting the geometric signature of the data.
3 Mathematical Framework: From State Estimation to Topological Analysis
This section develops the complete mathematical foundation of the proposed TopoEKF framework in a self-contained manner. We begin by defining the discrete-time state-space model and motion assumptions in Sections 3.1 and 3.2, then introduce the three-tier adaptive noise covariance mechanism in Sections 3.3 and 3.4, followed by the data association strategy in Section 3.5. Finally, Section 3.6 establishes the formal mapping from EKF state estimates to topological descriptors and their closed-loop feedback into the filter, completing the TopoEKF formulation.
To facilitate a clearer understanding of the mathematical formulations presented in this section, a summary of the notation, including state variables and adaptive scaling factors, is provided in Table 2.

3.1 Discrete-Time State-Space Model
The multi-object tracking problem is formalized as a decoupled collection of independent state-space systems, one for each tracked object. Assuming a 2D image coordinate system for measurements, the state vector
where
where
The measurement vector
Specifically,
The noise covariances
3.2 State Transition and Observation Models
To satisfy the computational constraints of real-time embedded deployment, a constant velocity motion model is adopted. This linear formulation eliminates the Jacobian recalculation required by nonlinear models, thereby reducing per-frame processing overhead. The state transition matrix
where
Thus, the measurement vector
With the motion model established, the following section details how
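As a minimal sketch of the constant-velocity model described above, the following snippet assumes a four-dimensional state ordered as [x, y, vx, vy] and a position-only measurement. The state ordering, the helper names, and the frame interval `dt` are illustrative assumptions and are not taken from the original implementation.

```python
from typing import List


def cv_transition_matrix(dt: float) -> List[List[float]]:
    """State transition matrix F for an assumed state [x, y, vx, vy]."""
    return [
        [1.0, 0.0, dt, 0.0],
        [0.0, 1.0, 0.0, dt],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


# Observation matrix H: the detector is assumed to measure position only.
H = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]


def matvec(matrix: List[List[float]], vector: List[float]) -> List[float]:
    """Plain matrix-vector product, kept dependency-free for illustration."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


state = [100.0, 50.0, 2.0, -1.0]                 # position in pixels, velocity in pixels/frame
predicted = matvec(cv_transition_matrix(dt=1.0), state)
measurement = matvec(H, predicted)               # predicted image-plane position
```

Because F depends only on `dt`, it can be built once per frame interval and reused for every track, which is the computational saving the constant-velocity assumption is meant to provide.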
3.3 The Proposed Extended Kalman Filter Algorithm
Our proposed enhanced EKF formulation handles the slight nonlinearity of the image projection by operating under the linear CV assumption, yet its novelty lies in the ability to dynamically adapt the process noise
The recursive process is structured into two fundamental phases:
The Prediction Phase (Time Update). This prediction phase projects the state and covariance estimates from the previous time step
The prediction phase also incorporates scenario-adaptive process noise based on detected motion patterns:
where
This formulation prevents under-estimation of uncertainty for fast-moving objects, reducing prediction drift during sudden maneuvers.
The Update Phase (Measurement Update). Upon receiving a new measurement
The adaptive computation of
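The two phases can be illustrated with a simplified per-axis filter. The sketch below assumes that the x and y axes are decoupled (block-diagonal noise covariances), so that each axis carries a two-dimensional state (position, velocity) and a scalar measurement; under this assumption the gain has a closed form and no matrix inversion is needed. The function names and the scalar noise levels `q` and `r` are hypothetical placeholders for the adaptive quantities defined in this section.

```python
from typing import Tuple

Cov = Tuple[float, float, float, float]  # row-major 2x2 covariance: (p00, p01, p10, p11)


def predict_axis(pos: float, vel: float, cov: Cov, dt: float, q: float):
    """Time update for one axis: x = F x, P = F P F^T + Q with F = [[1, dt], [0, 1]].

    q is a scalar process-noise level added to both diagonal terms (assumption).
    """
    p00, p01, p10, p11 = cov
    new_cov = (
        p00 + dt * (p01 + p10) + dt * dt * p11 + q,
        p01 + dt * p11,
        p10 + dt * p11,
        p11 + q,
    )
    return pos + vel * dt, vel, new_cov


def update_axis(pos: float, vel: float, cov: Cov, z: float, r: float):
    """Measurement update for one axis with H = [1, 0] and measurement noise r."""
    p00, p01, p10, p11 = cov
    s = p00 + r                      # innovation covariance
    k0, k1 = p00 / s, p10 / s        # Kalman gain
    y = z - pos                      # innovation
    new_cov = ((1.0 - k0) * p00, (1.0 - k0) * p01, p10 - k1 * p00, p11 - k1 * p01)
    return pos + k0 * y, vel + k1 * y, new_cov


pos, vel, cov = predict_axis(0.0, 1.0, (1.0, 0.0, 0.0, 1.0), dt=1.0, q=0.0)
pos, vel, cov = update_axis(pos, vel, cov, z=2.0, r=2.0)
```

Increasing `q` inflates the predicted covariance and therefore the gain, while increasing `r` has the opposite effect; the adaptive tiers act on exactly these two quantities.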
3.4 Adaptive EKF: Adaptive Noise Covariance Modeling
The dynamic adaptation of the noise covariance matrices,
The Adaptive Process Noise (
This increase forces the predicted state covariance
Adaptive Measurement Noise (
A high confidence score yields a smaller
Our initial formulation proposed in [49] employed a static confidence-based scaling
To address this issue, we introduce a smoothed adaptive scaling with hysteresis via an exponential moving average (EMA) approach:
This formulation prevents abrupt
Smoothed Confidence-Based
where
The measurement noise adaptation now employs an exponential moving average to prevent rapid changes so that smoothing parameter (as hyperparameter selection)
•
•
•
As shown in Fig. 4a in Section 6.2, raw confidence scores
Velocity-Dependent Process Noise Augmentation (Tier 2 Extension). Furthermore, we augment the occlusion-based
This modification accounts for the fact that fast-moving objects (high
These refinements constitute the key algorithmic improvements that enable TopoEKF to achieve superior robustness across diverse traffic conditions, as evidenced by the 34% reduction in ID switches compared to the Standard EKF (Table 4, Section 6.1).
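The smoothing idea behind the confidence-based adaptation can be sketched as follows. The recursion is a standard exponential moving average in which `alpha` weights the newest confidence score; the value of `alpha`, the floor, and the inverse mapping from smoothed confidence to a measurement-noise scale are illustrative assumptions, and the hysteresis component is omitted.

```python
from typing import List


def ema_smooth(confidences: List[float], alpha: float) -> List[float]:
    """Return s_k = alpha * c_k + (1 - alpha) * s_{k-1}, seeded with the first score."""
    smoothed = []
    prev = None
    for c in confidences:
        prev = c if prev is None else alpha * c + (1.0 - alpha) * prev
        smoothed.append(prev)
    return smoothed


def measurement_noise_scale(smoothed_conf: float, floor: float = 0.05) -> float:
    """Hypothetical mapping: lower smoothed confidence gives a larger scale on R."""
    return 1.0 / max(smoothed_conf, floor)


raw = [0.9, 0.9, 0.2, 0.9, 0.9]          # a single-frame confidence dip
smoothed = ema_smooth(raw, alpha=0.3)
scales = [measurement_noise_scale(s) for s in smoothed]
```

In this example the single-frame dip to 0.2 is attenuated by the filter, so the measurement noise scale rises moderately instead of jumping by a factor of more than four.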
Having fully specified the adaptive noise model, we next address the data association problem that maps detections to tracks before the EKF update is applied.
In multi-object scenarios, correctly associating predicted states
A squared Mahalanobis distance exceeding a
Adaptive Mahalanobis Gating with Tier 2 Covariances. The gating threshold is now dynamically adjusted based on the adapted innovation covariance
where
Threshold Selection. The gating threshold is dynamically selected based on the uncertainty level:
• Standard:
• Under high uncertainty (
• Under low uncertainty (
This adaptive gating reduces false associations by 31% in cluttered scenes while maintaining 97% recall for valid matches.
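A minimal sketch of the gating test is given below for a two-dimensional position measurement. The default threshold is the 99% quantile of the chi-square distribution with two degrees of freedom, which has the closed form -2 ln(1 - p); the adaptive thresholds used under high and low uncertainty are left as a caller-supplied parameter, and the function names are hypothetical.

```python
import math

# 99% chi-square quantile for 2 degrees of freedom (about 9.21).
CHI2_99_2DOF = -2.0 * math.log(1.0 - 0.99)


def squared_mahalanobis(z, z_pred, s):
    """d^2 = y^T S^{-1} y for a 2x2 innovation covariance s = [[a, b], [c, d]]."""
    y0, y1 = z[0] - z_pred[0], z[1] - z_pred[1]
    (a, b), (c, d) = s
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("innovation covariance must be positive definite")
    # Uses the closed-form inverse of a 2x2 matrix: (1 / det) * [[d, -b], [-c, a]].
    return (d * y0 * y0 - (b + c) * y0 * y1 + a * y1 * y1) / det


def passes_gate(z, z_pred, s, threshold=CHI2_99_2DOF):
    """Accept the detection-track pair only if it lies inside the gate."""
    return squared_mahalanobis(z, z_pred, s) <= threshold


S = [[4.0, 0.0], [0.0, 4.0]]
near = passes_gate((3.0, 4.0), (0.0, 0.0), S)   # d^2 = 6.25, inside the gate
far = passes_gate((6.0, 4.0), (0.0, 0.0), S)    # d^2 = 13.0, outside the gate
```

Because the distance is normalized by the innovation covariance, a track with inflated uncertainty (for example during occlusion) automatically accepts detections over a wider pixel radius for the same threshold.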
With track-to-measurement associations resolved, the next section introduces the topological analysis module that closes the feedback loop of the TopoEKF framework.
3.6 Topology-Aware EKF and Closed-Loop Formulation
The TopoEKF framework extends the EKF formulation established in Sections 3.1–3.3 by incorporating topological information derived from tracked trajectories. Building directly on the state-space model and the three-tier adaptive covariance mechanism, the following subsections develop the formal mapping from EKF state estimates to topological descriptors and their closed-loop feedback into the filter.
3.6.1 Trajectory-to-Topology Mapping
An
Let
From the inclusion map
Persistent homology tracks the birth and death of topological features, such as connected components and loops, in the sequence of simplicial complexes. During filtration, different topological features arise at certain filtration stages (referred to as their ‘birth’) and disappear as the threshold increases (referred to as their ‘death’). If
The persistence (lifetime) of
Persistence diagrams cannot be used directly as input to machine learning and deep learning models. One of the most popular remedies is to transform persistence diagrams into persistence images [50]. To construct a persistence image, we first transform the birth-death coordinates
The persistence surface
where the weight
where
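The construction can be sketched in a few lines. The snippet below maps each birth-death pair to birth-persistence coordinates, weights it linearly by persistence, spreads it with an isotropic Gaussian, and samples the resulting surface at pixel centres, which approximates the per-pixel integral of the formal definition. The linear weight, the bandwidth `sigma`, the unit extent, and the 20 × 20 grid are assumptions chosen for illustration.

```python
import math


def persistence_image(pairs, resolution=20, sigma=0.1, extent=1.0):
    """Rasterize (birth, death) pairs into a resolution x resolution image."""
    step = extent / resolution
    centres = [(i + 0.5) * step for i in range(resolution)]
    norm = 1.0 / (2.0 * math.pi * sigma * sigma)
    image = [[0.0] * resolution for _ in range(resolution)]
    for birth, death in pairs:
        pers = death - birth
        if pers <= 0.0:
            continue  # points on the diagonal carry no weight
        for row, py in enumerate(centres):        # persistence axis
            for col, px in enumerate(centres):    # birth axis
                sq = (px - birth) ** 2 + (py - pers) ** 2
                image[row][col] += pers * norm * math.exp(-sq / (2.0 * sigma * sigma))
    return image


diagram = [(0.1, 0.7), (0.2, 0.25)]            # one long-lived and one short-lived feature
img = persistence_image(diagram)
features = [v for row in img for v in row]      # flattened fixed-length vector
```

The long-lived feature dominates the image because of the persistence weighting, while the short-lived one, which is more likely to be noise, contributes only a faint response near the birth axis.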
3.6.2 Topology-to-Covariance Mapping—(Tier 3)
To integrate topological awareness into the filtering process, we introduce adaptive scaling factors
with hyperparameters
The modified EKF covariances are thus given by
3.6.3 Composite Mapping and Topology-Aware Update
The complete topology-aware feedback chain can be written as the composite mapping
This composite mapping establishes a principled, bidirectional coupling between the geometric topology of observed motion trajectories and the statistical uncertainty model of the EKF. At each update cycle, the topological descriptor
In summary, the topology-to-filter mapping proceeds as follows:
• EKF state estimates
• A Vietoris–Rips filtration is constructed over
• Persistent homology extracts
• Scaling factors
•
4 TopoEKF: The Proposed End-to-End Pipeline Design and Implementation
TopoEKF operates as a fully integrated sequential pipeline for UAV-based anomalous vehicle detection as illustrated by the system overview in Fig. 1. Raw frames are first processed by a YOLOv12n detector, which produces per-vehicle bounding boxes and confidence scores that feed into a data association module. Each detection then passes through a three-tier adaptive Extended Kalman Filter, where measurement and process noise covariances are dynamically adjusted according to detection confidence, occlusion severity, and topological feedback, respectively. The EKF-filtered positions are accumulated in a per-track trajectory buffer of the last 50 frames, which is periodically analysed every five frames by a Topological Data Analysis module that constructs a Vietoris–Rips filtration and extracts persistent homology features which are most notably

Figure 1: Overview of the proposed TopoEKF pipeline for UAV-based anomalous vehicle detection.
As demonstrated in Fig. 1, we have a hierarchical methodology. To clarify the methodological distinctions, we explicitly define three progressively enhanced variants within our framework. The Standard EKF refers to the classical formulation with fixed process and measurement noise covariances. The Adaptive EKF (corresponding to Tier 1 and Tier 2) introduces dynamic noise covariance adjustment based on detection confidence and occlusion history, without incorporating any topological information. Finally, the proposed TopoEKF extends the Adaptive EKF by integrating topological feedback (Tier 3) derived from persistent homology, enabling trajectory-level adaptation through global structural features. This hierarchical formulation allows us to isolate the contribution of each component: while Adaptive EKF improves robustness against measurement uncertainty and occlusion, it lacks the ability to capture higher-order trajectory complexity. In contrast, TopoEKF leverages topological signatures to further reduce identity switches, improve occlusion recovery, and enable anomaly-aware tracking, which cannot be achieved without TDA integration. These distinctions are consistently reflected in our ablation study and experimental results.
The perception front-end is anchored by a meticulously optimized, high-speed YOLOv12 model, which currently represents the state-of-the-art in real-time object detection speed and accuracy. The model processes the input frame, generating a set of high-confidence bounding box coordinates and scores that serve as the fundamental observation vector
4.2 Tracking Stage: Enhanced Adaptive EKF Workflow
Upon receiving the detections, the tracking module systematically orchestrates the life cycle of each object track. The process begins with Track Initialization, where new, high-confidence detections unassociated with existing tracks spawn a new EKF instance. This is followed by the Prediction phase, where all active EKF tracks project their states using the motion model. The subsequent Association step employs the Hungarian algorithm based on the Mahalanobis distance to find the optimal pairing between predicted states and current detections. Update and Management concludes the cycle: matched tracks update their states using the EKF equations, while unmatched predictions are placed into a tentative or lost state. Tracks failing to be consistently detected for a predefined number of frames are efficiently deleted to prevent the accumulation of ghost tracks.
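The life-cycle logic described above can be sketched as a small state machine. The status names, the confirmation rule, and the thresholds `confirm_hits` and `max_misses` are hypothetical; the text only specifies that unmatched tracks become tentative or lost and are deleted after a predefined number of missed frames.

```python
from dataclasses import dataclass


@dataclass
class Track:
    track_id: int
    hits: int = 1                 # the initializing detection counts as one hit
    misses: int = 0               # consecutive frames without a matched detection
    status: str = "tentative"     # tentative -> confirmed -> lost -> deleted


def step_track(track: Track, matched: bool, confirm_hits: int = 3, max_misses: int = 30) -> Track:
    """Advance one track's life cycle by a single frame."""
    if matched:
        track.hits += 1
        track.misses = 0
        if track.status == "lost" or track.hits >= confirm_hits:
            track.status = "confirmed"
    else:
        track.misses += 1
        if track.misses > max_misses:
            track.status = "deleted"          # prevents accumulation of ghost tracks
        elif track.status == "confirmed":
            track.status = "lost"             # keep predicting, wait for re-association
    return track


track = Track(track_id=1)
for matched in (True, True, False, True):
    step_track(track, matched, max_misses=2)
```

A lost track continues to be predicted by its filter, so a detection that reappears inside the gate restores it to the confirmed state without creating a new identity.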
4.3 Hybrid Integration Synergy
The effectiveness of our hybrid system lies in the synergistic combination of complementary strengths. YOLOv12 excels at the challenging non-linear perception tasks, robustly recognizing objects despite variations in appearance, scale, and partial occlusion through learned representations. Conversely, the EKF provides the essential temporal consistency, motion prediction during occlusions, and a mathematically sound, computationally lightweight mechanism for state estimation. This integrated approach achieves a superior level of robustness and efficiency compared to methods that rely exclusively on either deep learning for all components or classical methods with simplistic motion models.
4.4 End-to-End Anomaly Detection Stage Based on TopoEKF
The proposed TopoEKF framework incorporates a TDA module that operates on the EKF-filtered positions rather than the raw YOLO detections. For each tracked object, the last
The proposed TopoEKF pipeline consists of nine tightly coupled modules, each responsible for a distinct function within the multi-object tracking process. Algorithms A1–A9 describe the sequence from object detection to topology-aware state estimation, forming a closed, adaptive feedback loop between perception and estimation.
Algorithm A1 defines the overall tracking pipeline combining object detection, data association, Kalman-based state estimation, and topology-driven adaptation. At each frame, YOLOv12 detections are matched with predicted tracks using a Mahalanobis distance-based association strategy. Extended Kalman Filters are updated for matched objects, while unmatched detections trigger new track initialization. A topological feedback loop, computed periodically, modifies process and measurement noise terms to enhance robustness against occlusions and nonlinear motion.
The three-tier adaptation mechanism explained in Algorithm A2 extends the conventional EKF update by introducing confidence, occlusion, and topology-aware corrections. Measurement noise covariance (
Algorithm A2 includes refined adaptation logic regarding with
Tier 1 represents a measurement-level adaptation mechanism in Algorithm A2, where the EKF measurement noise covariance is adjusted according to the confidence of incoming detections. By scaling
Tier 2 corresponds to a motion-level adaptation in Algorithm A2 and accounts for temporary signal loss and occlusion. In this tier, the process noise covariance
Tier 3 introduces a trajectory-level adaptation implemented through Algorithm A3, where persistent homology is applied to EKF-filtered trajectories. The extracted topological features quantify the structural complexity of object motion and are used to modulate both
Algorithm A3 introduces topological reasoning into the tracking loop. It computes the Persistent Homology of recent trajectory points stored in a buffer to capture geometric and dynamical invariants, such as the number and lifespan of trajectory cycles. The resulting factors,
Algorithm A4 computes the Mahalanobis distance between predicted track positions and new detections to measure statistical compatibility. A gating mechanism based on the 99% confidence threshold (
Algorithm A5 handles track initialization: when an unmatched detection with sufficient confidence is encountered, it instantiates a new tracking object. It initializes the state vector with the detected position and zero velocity and assigns a high initial covariance to reflect uncertainty. Baseline process and measurement noise matrices (
Algorithm A6 tracks objects in the video stream temporally using YOLO-based detection and TopoEKF tracking mechanisms. Highly representative trajectory vectors are generated by extracting topological data analysis features from sufficiently long tracks. Unsupervised anomaly detection is performed using these features, distinguishing between normal and abnormal movements.
Algorithm A7 extracts topological features based on persistent homology by treating each object trajectory as a point cloud. Betti numbers, lifetime statistics, and optionally, persistence image representations are calculated from diagrams of
Algorithm A8 performs unsupervised anomaly detection using an Isolation Forest model by scaling the extracted TDA features. An anomaly label and anomaly score are generated for each trajectory, and the results are matched with track IDs. Additionally, summary statistics such as the number of anomalous and normal samples and decision thresholds are calculated system-wide.
Algorithm A9 reduces high-dimensional TDA features to two dimensions using PCA, enabling visual analysis of anomaly results. Normal and anomalous examples are shown with different colors and symbols in the feature space, while trajectories are simultaneously visualized in a spatial plane. This facilitates the interpretation of anomalies from both behavioral and geometric perspectives.
The system undergoes rigorous evaluation on a deliberately constructed hybrid dataset designed to simulate the diverse and challenging conditions encountered in real-world UAV operations. The dataset combines the general object diversity of COCO [51], the high density and small object challenges of VisDrone [52,53], the extended sequences under diverse weather and lighting of UAVDT [54], Road_Anomaly_Dataset [55], and Detection of Traffic Anomaly (DoTA) dataset [56]. The resulting test set, comprising
To construct a semantically consistent training corpus and mitigate potential bias, we adopt a label-space unification strategy that harmonizes COCO, VisDrone, and UAVDT through a shared ontology, retaining only the six categories common to all sources—car, truck, bus, pedestrian, bicycle, and motorcycle—while discarding dataset-specific labels. Class consistency is strictly enforced by normalizing all bounding box coordinates to a
Our hybrid evaluation dataset comprises 87 video sequences totaling 35,700 frames (Train: 70%, Test: 15%, Validation: 15%) with the following source distribution:
• COCO subset: 12 sequences, 4200 frames (general object diversity baseline)
• VisDrone: 42 sequences, 18,500 frames (high-density, small object challenges)
• UAVDT: 28 sequences, 11,200 frames (weather/lighting variations)
• DoTA Traffic: 3 sequences, 1200 frames (normal intersection traffic)
• Custom accident scenarios: 2 sequences, 600 frames (DoTA Dataset, Road_Anomaly_Dataset)
Fig. 2 presents four representative frames from a traffic intersection dataset used for trajectory tracking and anomaly detection. The samples include diverse traffic scenes, such as daytime, nighttime, and high-angle views, to ensure robust detection and tracking under varied conditions and support anomaly analysis, such as sudden stops and directional changes.

Figure 2: Sample frames from the intersection dataset used for traffic monitoring and anomaly detection. The top row shows daytime scenes, while the bottom row includes a nighttime view and a high-angle bird’s-eye perspective.
Table 3 summarizes the key statistics of the hybrid dataset.

Labeling Protocol: Each trajectory is independently reviewed by two annotators. Anomalies are identified based on (i) visual inspection of the spatial trajectory, (ii) velocity profile analysis using an acceleration threshold exceeding
Label-Space Unification: The datasets employed in this study (VisDrone2019, UAVDT, and COCO) adopt heterogeneous class taxonomies. To enable unified training and fair cross-dataset evaluation, we perform label-space unification by mapping all dataset-specific categories onto a common schema covering the primary object classes of interest: person, vehicle, bicycle, and background/other. Classes with semantic overlap across datasets, such as pedestrian in VisDrone and person in COCO, are merged, and classes outside the unified schema are discarded. Road_Anomaly_Dataset and the DoTA dataset are incorporated exclusively for anomaly detection evaluation and do not contribute to the detection training phase; their annotation schemas are therefore treated independently and are not subject to the label-space unification procedure. For these datasets, raw frames are resized to the inference resolution (640 × 640) and normalized using dataset-wide statistics. Anomaly category labels are mapped to a binary schema (anomaly/non-anomaly) consistent with the evaluation protocol described in Section 5.5.
Stratified Sampling: To mitigate the class imbalance inherent in UAV and traffic imagery (where vehicle instances dominate over person and bicycle across all datasets), we applied stratified sampling during training set construction. Specifically, the per-class sampling ratio was adjusted so that each class contributes proportionally to the training batches, preventing the detector from being biased toward the majority class.
Data Augmentation: Standard augmentation operations are applied during YOLOv12n training, including random horizontal flipping, mosaic composition (combining four images into one), HSV jitter (hue, saturation, and value perturbation), random scaling and translation, and copy-paste augmentation for small-object enhancement, a strategy particularly relevant for UAV and anomaly datasets, where targets are frequently small and densely packed. These operations are applied online during training and do not alter the original dataset splits.
To ensure a comprehensive and objective assessment, we employ standard metrics established by the MOT community. The primary metric, MOTA (Multiple Object Tracking Accuracy), accounts for false positives, false negatives, and identity switches, providing an overall measure of tracking quality. MOTP (Multiple Object Tracking Precision) measures the average bounding box alignment, while the ID Switch Rate assesses long-term identity maintenance. Finally, runtime throughput in frames per second (FPS) quantifies the real-time capability on the embedded platform.
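For reference, MOTA follows the standard CLEAR MOT definition, MOTA = 1 - (FN + FP + IDSW) / GT, where the counts are accumulated over all frames and GT is the total number of ground-truth objects. The sketch below implements this formula; the counts in the usage example are purely illustrative and are not results from this study.

```python
def mota(false_negatives: int, false_positives: int, id_switches: int, num_gt: int) -> float:
    """Multiple Object Tracking Accuracy accumulated over a whole sequence."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


# Illustrative counts only: 250 errors over 1000 ground-truth objects.
example_mota = mota(false_negatives=150, false_positives=80, id_switches=20, num_gt=1000)
```

Note that MOTA can become negative when the total number of errors exceeds the number of ground-truth objects, so it is not bounded below by zero.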
5.3 Hardware and Implementation
All experiments are conducted on an NVIDIA Jetson AGX Xavier, a high-end embedded platform representative of typical UAV processing units. The platform operates within a highly constrained 18 W power envelope. The system is implemented using Python 3.8, leveraging optimized libraries such as NumPy for efficient matrix operations and OpenCV for video processing, ensuring an efficient and scalable codebase.
The main comparative baseline is DeepSORT [14], a highly competitive and widely adopted tracking-by-detection algorithm. DeepSORT integrates a standard Kalman Filter with a deep learning re-identification (Re-ID) network for appearance feature extraction, representing the contemporary state-of-the-art and serving as a robust measure for comparative analysis. Moreover, we implement the TDA-based ByteTrack algorithm to make a comparison with a tracking model based on topological data analysis.
5.5.1 End-to-End TDA Pipeline Configuration
The anomaly detection module operates as a post-processing stage on the trajectories generated by the enhanced EKF tracker. The complete pipeline transforms raw state estimates into topological signatures suitable for unsupervised anomaly classification.
5.5.2 Data Flow and Transformation
Input Stage: For each tracked object, the Enhanced EKF maintains a trajectory buffer
Trajectory Filtering: Only trajectories with a minimum length of 30 frames are subjected to TDA analysis to ensure sufficient topological information. This filtering typically retains 65%–75% of all tracks, resulting in approximately 50–90 valid trajectories per video sequence for analysis.
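The buffering and filtering steps can be sketched with a bounded queue. The snippet assumes the 50-frame buffer and five-frame analysis period mentioned in the pipeline overview, together with the 30-frame minimum length stated above; the class and function names are hypothetical.

```python
from collections import deque
from typing import List, Optional, Tuple

BUFFER_LEN = 50    # frames retained per track
MIN_TDA_LEN = 30   # minimum trajectory length required for TDA
TDA_PERIOD = 5     # TDA is re-run every five frames


class TrajectoryBuffer:
    """Keeps the most recent EKF-filtered positions of one track."""

    def __init__(self) -> None:
        self.points = deque(maxlen=BUFFER_LEN)   # oldest points are dropped automatically

    def append(self, x: float, y: float) -> None:
        self.points.append((x, y))

    def point_cloud(self) -> Optional[List[Tuple[float, float]]]:
        """Return the trajectory as a point cloud, or None if it is too short."""
        if len(self.points) < MIN_TDA_LEN:
            return None
        return list(self.points)


def tda_due(frame_index: int) -> bool:
    """True on frames where the topological analysis is scheduled."""
    return frame_index % TDA_PERIOD == 0


buf = TrajectoryBuffer()
for k in range(29):
    buf.append(float(k), 0.0)
short = buf.point_cloud()      # 29 points: rejected
buf.append(29.0, 0.0)
ready = buf.point_cloud()      # 30 points: accepted
```

Bounding the buffer keeps the point cloud size, and hence the cost of the filtration, constant per track regardless of how long the track has existed.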
Point Cloud Construction: Each trajectory is represented as a point cloud
5.5.3 Persistent Homology Computation
Vietoris–Rips Filtration: For each trajectory point cloud
Persistence Diagram Extraction: The output consists of two persistence diagrams:
Persistence Image Transformation: To enable machine learning classification, persistence diagrams are converted to fixed-dimensional feature vectors using persistence images. We employ a
Statistical Feature Augmentation: In addition to persistence images, we extract 20 statistical features from each persistence diagram, including:
• Betti numbers:
• Persistence statistics: mean, standard deviation, maximum, sum of lifetimes
• Birth/death statistics: mean, median, quartiles (25%, 75%)
• Structural features: entropy, normalized life expectancy
The concatenated feature representation yields a final vector space
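A subset of the statistical descriptors listed above can be computed directly from the birth-death pairs of a diagram. The sketch below covers the lifetime statistics and the persistence entropy, defined as the Shannon entropy of the normalized lifetimes; the dictionary keys, the use of the population standard deviation, and the interval count as a simple summary are illustrative choices, and the remaining features are omitted.

```python
import math
import statistics


def lifetime_statistics(pairs):
    """Summary statistics of the finite lifetimes (death - birth) in one diagram."""
    lifetimes = [death - birth for birth, death in pairs if death > birth]
    if not lifetimes:
        return {"count": 0, "mean": 0.0, "std": 0.0, "max": 0.0, "sum": 0.0, "entropy": 0.0}
    total = sum(lifetimes)
    probs = [life / total for life in lifetimes]
    return {
        "count": len(lifetimes),                      # number of intervals in the diagram
        "mean": statistics.fmean(lifetimes),
        "std": statistics.pstdev(lifetimes),          # population standard deviation
        "max": max(lifetimes),
        "sum": total,
        "entropy": -sum(p * math.log(p) for p in probs),
    }


stats = lifetime_statistics([(0.0, 1.0), (0.0, 1.0), (0.0, 2.0)])
```

Persistence entropy is low when a single feature dominates the diagram and high when lifetimes are evenly spread, which makes it a compact indicator of how structured a trajectory is.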
5.5.5 Dimensionality and Computational Complexity
Feature Matrix Dimensions: For a typical video sequence, the TDA feature extraction produces a matrix
Computational Budget:
• Per-trajectory processing: 15–35
• Total batch processing for 70 trajectories:
• Amortized per-frame overhead: 0.8–1.2
This computational cost represents only 2.8% of the total frame processing time (35
The TDA frequency
5.5.6 Anomaly Detection Classifier
Isolation Forest Configuration: We employ Isolation Forest with the following parameters:
• Number of estimators: n_estimators = 100.
• Contamination rate: contamination = 0.15 (assuming 15% anomalous trajectories).
• Subsampling: max_samples = 256.
• Random state: seed = 42 for reproducibility.
Training and Inference: The classifier is trained in an unsupervised manner on the entire feature matrix
5.5.7 Evaluation Metrics for Anomaly Detection
Performance is assessed using Precision as
Ground truth anomaly labels are established through manual annotation. Anomalous trajectories are characterized by erratic motion patterns, such as sudden direction changes exceeding
5.5.8 Integration with TopoEKF
The anomaly detection module operates in two modes. Real-time mode performs online classification during tracking and triggers alerts for anomalous trajectories. Batch mode performs offline analysis after tracking and generates comprehensive anomaly reports.
In real-time mode, topological features
6.1 Quantitative Performance Comparison
Table 4 quantitatively compares the proposed TopoEKF framework with the standard EKF, DeepSORT, and TDA-based ByteTrack on the hybrid UAV dataset. TopoEKF achieves the highest tracking accuracy, improving MOTA from

In addition, TopoEKF demonstrates markedly better robustness under occlusion, increasing the recovery rate from
The ByteTrack+TDA baseline is also constructed by integrating a TDA module post-hoc onto the standard ByteTrack track management structure. In this configuration, ByteTrack utilizes a low confidence threshold (conf = 0.25) to bifurcate all detections into two distinct pools; it performs high-reliability matching in the first stage, while the second stage associates low-confidence detections with IoU-based Kalman filter predictions to recover lost tracks. For each track, the TDA module generates a point cloud from the EKF-filtered positions over the last 50 frames to compute persistent homology features, specifically
The experimental results demonstrate that while the TDA-based ByteTrack baseline improves occlusion recovery through its dual-threshold association and post-hoc topological anomaly detection, it remains fundamentally limited by its loosely-coupled architecture. In this configuration, topological features extracted via Vietoris–Rips filtration are utilized only for trajectory validation via an Isolation Forest, failing to influence the underlying motion model. Consequently, it achieves a lower MOTA (62.9%) and higher RMSE (8.7 pixels) compared to the proposed TopoEKF, as the latter implements a tightly-coupled feedback loop that directly modulates the process noise covariance (
Fig. 3 illustrates a YOLO-based detection overlay on aerial intersection frames, where each red bounding box marks a vehicle detected across consecutive frames, highlighting consistent localization despite occlusions and perspective shifts. This reliable detection output forms the basis for trajectory tracking via the Enhanced EKF, enabling the extraction of accurate spatio-temporal paths. These trajectories are subsequently transformed into point clouds for topological data analysis, where persistent homology signatures can reveal anomalies such as unusual stopping, looping, and erratic motion patterns.

Figure 3: A YOLO-based vehicle detection overlay on aerial intersection imagery.
6.2 Robustness Analysis under Occlusion and Noise
A rigorous analysis of performance under varying occlusion levels confirmed the efficacy of the adaptive noise modeling. While both methods perform comparably under Low Occlusion (0%–25%), the EKF demonstrated its initial advantage under Medium Occlusion (25%–50%), achieving a
Fig. 4 illustrates how the Enhanced EKF dynamically adapts its covariance scaling in three complementary tiers to maintain robust tracking under challenging conditions. Fig. 4a shows confidence-based adaptation: as detection confidence (

Figure 4: Three-tier adaptation mechanism in the Enhanced EKF: (a) confidence-based adjustment (
Fig. 5 provides a comprehensive comparison across 50 tracks, showing that TopoEKF consistently reduces RMSE and drift while achieving an average improvement of over 50% compared to the standard EKF. The improvement is particularly pronounced in trajectories with higher occlusion and increased topological complexity (higher

Figure 5: A 6-panel diagram including RMSE, drift, improvement, occlusion relationship and topological complexity analysis on 50 tracks.
6.3 Trajectory Level Analysis and Anomaly Detection Results Based on TopoEKF
The trajectory data are first processed through the Ripser library to compute the corresponding persistence diagrams. For instance, consider a trajectory that produces three topological cycles with lifetimes of
This unified feature vector is then used as input to an Isolation Forest model, which distinguishes between normal and anomalous trajectory patterns.
Fig. 6 presents the results of the proposed TopoEKF framework on accident and anomaly detection datasets, illustrating three distinct traffic scenarios along with their corresponding topological data analysis representations. In the normal traffic scenario (top row), the vehicle trajectory exhibits smooth and monotonically increasing motion, shown by a green path. The corresponding

Figure 6: TopoEKF results on the hybrid dataset, including the accident and anomaly detection datasets.
The near-miss or swerving scenario (middle row) is characterized by an abrupt change in direction, represented by an orange trajectory. In this case, the
Finally, the accident or collision scenario (bottom row) demonstrates a complex trajectory concentrated around the collision point, depicted in red. The
Overall, these results demonstrate that TopoEKF effectively leverages topological signatures to distinguish between normal and anomalous traffic behaviors across varying levels of complexity.
Fig. 7 illustrates the superiority of TopoEKF over the Standard EKF in terms of positioning accuracy, evaluated using four complementary metrics. The spatial trajectory plot (top left) shows the ground truth trajectory as a black dashed line, alongside the Standard EKF (red) and TopoEKF (green) estimates, with the occlusion period highlighted as a thick red segment. During occlusion, the Standard EKF exhibits a pronounced deviation from the ground truth, reaching a maximum error of approximately 50 pixels, whereas TopoEKF remains closely aligned with the true trajectory.

Figure 7: Positioning accuracy. TopoEKF’s superiority over the standard EKF is demonstrated by four different metrics.
The position error over time (top right) further highlights this behavior. Throughout the occlusion interval (gray shaded region, frames 40–55), the error of the Standard EKF increases to over 40 pixels, while TopoEKF maintains a stable error of approximately 5 pixels. This robustness can be attributed to the adaptive scaling of the
The position error distribution (bottom left) reveals that TopoEKF’s errors are highly concentrated within the 0–2 pixel range, as indicated by the green histogram, whereas the Standard EKF produces a broad-tailed distribution spanning approximately 5–30 pixels.
Finally, the cumulative position error over 100 frames (bottom right) shows that the Standard EKF accumulates an error of approximately 15 pixels, while TopoEKF stabilizes around 3 pixels. This corresponds to an overall improvement of roughly 80% in cumulative positioning accuracy.
Fig. 8 summarizes the end-to-end anomaly detection performance achieved via topological feature extraction from tracked trajectories. In Fig. 8a, PCA reveals clear separation between normal and anomalous trajectories in the feature space, indicating that persistence-based representations effectively capture motion irregularities. Fig. 8b shows that the Isolation Forest model assigns higher anomaly scores to true anomalies, which fall above the detection threshold, while normal trajectories remain below it. The confusion matrix in Fig. 8c reports 38 true negatives, 2 false positives, 9 true positives, and 1 false negative, demonstrating strong classification accuracy with minimal misclassification. This validates the robustness of the integrated pipeline, from YOLO detection and enhanced EKF tracking, through persistence image generation, to topologically informed anomaly classification, in identifying atypical vehicle behaviors under diverse traffic conditions.
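As a worked check, the confusion matrix counts reported for Fig. 8c can be converted into the usual classification metrics. The function name below is hypothetical, and the resulting values describe this single confusion matrix only; they are distinct from the cross-validated averages.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Precision, recall, F1, and accuracy for the anomaly (positive) class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2.0 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Counts from Fig. 8c: 9 true positives, 2 false positives, 1 false negative, 38 true negatives.
fig8c = classification_metrics(tp=9, fp=2, fn=1, tn=38)
# precision = 9/11 (about 0.818), recall = 9/10 = 0.900,
# F1 = 18/21 (about 0.857), accuracy = 47/50 = 0.940
```

With only ten anomalous trajectories in this matrix, a single additional error changes recall by ten percentage points, so these figures should be read together with the fold-level results.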

Figure 8: Anomaly detection results using a TDA-based feature extraction pipeline followed by Isolation Forest classification. (a) PCA projection of trajectory-level features, with normal trajectories shown in green and anomalies in red. (b) Isolation Forest anomaly score distribution across trajectories, with a horizontal dashed line indicating the decision threshold. (c) Confusion matrix comparing true vs. predicted labels.
Fig. 9 illustrates the topological feature extraction process applied to a circular motion trajectory. In Fig. 9a, the Vietoris–Rips complex is constructed at increasing filtration radii (

Figure 9: Persistent homology extraction from a circular trajectory. (a) Vietoris–Rips complex filtration visualized at increasing scales (
By leveraging this topological insight, the pipeline can assign higher anomaly suspicion to motion paths exhibiting loops or cyclic behavior. This information complements YOLO detections and Enhanced EKF tracking to yield a more robust trajectory analysis framework.
The anomaly detection module (Isolation Forest on TDA features) is evaluated with 5-fold cross-validation across the full dataset of 87 sequences (35,700 frames), partitioned as 70% training, 15% validation, and 15% test. Across all folds, the model achieves consistent performance, with Precision ranging from 0.831 to 0.855, Recall from 0.814 to 0.841, and F1-Score from 0.822 to 0.848, yielding mean values of Precision
Furthermore, our experimental results also present the sensitivity of the Isolation Forest component to random initialization, evaluated across five independent runs with different random seeds (random_state
6.4 Computational Efficiency and Power Consumption
Computational profiling on the Jetson AGX Xavier clearly indicates the source of the speed disparity. The shared overhead YOLOv12 Detection is
Fig. 10 delivers a comprehensive performance profile of the TDA-augmented tracking framework. Fig. 10a demonstrates that TDA computation time scales roughly in line with

Figure 10: Performance analysis of the TDA-augmented tracking pipeline. (a) TDA computation time as a function of trajectory length, with measured data (blue) and an
The per-frame computational cost of TopoEKF is dominated by three components. Firstly, YOLOv12n detection:
6.5 Impact of TDA-Based Anomaly Detection
We adopt a hybrid feature extraction strategy that integrates topological, statistical, and geometric descriptors. First, the topological feature set consists of 400-dimensional persistence images derived from
The resulting 420-dimensional feature vector is subsequently provided as input to an Isolation Forest model. The algorithm is configured with
This ablation study is specifically designed to provide a direct comparison between Standard EKF, Adaptive EKF (without TDA), and the proposed TopoEKF (with TDA), thereby explicitly addressing both the “EKF vs. Adaptive EKF vs. TopoEKF” and the “with vs. without TDA” settings.
To quantify the contribution of TDA-augmented tracking, a comparative ablation study is conducted across three different configurations, as illustrated in Fig. 11.

Figure 11: Comparison of standard EKF, adaptive EKF (Tier 1+2, without TDA), and full TopoEKF (with TDA), explicitly illustrating both model-level and TDA-level contributions.
The first configuration corresponds to the Standard EKF setup without any TDA integration. In this baseline setting, the process and measurement noise covariance matrices (
The second configuration represents the intermediate version of the proposed method, namely the Adaptive EKF, which is TopoEKF with Tier 1 and Tier 2 enabled but without TDA feedback. In this case, the filter incorporates confidence-aware and occlusion-based adaptive mechanisms while excluding topology-driven updates. This configuration improves tracking performance, yielding a MOTA of 76.1% and reducing the number of identity switches to 178. However, anomaly detection is not implemented in this setting, as no topological features are extracted.
The third and final configuration corresponds to the Full TopoEKF framework, in which Tier 1, Tier 2, and Tier 3 are all active and topological data analysis is fully integrated into the tracking loop. This complete topology-aware adaptation achieves the highest overall tracking performance, with a MOTA of 76.3% and a substantially reduced number of identity switches, equal to 142. Moreover, the incorporation of persistent homology-based features enables effective anomaly detection, resulting in an F1-score of 84.2%.
As indicated in Fig. 11:
MOTA vs. Occlusion Level (Low, Medium, High, and Complex): Three bar groups showing degradation under increasing occlusion:
• Standard EKF (red): 88.2%
• Tier 1+2 Only (Adaptive EKF) (orange): 89.5%
• Full TopoEKF (green): 90.1%
According to the results, under high occlusion, Tier 3 (TDA) provides an additional 3.2 pp improvement.
Contribution Analysis: Stacked bar chart showing MOTA improvement attribution:
• Tier 1+2 contributes 4.5 pp (Standard 72.8%
• Tier 3 (TDA) contributes 2.7 pp (77.3%
Fig. 12 presents a scatter plot of 50 representative trajectories that illustrates the relationship between occlusion duration and trajectory estimation accuracy. In this visualization, the horizontal axis denotes the number of occlusion frames encountered by each track, while the vertical axis represents the percentage improvement in root mean square error (RMSE). The color coding corresponds to the

Figure 12: Trajectory quality metrics (adaptive EKF-No TDA case).
Without topological data analysis, trajectories exhibiting higher structural complexity, specifically those with
Important findings from the experiments:
The experimental results show a clear improvement in trajectory stability as successive adaptation layers are enabled within the proposed framework, as reported in Table 5. Specifically, the RMSE is reduced from 12.4 pixels under the standard EKF configuration to 9.1 pixels when Tier 1 and Tier 2 adaptations are applied, and further decreases to 7.8 pixels with the full TopoEKF formulation. A similar trend is observed for trajectory drift, which decreases from 0.42 pixels per frame to 0.23 pixels per frame with confidence- and occlusion-aware adaptation, and is further reduced to 0.18 pixels per frame when topology-aware feedback is incorporated.
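The absolute values above can also be read as relative reductions with respect to the standard EKF baseline. The short sketch below performs that arithmetic on the reported numbers; the helper name `relative_reduction` and the dictionary keys are illustrative.

```python
def relative_reduction(before, after):
    """Percentage by which `after` is lower than `before`."""
    return 100.0 * (before - after) / before


# Values as reported in the text (RMSE in pixels, drift in pixels per frame).
rmse = {"standard": 12.4, "tier12": 9.1, "full": 7.8}
drift = {"standard": 0.42, "tier12": 0.23, "full": 0.18}

summary = {
    "rmse_tier12": relative_reduction(rmse["standard"], rmse["tier12"]),
    "rmse_full": relative_reduction(rmse["standard"], rmse["full"]),
    "drift_tier12": relative_reduction(drift["standard"], drift["tier12"]),
    "drift_full": relative_reduction(drift["standard"], drift["full"]),
}
for name, value in summary.items():
    print(f"{name}: {value:.1f}% lower than the standard EKF")
```

Under these numbers, the full formulation lowers RMSE by roughly 37% and drift by roughly 57% relative to the baseline.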

In terms of anomaly detection performance, the integration of topological features leads to a substantial improvement over rule-based heuristics. Without TDA, anomaly detection achieves a precision of 62%, a recall of 71%, and an F1-score of 66%. When topological descriptors derived from persistent homology are utilized and processed via an Isolation Forest classifier, precision increases to 87%, recall to 82%, and the overall F1-score to 84%, highlighting the discriminative power of topology-aware representations.
The impact of TDA varies across different motion scenarios. For normal traffic patterns characterized by trajectories with
An ablation study further reveals the critical role of topology-aware feedback. When the TDA-driven adaptation layer (Tier 3) is disabled while trajectory logging remains active, the number of identity switches increases by approximately 25%, rising from 142 to 178 (equivalently, enabling Tier 3 removes about 20% of the switches). Additionally, the occlusion recovery rate decreases from 84.2% to 76.8%. These results confirm that topological adaptation provides robustness beyond what can be achieved through confidence- and occlusion-based mechanisms alone.
Finally, the computational cost associated with the proposed TDA integration remains minimal. The persistent homology computation, including Ripser and persistence image generation, requires approximately 0.8 ms per trajectory every five frames, corresponding to an amortized per-frame overhead of 0.12 ms. This represents only 0.4% of the total 35 ms end-to-end pipeline runtime, thereby validating the practical feasibility of real-time topology-aware tracking on resource-constrained platforms.
7 Discussion and Practical Implications
7.1 Selection of the Object Detection Algorithm
Although the primary focus of this work is on the object tracking component rather than object detection, the choice of the detection backbone is critically important, as detection quality directly affects identity preservation, trajectory continuity, and overall tracking stability in multi-object tracking frameworks.
The selection of YOLOv12 as the object detection backbone in this study is grounded in substantive architectural advances that distinguish it from prior YOLO-family models. YOLOv12 represents a fundamental architectural departure by introducing an attention-centric framework that, for the first time, matches the inference speed of CNN-based detectors while fully exploiting the representational superiority of attention mechanisms [47]. This is achieved through three key innovations: the Area Attention (A2) module, which preserves a large receptive field while reducing computational complexity; Residual Efficient Layer Aggregation Networks (R-ELAN), which resolve the optimization instability inherent in large attention-based architectures; and FlashAttention, which alleviates memory bottlenecks during inference. These advances translate into measurable accuracy gains: YOLOv12-n achieves 40.5% mAP at 1.62 ms latency on a T4 GPU, surpassing YOLOv10-n and YOLOv11-n by 2.0% and 1.1% mAP, respectively, at comparable speeds. At the small-model scale, YOLOv12-s outperforms YOLOv8-s, YOLOv9-s, YOLOv10-s, and YOLOv11-s, while also exceeding end-to-end detectors such as Real-time Detection Transformer (RT-DETR) in both accuracy and computational efficiency.
Of particular relevance to embedded deployment contexts, YOLOv12’s A2C2F module fuses multi-head multi-layer perceptron (MLP) blocks with localized area-attention to strengthen spatial feature learning under lightweight computation constraints, while the C3K2 module further reduces convolutional complexity without sacrificing detection capability [57]. The multi-scale feature fusion strategy, achieved by stacking A2C2F blocks with Concat and Upsample operations, preserves the high-resolution representations essential for small-scale object detection, a persistent challenge in UAV and drone-mounted vision systems [57]. Empirical validation confirms real-world viability: a YOLOv12 configuration augmented with R-ELAN and FlashAttention achieved 84.6% mAP@50 at 14 ms inference speed in a real-time pipeline [58]. The model further demonstrates robustness across diverse operational conditions, including variable weather, lighting, and geographic scenarios, supporting its suitability for large-scale deployment in intelligent transportation and embedded monitoring systems.
7.2 The Intrinsic Advantages of Mathematical State Estimation
The experimental validation substantiates that a mathematically rigorous approach yields compelling advantages in the specific domain of UAV MOT. The EKF is an interpretable white-box model where every parameter, from the state vector
7.3 Justification of Avoiding Jacobian Computations and Constant Velocity
Jacobian computation. In the standard EKF formulation, the Jacobian
where
Constant velocity model. At UAV operating frame rates of
EKF vs. standard KF. Although the state transition is linear, a standard Kalman filter would be insufficient for two reasons. First, our three-tier adaptive noise model introduces nonlinear dependencies in the effective covariance matrices:
7.4 Refining the Fidelity of the Trajectories
The effectiveness of TDA fundamentally depends on the quality of the trajectory data it receives. In a standard EKF, fixed error covariances are maintained even under noise, signal loss, or unreliable measurements. As a result, the generated position estimates gradually drift, producing topological noise that misleads the TDA module into perceiving false anomalies. In contrast, the proposed TopoEKF employs a three-tier adaptive mechanism that dynamically adjusts its error covariances based on measurement confidence and signal conditions. This enables TopoEKF to yield smoother, more stable, and more physically accurate trajectory data even under challenging conditions, allowing the TDA to detect true anomalies with significantly higher precision. In other words, we do not change the nature of TDA itself; we improve the quality of the data it consumes.
While the UKF offers theoretical accuracy advantages for strongly nonlinear systems, our near-linear constant-velocity model renders the EKF linearization error negligible at the employed frame rates. Furthermore, the
7.5 Limitations and Future Research Directions
Despite its strong performance, the Constant Velocity model imposes a primary limitation: a
One possible direction for future work is to implement an Interacting Multiple Model EKF (IMM-EKF). This framework maintains a bank of motion models, such as Constant Velocity (CV), Constant Acceleration, and Coordinated Turn, and dynamically weights their estimates based on which model best explains the current motion, thereby addressing the high-dynamic maneuver issue while retaining the core EKF efficiency. Another promising direction is Multi-Sensor Fusion, where the EKF framework can naturally integrate observations from multiple non-cooperative UAVs or other sensors, enabling robust 3D trajectory estimation and highly resilient occlusion handling.
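To make the model-weighting idea concrete, the following is a minimal sketch of the mode-probability update used in IMM-style filters, assuming scalar innovations and Gaussian likelihoods. The function name, the transition matrix, and all numeric values are hypothetical and are not taken from the TopoEKF implementation; a full IMM-EKF would additionally mix the per-model states and covariances.

```python
import math


def imm_model_probabilities(prior, transition, innovations, innovation_vars):
    """One IMM mode-probability update.

    prior[i]           -- probability of model i after the previous frame
    transition[i][j]   -- Markov probability of switching from model i to j
    innovations[j]     -- measurement residual of model j (scalar, e.g. pixels)
    innovation_vars[j] -- innovation variance of model j
    """
    m = len(prior)
    # Predicted mode probabilities after Markov mixing.
    predicted = [sum(transition[i][j] * prior[i] for i in range(m))
                 for j in range(m)]
    # Gaussian likelihood of each model's residual.
    likelihoods = [
        math.exp(-0.5 * nu * nu / var) / math.sqrt(2.0 * math.pi * var)
        for nu, var in zip(innovations, innovation_vars)
    ]
    unnormalized = [lik * c for lik, c in zip(likelihoods, predicted)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical bank: constant velocity, constant acceleration, coordinated turn.
models = ["CV", "CA", "CT"]
prior = [0.8, 0.1, 0.1]
transition = [[0.90, 0.05, 0.05],
              [0.05, 0.90, 0.05],
              [0.05, 0.05, 0.90]]
# During a hard braking maneuver the CV model would leave the largest residual.
posterior = imm_model_probabilities(prior, transition,
                                    innovations=[6.0, 1.0, 4.0],
                                    innovation_vars=[4.0, 4.0, 4.0])
best = models[posterior.index(max(posterior))]
print(best, [round(p, 3) for p in posterior])
```

In this toy setting the weight shifts from the constant-velocity model to the constant-acceleration model within a single frame, which is the behavior that motivates the IMM extension.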
A current limitation of the proposed framework concerns extreme occlusion scenarios in which a target remains undetected for an extended duration. As specified in Algorithm A1, a trajectory is deleted when its miss_count exceeds the threshold of 10 consecutive frames, which may lead to track loss and subsequent identity switches for objects occluded beyond this horizon. While the three-tier adaptive EKF mitigates short-to-medium occlusion by inflating the process noise covariance
The current EKF state-space formulation operates in the 2D image plane, as the monocular camera used in this work does not provide reliable depth measurements. Extending the framework to a 3D state-space
This study successfully developed and rigorously evaluated an enhanced Extended Kalman Filter framework for multi-object tracking from Unmanned Aerial Vehicles. By judiciously coupling a state-of-the-art detector, YOLOv12, with a mathematically optimal and efficient state estimator featuring adaptive noise covariance, we have created a hybrid system that outperforms the state-of-the-art DeepSORT in crucial operational metrics.
In this study, the EKF as a multi-object tracking method is enhanced through a TopoEKF framework that incorporates topological awareness and adaptive error modeling. This framework utilizes a three-layer track update mechanism, allowing the filter to adapt based on measurement confidence and topological complexities. A significant advancement is the direct incorporation of topological data analysis into the filtering process, transforming TDA from a post-analysis tool to an active feedback mechanism in the EKF adjustment. As a result, the system achieves both measurement-based and shape-based error corrections. Experimental results show that TopoEKF yields more stable and meaningful trajectories, improving anomaly detection accuracy and reducing false positives. This research establishes a new relationship between perception and state estimation with potential for future integration of topological features and deep learning in three-dimensional spatial monitoring.
We demonstrated a 20% improvement in tracking robustness under high occlusion and achieved a real-time frame rate of 28.5 FPS on a resource-constrained embedded platform. The work serves as compelling proof that for practical, safety-critical embedded systems, the most effective solution is not an absolute choice between classical mathematics and modern deep learning, but a principled integration of the two. Deep learning provides robust perception, while mathematically grounded filters offer the necessary temporal coherence, efficiency, and system transparency required for reliable autonomous operations.
TopoEKF is positioned as a principled bridge between classical state estimation and modern data-driven analysis by embedding topological structure directly into the filtering loop, enabling trajectory-level complexity modeling and adaptive uncertainty handling beyond conventional and adaptive EKF frameworks. Rather than offering only incremental improvements, it reframes multi-object tracking as a geometry-aware inference problem, enhancing robustness, interpretability, and anomaly awareness in safety-critical UAV scenarios. Extending this framework toward end-to-end learning of topology-aware adaptation and integration with next-generation detection architectures remains a promising avenue.
Acknowledgement: This paper was partially presented at the International Conference on Mathematics and Applied Data Science (ICMADS’25), August 29–31, 2025, Konya, Türkiye. NotebookLM was used to analyze studies in the literature review. The authors have carefully reviewed and revised the output and accept full responsibility for all content.
Funding Statement: The authors received no specific funding for this study.
Author Contributions: Rabia Kıratlı: Conceptualization, Visualization, Methodology, Investigation, Writing—review & editing, Validation, Software, Formal analysis. Hatice Ünlü Eroğlu: Conceptualization, Writing—Review & editing, Validation, Supervision. Alperen Eroğlu: Conceptualization, Visualization, Writing—original draft, Writing—review & editing, Validation, Supervision. All authors reviewed and approved the final version of the manuscript.
Availability of Data and Materials: The source code of the proposed TopoEKF tool is currently hosted in the following repository: https://github.com/Rk1coder/TopoEKF.
Ethics Approval: Not applicable.
Conflicts of Interest: The authors declare no conflicts of interest.
Appendix A









References
1. Guo D, Yang Q, Zhang YD, Zhang G, Zhu M, Yuan J. Adaptive object tracking discriminate model for multi-camera panorama surveillance in airport Apron. Comput Model Eng Sci. 2021;129(1):191–205. doi:10.32604/cmes.2021.016347.
2. Liu S, Li X, Lu H, He Y. Multi-object tracking meets moving UAV. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway, NJ, USA: IEEE; 2022. p. 8876–85.
3. Wang H, Liu J, Dong H, Shao Z. A survey of the multi-sensor fusion object detection task in autonomous driving. Sensors. 2025;25(9):2794. doi:10.3390/s25092794.
4. Tian F, Guo X, Fu W. Target tracking algorithm based on adaptive strong tracking extended Kalman filter. Electronics. 2024;13(3):652. doi:10.3390/electronics13030652.
5. Kıratlı R, Eroğlu A. Real-time multi-object detection and tracking in UAV systems: improved YOLOv11-EFAC and optimized tracking algorithms. J Real Time Image Process. 2025;22(5):178. doi:10.1007/s11554-025-01758-z.
6. Jing J, Ding L, Yang X, Feng X, Guan J, Han H, et al. Topology-informed deep learning for pavement crack detection: preserving consistent crack structure and connectivity. Autom Constr. 2025;174:106120.
7. Chazal F, Levrard C, Royer M. Topological analysis for detecting anomalies in dependent sequences: application to time series. J Mach Learn Res. 2024;25(365):1–49.
8. Esteve M, Falcó A. tramoTDA: a trajectory monitoring system using topological data analysis. SoftwareX. 2024;28:101953.
9. Elhamdadi H, Canavan S, Rosen P. AffectiveTDA: using topological data analysis to improve analysis and explainability in affective computing. IEEE Trans Vis Comput Graph. 2022;28(1):769–79. doi:10.1109/TVCG.2021.3114784.
10. Eroglu A, Unlu Eroglu H. Topological data analysis for intelligent systems and applications. In: Kocer SO, editor. Artificial Intelligence Applications in Intelligent Systems. Konya, Türkiye: ISRES Publishing; 2023. p. 27–60.
11. Chazal F, Michel B. An introduction to topological data analysis: fundamental and practical aspects for data scientists. Front Artif Intell. 2021;4:667963.
12. Bar-Shalom Y, Li XR, Kirubarajan T. Estimation with applications to tracking and navigation: theory, algorithms and software. Hoboken, NJ, USA: John Wiley & Sons; 2001.
13. Fortmann T, Bar-Shalom Y, Scheffe M. Sonar tracking of multiple targets using joint probabilistic data association. IEEE J Oceanic Eng. 1983;8(3):173–84. doi:10.1109/joe.1983.1145560.
14. Wojke N, Bewley A, Paulus D. Simple online and realtime tracking with a deep association metric. In: 2017 IEEE International Conference on Image Processing (ICIP). Piscataway, NJ, USA: IEEE; 2017. p. 3645–9.
15. Zhang Y, Sun P, Jiang Y, Yu D, Weng F, Yuan Z, et al. ByteTrack: multi-object tracking by associating every detection box. In: Avidan S, Brostow G, Cissé M, Farinella GM, Hassner T, editors. Computer Vision—ECCV 2022. Cham, Switzerland: Springer Nature; 2022. p. 1–21.
16. Li S, Yang Y, Zeng D, Wang X. Adaptive and background-aware vision transformer for real-time UAV tracking. In: 2023 IEEE/CVF International Conference on Computer Vision (ICCV). Piscataway, NJ, USA: IEEE; 2023. p. 13943–54.
17. Wu Y, Wang X, Yang X, Liu M, Zeng D, Ye H, et al. Learning occlusion-robust vision transformers for real-time UAV tracking. In: 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway, NJ, USA: IEEE; 2025. p. 17103–13.
18. Zhong P, Wang X, Zeng D, Zhou Q, He F, Li S. SMTrack: end-to-end trained spiking neural networks for multi-object tracking in RGB videos. IEEE Internet Things J. 2026;13(9):18797–806. doi:10.1109/jiot.2026.3662378.
19. Wu Y, Li Y, Liu M, Wang X, Yang X, Ye H, et al. Learning an adaptive and view-invariant vision transformer for real-time UAV tracking. IEEE Trans Circuits Syst Video Technol. 2026;36(2):2403–18. doi:10.1109/tcsvt.2025.3599856.
20. Wei Q, Zeng B, Liu J, He L, Zeng G. LiteTrack: layer pruning with asynchronous feature extraction for lightweight and efficient visual tracking. In: 2024 IEEE International Conference on Robotics and Automation (ICRA). Piscataway, NJ, USA: IEEE; 2024. p. 4968–75.
21. Kalman RE. A new approach to linear filtering and prediction problems. J Basic Eng. 1960;82(1):35–45. doi:10.1115/1.3662552.
22. Aharon N, Orfaig R, Bobrovsky BZ. BoT-SORT: robust associations multi-pedestrian tracking. arXiv:2206.14651. 2022.
23. Du Y, Zhao Z, Song Y, Zhao Y, Su F, Gong T, et al. StrongSORT: make DeepSORT great again. IEEE Trans Multimed. 2023;25:8725–37.
24. Mueller M, Smith N, Ghanem B. A benchmark and simulator for UAV tracking. In: European Conference on Computer Vision. Cham, Switzerland: Springer; 2016. p. 445–61.
25. Cao Z, Fu C, Ye J, Li B, Li Y. SiamAPN++: Siamese attentional aggregation network for real-time UAV tracking. In: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Piscataway, NJ, USA: IEEE; 2021. p. 3086–92.
26. Henriques JF, Caseiro R, Martins P, Batista J. High-speed tracking with kernelized correlation filters. IEEE Trans Pattern Anal Mach Intell. 2014;37(3):583–96. doi:10.1109/tpami.2014.2345390.
27. Kayhani N, Heins A, Zhao W, Nahangi M, McCabe B, Schoellig AP. Improved tag-based indoor localization of UAVs using extended Kalman filter. In: 36th International Symposium on Automation and Robotics in Construction (ISARC 2019); 2019 May 21–24; Banff, AB, Canada. 2019. p. 21–4.
28. Kim T, Park TH. Extended Kalman filter (EKF) design for vehicle position tracking using reliability function of radar and lidar. Sensors. 2020;20(15):4126. doi:10.3390/s20154126.
29. Eckenhoff K, Geneva P, Merrill N, Huang G. Schmidt-EKF-based visual-inertial moving object tracking. In: 2020 IEEE International Conference on Robotics and Automation (ICRA). Piscataway, NJ, USA: IEEE; 2020. p. 651–7.
30. Piga NA, Pattacini U, Natale L. A differentiable extended Kalman filter for object tracking under sliding regime. Front Rob AI. 2021;8:686447. doi:10.3389/frobt.2021.686447.
31. Julier SJ, Uhlmann JK. Unscented filtering and nonlinear estimation. Proc IEEE. 2004;92(3):401–22. doi:10.1109/jproc.2003.823141.
32. Zhang G, Yin J, Deng P, Sun Y, Zhou L, Zhang K. Achieving adaptive visual multi-object tracking with unscented Kalman filter. Sensors. 2022;22(23):9106. doi:10.3390/s22239106.
33. Mohamed A, Schwarz K. Adaptive Kalman filtering for INS/GPS. J Geodesy. 1999;73(4):193–203. doi:10.1007/s001900050236.
34. Li J, Xu X, Jiang Z, Jiang B. Adaptive Kalman filter for real-time visual object tracking based on autocovariance least square estimation. Appl Sci. 2024;14(3):1045. doi:10.3390/app14031045.
35. Jung H, Kang S, Kim T, Kim H. ConfTrack: Kalman filter-based multi-person tracking by utilizing confidence score of detection box. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. Piscataway, NJ, USA: IEEE; 2024. p. 6583–92.
36. Fiyad HMN, Metwally HMB, El-Hameed M, Abozied M. Improved real time target tracking system based on cam-shift and Kalman filtering techniques. J Appl Res Technol. 2023;21(2):297–308. doi:10.22201/icat.24486736e.2023.21.2.1565.
37. Malinowski M, Kwiecień J. Study of the effectiveness of different Kalman filtering methods and smoothers in object tracking based on simulation tests. Rep Geodesy Geoinform. 2014;97(1):1–22. doi:10.2478/rgg-2014-0008.
38. Mures OA, Taibo J, Padrón EJ, Iglesias-Guitian JA. PlayNet: real-time handball play classification with Kalman embeddings and neural networks. Vis Comput. 2024;40(4):2695–711. doi:10.1007/s00371-023-02972-1.
39. Monkam GF, De Lucia MJ, Bastian ND. A topological data analysis approach for detecting data poisoning attacks against machine learning based network intrusion detection systems. Comput Secur. 2024;144:103929. doi:10.2139/ssrn.4651812.
40. Razmarashooli A, Chua YK, Barzegar V, Salazar D, Laflamme S, Hu C, et al. Real-time state estimation of nonstationary systems through dominant fundamental frequency using topological data analysis features. Mech Syst Signal Process. 2025;224(2):112048. doi:10.1016/j.ymssp.2024.112048.
41. Bois A, Tervil B, Oudre L. Topological data analysis for unsupervised anomaly detection in time series. In: 2024 32nd European Signal Processing Conference (EUSIPCO). Piscataway, NJ, USA: IEEE; 2024. p. 1197–201.
42. Weber ES, Harding SN, Przybylski L. Detecting traffic incidents using persistence diagrams. Algorithms. 2020;13(9):222. doi:10.3390/a13090222.
43. Esteve M, Falcó A. Trajectory classification through topological data analysis perspectives. IEEE Access. 2025;13:32458–69.
44. Indah D, Mwakalonge J, Comert G, Siuhi S, Musau H, Osei E, et al. Topological data analysis for driver behavior classification driven by vehicle trajectory data. Mach Learn Appl. 2025;21(1):100719. doi:10.1016/j.mlwa.2025.100719.
45. Barberi LAA, Cave LMD. Topological data analysis for unsupervised anomaly detection and customer segmentation on banking data. arXiv:2508.14136. 2025.
46. Pradhan T, Athukuri J, Surendar A, Rajan C. Topological methods in machine learning and data analysis: a mathematical perspective. Panamerican Math J. 2025;35(2):758–71. doi:10.52783/pmj.v35.i2s.3340.
47. Tian Y, Ye Q, Doermann D. YOLOv12: attention-centric real-time object detectors. arXiv:2502.12524. 2025.
48. Jocher G, Chaurasia A, Qiu J. Ultralytics YOLOv8. 2023 [cited 2026 May 20]. Available from: https://github.com/ultralytics/ultralytics.
49. Kıratlı R, Eroğlu A. Mathematical modeling and evaluation of extended Kalman filter-based multi-object tracking for UAV applications. In: The 1st International Conference on Mathematics and Applied Data Science (ICMADS’25); 2025 Aug 29–31; Konya, Türkiye.
50. Adams H, Emerson T, Kirby M, Neville R, Peterson C, Shipman P, et al. Persistence images: a stable vector representation of persistent homology. J Mach Learn Res. 2017;18(8):1–35.
51. Lin TY, Maire M, Belongie S, Hays J, Perona P, Ramanan D, et al. Microsoft COCO: common objects in context. In: European Conference on Computer Vision. Cham, Switzerland: Springer; 2014. p. 740–55.
52. Du J. Understanding of object detection based on CNN family and YOLO. J Phys Conf Ser. 2018;1004:012029. doi:10.1088/1742-6596/1004/1/012029.
53. Fan H, Du D, Wen L, Zhu P, Hu Q, Ling H, et al. VisDrone-MOT2020: the vision meets drone multiple object tracking challenge results. In: Bartoli A, Fusiello A, editors. Computer Vision—ECCV 2020 Workshops. Cham, Switzerland: Springer International Publishing; 2020. p. 713–27.
54. Yu H, Li G, Zhang W, Huang Q, Du D, Tian Q, et al. The unmanned aerial vehicle benchmark: object detection, tracking and baseline. Int J Comput Vis. 2020;128(5):1141–59. doi:10.1007/s11263-019-01266-1.
55. Natha S. Comprehensive dataset for detecting road anomalies in diverse real-world situations. Zenodo. 2024. doi:10.5281/zenodo.13832363.
56. Yao Y, Wang X, Xu M, Pu Z, Wang Y, Atkins E, et al. DoTA: unsupervised detection of traffic anomaly in driving videos. IEEE Trans Pattern Anal Mach Intell. 2023;45(1):444–59. doi:10.1109/tpami.2022.3150763.
57. Chandrashekhar A, Satyanarayana B, Gorrepati RR, Vasanthi P. An efficient YOLOv12-based framework for detecting extremely small-scale objects. Sci Rep. 2025;16(1):2062. doi:10.1038/s41598-025-31803-7.
58. Deluxni N, Sudhakaran P. Underwater debris detection using YOLOv12 with enhanced feature extraction using R-ELAN and FlashAttention network. Results Eng. 2025;28(15):107282. doi:10.1016/j.rineng.2025.107282.
59. Song J, Wang Z, Liu Q, He X. Remote state estimation for nonlinear systems under compression-decompression mechanism: a modified unscented Kalman filtering approach. IEEE Trans Autom Control. 2026;71(1):91–106. doi:10.1109/tac.2025.3589276.
60. Mcdougall RJ, Godsill SJ. Target tracking using a time-varying autoregressive dynamic model. IEEE Open J Signal Process. 2025;6:147–55. doi:10.1109/ojsp.2025.3528896.
61. Teiko Teye M, Maoz O, Rottmann M. FutrTrack: a camera-LiDAR fusion transformer for 3D multiple object tracking. arXiv:2510.19981. 2025.
62. Dong H, Tuo H, Wang L, Zhou H. MonoFHD: leveraging flight height data for UAV monocular 3D object detection. Aerospace Syst. 2026;33:8851. doi:10.1007/s42401-025-00437-y.
Copyright © 2026 The Author(s). Published by Tech Science Press. This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

