Open Access
ARTICLE
PIF-Identifier: Accurate Low-Overhead Identification of Persistent Infrequent Flows in Network Traffic
1 School of Computer Science and Technology, Changsha University of Science and Technology, Changsha, China
2 School of Information Science and Engineering, Changsha Normal University, Changsha, China
* Corresponding Author: Jinyuan Zhao. Email:
Computers, Materials & Continua 2026, 88(1), 25 https://doi.org/10.32604/cmc.2026.078464
Received 31 December 2025; Accepted 28 February 2026; Issue published 08 May 2026
Abstract
Persistent Infrequent Flows (PIFs) refer to the packet flows that last for a long time but always at low frequencies in network traffic. Accurate identification of the PIFs plays a vital role in intrusion detection, attack prevention, traffic engineering, and other network fields. However, existing methods often require to save all flows for finding out the PIFs due to their infrequency feature, which brings about the problem of low identification accuracy and high memory overhead. To solve this problem, this paper proposes an accurate PIF identification method with low overhead called PIF-Identifier, composed of a new-flow discriminator and a PIF tracker. Specifically, we first design a compact new-flow discriminator by applying probabilistic data structures, to quickly determine whether a packet flow arrives for the first time within current time window. Then we design a PIF tracker to accurately identify and report persistent infrequent flows. In the PIF tracker, we configure a small-size frequency counter for each tracked flow in accordance with the frequency threshold of the PIF, without sacrificing the accuracy of PIF identification. Furthermore, we design a probabilistic replacement strategy based on the number of time windows of flow persistence, to accommodate newly arrived potential PIFs when there is no vacancy in their mapped buckets of the PIF tracker. Finally, we evaluate the performance of our proposed PIF-Identifier by theoretical analysis and experimental verification with real network traffic traces. Experimental results indicate that the PIF-Identifier achieves the precision of 100%, much higher recall rate and F1 score, as well as lower average relative error than the state-of-the-art methods, significantly promoting the identification performance of persistent infrequent flows.Keywords
Network traffic measurement provides key information for a variety of network management tasks, including traffic behavior analysis, QoS (quality of service), performance diagnosis, and anomaly detection, etc. [1,2]. It has become an indispensable method to effectively understand and manage network performance. Until now, a popular measurement method is to apply the compact data structure Sketch to perform approximate estimation tasks in network measurement, such as elephant flow identification [3], flow frequency estimation [4]. In recent years, researchers have begun to focus on other important characteristics of packet flows, such as liveness [5,6], persistence [7], and bursty [8], steadiness [9] and persistent infrequence [10]. Persistent Infrequent Flows (PIFs) refer to the flows where packets arrive persistently but at low frequency. The identification of PIFs has widespread applications in the field of network security, such as the detection of advanced persistent threats (APT) [11,12] and low-rate DDoS (distributed denial of service) attacks [13,14]. Therefore, it is crucial to efficiently identify and accurately report PIFs for supporting intrusion detection, attack prevention, traffic engineering, and other fields.
A straightforward PIF identification approach is to find the intersection of persistent flows and infrequent ones. Inspired by elephant flow identification, many researchers usually measure persistent flows by applying sketched-based methods, such as PIE [15] and On-Off Sketch [16], for low memory overhead. Meanwhile, existing methods estimate flow frequency usually by improving Count-Min (CM) Sketch [17], such as Elastic Sketch [18] and Augmented Sketch [19]. These methods primarily focus on the identification of frequent flows rather than infrequent ones. Up to now, PISketch [10] is the first method that combines and optimizes the above two tasks, and has achieved good performance in identifying PIFs. However, this method has to allocate large bits for its weight counters, to accommodate varying number of packets in various flows, resulting in excessive memory overhead. In addition, the method merely outputs PIFs at the end of whole measurement process, which can not report the specific time windows of flow persistence, and may misjudge frequent flows as infrequent ones. To address above problems, this paper propose a low-overhead identification method of persistent infrequent flows called PIF-Identifier, with main contributions summarized as follows:
• Proposing a low-overhead identification method of persistent infrequent flows called PIF-identifier, which designs a compact new-flow discriminator to quickly determines whether a flow appears for the first time in current time window, and a PIF tracker to track potential PIFs and report real ones.
• Designing a minimal-size configuration strategy of flow frequency counters in the PIF tracker, which configures a minimal size for each counter based on the frequency threshold of infrequent flows, reducing the memory overhead of PIF-identifier.
• Devising a probabilistic replacement strategy to accommodate newly arrived potential PIFs when there is no vacancy in their mapped buckets, enhancing the identification accuracy of persistent infrequent flows.
• Proving that PIF-Identifier has no false positive error and deriving an upper bound on its probability of false negative error, for confirming its superior performance on identifying persistent infrequent flows.
Sketch-based data structures are widely used for flow frequency estimation, achieving high accuracy and speed with limited memory. The most typical Sketch algorithm is CM Sketch [17] which increments all mapping counters of arriving packet by one, and returns the minimum of these counters as the estimated size of the flow to which the packet belongs. To improve accuracy, Conservative Update(CU) Sketch [20] increments only the smallest mapped counter, but frequent hash collisions can still lead to overestimation. To address the inherent skewness of network traffic, researchers have conducted various research on data structures and statistical strategies. For example, Cold Filter [21] employs a two-layer filter to pre-filter a large number of small flows, reducing hash collisions and improving estimation accuracy. HeavyGuardian [22] adopts an exponentially attenuated decay strategy to aggressively purge small flows, enhancing the frequency estimation of large ones. Meanwhile, Elastic Sketch [18] separates large and small flows into heavy and light parts and utilizes voting strategy to expel small flows to light part in time, achieving high frequency estimation accuracy of large flows and small ones.
Several Sketch-based algorithms have been developed for flow persistence estimation. Small Space [23] adopts the idea of “sampling and counting” to track the persistence of flows by maintaining a hash table. Although sampling reduces space overhead, Small Space still records many non-persistent items, resulting in suboptimal memory efficient. To address this issue, PIE [15] encodes each flow ID into multiple Raptor code segments and randomly stores a subset in each space-time Bloom filter to reduce memory overhead. However, its memory requirements increase with the growing number of time windows, and high decoding complexity prevents line-rate processing. On-off Sketch [16] sets a switch flag for each counter to control whether each mapped counter increases its value, ensuring that each counter increases by at most once per time window. Although it avoids underestimation, hash conflicts can still lead to overestimation of flow persistence. On this basis, P-Sketch [24] introduces the concept of flow heat and proposes a probabilistic replacement algorithm based on the number of duration window and heat. This strategy prioritizes the retention of highly persistent flows, improving identification accuracy of persistent flows.
The above mentioned methods have achieved good performance in estimating flow frequency and persistence separately, but none of them can directly identify persistent infrequent flows. As a representative solution of identifying PIFs, PISketch [10] defines a flow weight that comprehensively reflects the persistence and frequency of each flow. The weight is adjusted through a reward and punishment mechanism, and the flows with higher weight are reported as PIFs. However, weight counters often require a large number of bits to accommodate flows with various sizes, resulting in excessive memory overhead. At the same time, PISketch adopts measurement results in all time windows for overall judgment, which ignores interrupt arrivals and high-frequency phenomenon in some time windows, resulting in misjudgments of PIFs. In addition, it can only report the number of time windows with flow persistence for a PIF, but not specific starting and ending time windows.
Different from the aforementioned solutions, this paper further clarifies the criterion of PIF as a certain number of consecutive time windows with infrequent packet arrivals. According to this criterion, we limit the size of frequency counter for each potential PIF in terms of its frequency threshold. If the frequency of a potential PIF reaches the threshold, we will not increase its frequency counter as it no longer satisfies infrequency characteristics. Furthermore, we record starting and ending time windows to obtain the number of consecutive time windows, for each potential PIF maintaining infrequency. Subsequently, we design a probabilistic replacement strategy based on the number of consecutive time windows, to accommodate new potential PIFs when there is no vacancy in their mapped buckets. With these design concepts, we propose a low-overhead method that accurately identifies persistent infrequent flows called PIF-Identifier. Fig. 1 illustrates the data structure of our proposed PIF-Identifier, comprised of a new-flow discriminator and a PIF tracker. Table 1 shows the symbols commonly used in the PIF-Identifier and their meanings.

Figure 1: The data structure of PIF-Identifier [1,2].

As shown in Fig. 1, the new-flow discriminator is a Bloom filter consisting of L counters. For each arrived packet, it is mapped to
The PIF tracker is a hash table composed of M hash buckets
Packet processing: When a packet
Case 1: The flow
Case 2: The flow
Case 3: The flow
Case 4: The flow
Case 5: The flow
Examples: Fig. 2 shows some examples of packet processing. Suppose the frequency threshold

Figure 2: The examples of flow insertion [1–3].
Clearing and reporting: At the end of each time window, we check each stored flow in the PIF tracker. As for the flow
Theorem 1: Given that measured network traffic consists of
We first analyze the packet processing of the PIF-Identifier. It can be inferred from Section 3.3 that: (1) The PIF-Identifier does not generate false negative errors in Case1 and Case3, because the new-flow discriminator reports negative and Bloom filters do not have false negative errors. (2) In Case2, we will overestimate the frequency of a real PIF if there is a false positive error for the new-flow discriminator. (3) In Case4, false negative errors can also occur in the PIF-Identifier when flow replacement operations evict real PIFs. (4) In Case5, the PIF-Identifier does not produce a false positive error, and its false negative error rate is negligible because there is still a chance for a real PIF to perform a flow replacement in current time window.
As seen from the above analysis, the PIF-Identifier has no false positive errors, while false negative errors are mainly generated in Case2 and Case4. Then we analyze the probability of false negative errors in these two cases. Suppose the new-flow discriminator has L counters initialized to 0, we can express the probability that a counter in the new-flow discriminator is not mapped by a packet through
Therefore, we can derive the false positive error rate of the new-flow discriminator as:
Since there are
As for Case4, we can obtain the probability of a PIF being incorrectly replaced as
In summary, we can infer the false negative error rate of PIF-Identifier
Time complexity analysis: For each arrived packet, the new-flow discriminator needs to perform
Datasets: We select three high-speed network traffic traces TRACE20130222, TRACE20141109 and TRACE20180308 released by Jiangsu Province Key Laboratory of Computer Network Technology as experimental datasets. The three traces were collected from the Jiangsu Province border 10 Gbps backbone link of China Education and Research Network on 22 February 2013, 09 November 2014 and 06 December 2018, respectively. Specifically, the TRACE20130222, TRACE20141109 and TRACE20180308 have a total of 2,991,496, 3,064,040 and 3,063,134 packets respectively, belonging to 419,447, 302,279 and 212,782 flows. As for each packet in the dataset, we extract its timestamp, and flow ID composed of 5 tuples (source IP address and port number, destination IP address and port number, and protocol type).
Performance metrics:
1) Precision: The proportion of true PIFs in all PIFs reported by each PIF identification method.
2) Recall: The proportion of all true PIFs correctly reported by each PIF identification method.
3) F1 Score:
4) Average Relative Error (ARE):
Implementation: We implement PISketch, On-Off/CM Sketch and our proposed PIF-Identifier by C++ programming in this paper, and conduct comparative experiments on a physical machine with dual 6-core CPU (24 threads, Intel Xeon Silver 4214R @2.4 GHz) and 32 GB DRAM. For the PISketch, memory is allocated for its Bloom filter and weight Sketch with the ratio 1:9. The Bloom filter adopts three hash functions, while the weight Sketch sets the number of cells per bucket to 5. As for the On-Off/CM Sketch, we combine an On-Off Sketch for estimating flow persistence and a CM Sketch for estimating flow frequency to identify PIFs. Specifically, memory is allocated for them with the ratio 1:3. Meanwhile, we set the number of rows in the CM Sketch and the number of cells in each bucket of the On-Off Sketch both to 3. As for our proposed PIF-Identifier, memory is split between the new-flow discriminator and the PIF tracker with the ratio 1:9. The new-flow discriminator employs three hash functions, and the PIF tracker is configured with 5 cells per bucket. In our experiments, we set the frequency threshold
With the above configuration, we perform each PIF identification method on three datasets. Each method reports a set of PIFs, along with their measured number of consecutive time windows. We obtain the number of correctly identified PIFs by verifying reported results with the ground truth. Finally, we compute the performance metrics of each PIF identification method under identical memory constraints.
This subsection evaluates the impact of key parameters for PIF-Identifier on its performance. The parameters include the frequency threshold
Impact of the number of cells per bucket N: We vary N from 5 to 25 while holding other parameters constant. Fig. 3 shows the effect of varying N on the performance of PIF-Identifier. As seen from Fig. 3, the F1 score of PIF-Identifier achieves the maximum value at N = 5 for different memory sizes. The F1 score of PIF-Identifier shows a downward trend overall as N increases, while its ARE is stable overall. This is because the number of buckets in PIF-Identifier decreases as the number of cells per bucket increases. At this moment, the probability of a hash collision increases when PIFs are mapped to an identical bucket, leading to performance degradation. Therefore, we set N to 5.

Figure 3: Effect of N. (a) F1 Score. (b) Average Relative Error(ARE).
Impact of the memory allocation ratio of the new-flow discriminator to the PIF tracker: We vary the proportion of memory allocated to the new-flow discriminator while keeping other parameters fixed. Fig. 4 shows the effect of memory allocation proportion of the new-flow discriminator on the performance of the PIF-Identifier. As shown in Fig. 4, the F1 score of the PIF-Identifier gradually decreases as the memory allocation proportion of the new-flow discriminator increases, when the total memory is 100 KB. However, the F1 score of PIF-Identifier first increases and then decreases, with the increasing memory allocation proportion of the new-flow discriminator, for the total memory of 150 KB and 200 KB. This is because the increase of the memory allocation proportion of the new-flow discriminator means the reduction on the number of buckets in the PIF tracker. This will increase the hash collisions and reduce the measurement accuracy of the PIF tracker. In summary, we set the memory allocation ratio of the new-flow discriminator to the PIF tracker as 1:9 in consideration of memory constraint.

Figure 4: Effect of memory allocation. (a) F1 Score. (b) ARE.
Impact of the frequency threshold

Figure 5: Effect of
Impact of the number threshold of consecutive time windows

Figure 6: Effect of
Advanced Persistent Threats (APTs) [25] typically maintain stealth command and control channels, by low-frequency periodic beaconing with the intervals ranging from tens of seconds to several hours. This results in extremely low packet rates often below one packet per second (pps), while the channels may persist for weeks or even months. In our experiments, setting
4.3 Performance of PIF Identification
In this subsection, we conduct extensive experiments on three datasets to compare the performance of PIF-Identifier with existing PIF identification methods. We evaluate the performance of each method by precision, recall rate, F1 score and ARE. In the experiments, we increase the memory capacity of each method from 100 to 300 KB while keeping other parameters constant.
Precision: Fig. 7 illustrates the precision of each PIF identification method on different datasets. As seen from Fig. 7, PIF-Identifier consistently achieves higher precision than other existing methods for identical memory capacity. Specifically, the precision of PIF-Identifier consistently achieves 100% under different memory capacity on three datasets, which aligns with its zero false-positive rate by theoretical proof in Section 4. In contrast, the precision of existing methods depends on their memory capacity. Specifically, the precision of PISketch decreases to 81% and 63.8% in the dataset 20141108, respectively for the memory capacity 100 and 300 KB. On-Off/CM Sketch has the lowest precision in three datasets and its precision gradually decreases with the increasing memory capacity. Taking the dataset 20130222 for example, its precision decreases to 60% and 43.5%, respectively for the memory capacity 100 and 300 KB. Overall, PIF-Identifier outperforms the other two methods, improving the precision by 9.5% over PISketch and 46.2% over On-Off/CM Sketch under identical memory capacity.

Figure 7: Precision. (a) TRACE20130222; (b) TRACE20141108; (c) TRACE20180308.
Recall: Fig. 8 shows the recall rate of each PIF identification method on different datasets. As shown in Fig. 8, the recall rate of each method exhibits a growth trend with the increase of memory capacity, while PIF-Identifier achieves highest recall rates on all three datasets. Specifically for the dataset 20130222, PIF-Identifier achieves a recall rate of 87.7% with the memory capacity 100 KB, while PISktech and On-Off/CM Sketch achieve recall rate of 32.3% and 59.9%, respectively. when the memory capacity is increased to 300 KB, their recall rates are 96.7%, 53.8%, and 51.5%, respectively. For the dataset 20141108, their recall rates decrease to 88.5%, 40.7%, and 19.8% respectively with the memory capacity 100 KB, and increase to 97.4%, 74.8%, and 51.3% respectively when the memory capacity is increased to 300 KB. In summary, our proposed PIF-Identifier achieves the highest recall rate with the promotion of 52.3% and 60.5%, respectively compared to PISketch and On-Off/CM Sketch.

Figure 8: Recall. (a) TRACE20130222; (b) TRACE20141108; (c) TRACE20180308.
F1 Score: Fig. 9 exhibits the F1 score of each PIF identification method on different datasets. As seen from Fig. 9, the F1 score of PIF-Identifier consistently outperform those of other methods on all three datasets, above 0.85 even with small memory capacity. Specifically, PIF-Identifier achieves a F1 score of 0.934 with the memory capacity 100 KB on the dataset 20130222, while PISketch and On-Off/CM Sketch only obtain 0.488 and 0.339, respectively. When the memory capacity is increased to 300 KB, the F1 score of PIF-Identifier reaches up to 0.983, while PISketch and On-Off/CM Sketch only obtain 0.699 and 0.471, respectively. On the 20180308 dataset, PIF-Identifier achieves a F1 score of 0.933 with the memory capacity 100 KB. When the memory capacity is 300 KB, PIF-Identifier achieves a F1 score of 0.979, while PISketch and On-Off/CM Sketch only obtain 0.294 and 0.381, respectively. In summary, PIF-Identifier achieves the highest F1 score on all three datasets, outperforming PISketch and On-Off/CM Sketch respectively by 46% and 60.1% on average.

Figure 9: F1 score. (a) TRACE20130222; (b) TRACE20141108; (c) TRACE20180308.
ARE: Fig. 10 illustrates the log10ARE value of each PIF identification method on different datasets. As seen from Fig. 10, the log10ARE value of each method decrease gradually with the increase of memory capacity, and PIF-Identifier achieves relatively lower log10ARE values on all three datasets. Taking the dataset 20180308 for example, the log10ARE value of PIF-Identifier is −1.521 for the memory capacity 100 KB, while PISketch and On-Off/CM Sketch is −0.212 and 0.945, respectively. When the memory capacity is increased to 300 KB, the log10ARE of PIF-Identifier is −1.804, while the log10ARE of PISketch and On-Off/CM Sketch is −0.213 and 0.455, respectively. Overall, the log10ARE of PIF-Identifier is 136 and 2374 times respectively lower than those of PISketch and On Off/CM Sketch on average.

Figure 10: ARE vs. memory size. (a) TRACE20130222; (b) TRACE20141108; (c) TRACE20180308.
4.4 Result Analysis and Discussion
The above experimental results demonstrate the superiority of PIF-Identifier in identifying PIFs. We next analyze the reasons for its better results than existing PIF identification methods.
On-Off/CM Sketch employs two data structures to track the persistence and low frequency. Furthermore, it adopts a simple replacement strategy to replace persistent flows, which leads many non-persistent flows to be misidentified as persistent ones, reducing the identification precision of PIF. At the same time, both structures need massive memory, and CM sketch has overestimation error. Therefore, each structure has a fewer number of counters in the case of identical memory capacity, leading to frequent hash collisions. Furthermore, infrequent flows may be mistakenly identified as frequent ones, resulting in the missed identification of PIFs. As a result, the On-Off/CM Sketch has poor performance in identifying PIFs, especially under low memory capacity.
PISketch designs a weight to integrate persistence with low frequency, significantly reducing memory overhead compared to On-Off/CM Sketch. Therefore, it achieves relatively better performance in the case of low memory capacity. However, it brings about limited performance improvement, because its weight counter still needs to allocate more bits to record packet flows with various sizes. At the same time, PISketch ignores interruptions and high frequency phenomenons of recorded flows, leading to the misidentification of non-PIFs. In addition, PISketch reports PIFs based on flow weights, and cannot promptly clear out the flows those do not meet the conditions of PIFs. Therefore, it may not be able to store real PIFs, resulting in the miss of PIF identification. In summary, PISketch achieves insufficient precision and relatively large error.
In our PIF definition, we limit the packet frequency of PIFs in each time window within
It is a crucial task to identify persistent infrequent flows in network traffic measurements, but existing identification methods suffer from high memory overhead and low accuracy. To solve this problem, this paper proposes a low-overhead method for accurate identification of persistent infrequent flows, called PIF-Identifier. It adopts the minimal-size configuration strategy of flow frequency counters to design a compact PIF tracker, which greatly reduces memory overhead. At the same time, it utilizes fast cleanup mechanism to clear out the flows that did not meet PIF definition, which effectively avoids the false positives of PIFs. Additionally, it employs probabilistic replacement strategy to make room for potential PIFs when there is no space in the PIF tracker, significantly enhancing the accuracy and efficiency of PIF identification.
We prove that the PIF-Identifier has superior performance on identifying persistent infrequent flows through mathematical analysis and comparative experiments. Experimental results demonstrate that PIF-Identifier outperforms existing methods with a precision of 100% and a recall rate of over 95%. Moreover, PIF-Identifier can achieve an F1 score higher than 0.9 even with the memory capacity 100 KB. At the same time, the error rate of PIF-Identifier is reduced by two orders of magnitude on average compared with existing methods. In summary, the PIF-identifier achieves high precision and memory efficiency in PIF identification, with good adaptability to network environments with limited memory resources.
In our future work, we will explore the pre-filtering strategy and adaptive capacity expansion mechanism for PIF-Identifier. Furthermore, we also intend to improve identification accuracy and adaptability to network traffic jitter and even malicious attacks. In the future, we also plan to deploy PIF-Identifier into various network applications, such as intrusion detection systems, network management platforms, etc.
Acknowledgement: The authors would like to thank Changsha University of Science and Technology.
Funding Statement: This work was supported in part by Hunan Provincial Natural Science Foundation of China (2023JJ30053, 2026JJ81174), Scientific Research Fund of Hunan Provincial Education Department (22A0232, 23A0735).
Author Contributions: Bing Xiong: Conceptualization, Writing—review and editing, Funding acquisition, Resources. Zhuoxiong Li: Formal analysis, Writing—original draft, Validation. Yongqing Liu: Methodology, Visualization. Yu Tang: Software, Data curation. Jinyuan Zhao: Project administration, Funding acquisition, Investigation, Supervision. All authors reviewed and approved the final version of the manuscript.
Availability of Data and Materials: Data is available upon request from the authors.
Ethics Approval: Not applicable.
Conflicts of Interest: The authors declare no conflicts of interest.
References
1. Zhou Z, Abawajy J. Reinforcement learning-based edge server placement in the intelligent internet of vehicles environment. IEEE Trans Intell Transp Syst. 2025. doi:10.1109/TITS.2025.3557259. [Google Scholar] [CrossRef]
2. Kirubavathi G, Pulliyasseri A, Rajesh A, Ajayan A, Alfarhood S, Safran M, et al. Enhancing IoT resilience at the edge: a resource-efficient framework for real-time anomaly detection in streaming data. Comput Model Eng Sci. 2025;143(3):3005–31. doi:10.32604/cmes.2025.065698. [Google Scholar] [CrossRef]
3. Ye J, Li L, Zhang W, Chen G, Shan Y, Li Y, et al. Ua-sketch: An accurate approach to detect heavy flow based on uninterrupted arrival. In: Proceedings of the 51st International Conference on Parallel Processing (ICPP ’22); 2022 Aug 29–Sep 1; Bordeaux, France. New York, NY, USA: ACM; 2022. p. 1–11. doi:10.1145/3545008.3545017. [Google Scholar] [CrossRef]
4. Shahout R, Sabek I, Mitzenmacher M. Learning-augmented frequency estimation in sliding windows. In: 2024 IEEE 32nd International Conference on Network Protocols (ICNP); 2024 Oct 28–31; Charleroi, Belgium. Piscataway, NJ, USA: IEEE; 2024. p. 1–6. doi:10.1109/ICNP61940.2024.10858536. [Google Scholar] [CrossRef]
5. Xiong B, Chang Y, Zhang Y, Zhang J, Zhao B, Li K. AF-detector: an accurate low-overhead method for detecting active flows in network traffic. Comput Netw. 2025;270(C):111562. doi:10.1016/j.comnet.2025.111562. [Google Scholar] [CrossRef]
6. Xiong B, Liu Y, Liu R, Zhao J, He S, Zhao B, et al. ActiveGuardian: an accurate and efficient algorithm for identifying active elephant flows in network traffic. J Netw Comput Appl. 2024;224:103853. doi:10.1016/jjnca.2024.103853. [Google Scholar] [CrossRef]
7. Xiao Q, Li Y, Wu Y. Finding recently persistent flows in high-speed packet streams based on cuckoo filter. Comput Netw. 2023;237:110097. doi:10.1016/j.comnet.2023.110097. [Google Scholar] [CrossRef]
8. Xiong B, Xiao Q, Huang S, Cao J, Zhao B, Li K. BurstDetector: an accurate low-overhead detection algorithm for burst flows in network traffic. In: Proceedings of the 2025 IEEE International Symposium on Parallel and Distributed Processing with Applications; 2025 Oct 10–12; Shenyang, China. Piscataway, NJ, USA: IEEE; 2025. p. 642–9. doi:10.1109/ISPA67752.2025.00088. [Google Scholar] [CrossRef]
9. Li X, Fan Z, Li H, Zhong Z, Guo J, Long S, et al. SteadySketch: finding steady flows in data streams. In: 2023 IEEE/ACM 31st International Symposium on Quality of Service (IWQoS); 2023 Jun 19–21; Orlando, FL, USA. Piscataway, NJ, USA: IEEE; 2023. p. 1–9. doi:10.1109/IWQoS57198.2023.10188743. [Google Scholar] [CrossRef]
10. Fan Z, Hu Z, Wu Y, Guo J, Liu W, Yang T, et al. Pisketch: finding persistent and infrequent flows. In: Proceedings of the ACM SIGCOMM Workshop on Formal Foundations and Security of Programmable Network Infrastructures; 2022 Aug 22–26; Amsterdam, The Netherlands. New York, NY, USA: ACM; 2022. doi:10.1145/3528082.3544834. [Google Scholar] [CrossRef]
11. Alshamrani A, Myneni S, Chowdhary A, Huang D. A survey on advanced persistent threats: techniques, solutions, challenges, and research opportunities. IEEE Commun Surv Tutor. 2019;21(2):1851–77. doi:10.1109/COMST.2019.2891891. [Google Scholar] [CrossRef]
12. Khaleefa EJ, Abdulah DA. Concept and difficulties of advanced persistent threats (APTsurvey. Int J Nonlinear Anal Appl. 2022;13(1):4037–52. doi:10.22075/ijnaa.2022.6230. [Google Scholar] [CrossRef]
13. Xiao Q, Qiao Y, Zhen M, Chen S. Estimating the persistent spreads in high-speed networks. In: 2014 IEEE 22nd International Conference on Network Protocols; 2014 Oct 21–24; Raleigh, NC, USA. Piscataway, NJ, USA: IEEE; 2014. p. 131–42. doi:10.1109/ICNP.2014.33. [Google Scholar] [CrossRef]
14. Rios VD, Inácio PR, Magoni D, Freire MM. Detection and mitigation of low-rate denial-of-service attacks: a survey. IEEE Access. 2022;10:76648–68. doi:10.1109/ACCESS.2022.3191430. [Google Scholar] [CrossRef]
15. Dai H, Shahzad M, Liu AX, Li M, Zhong Y, Chen G. Identifying and estimating persistent items in data streams. IEEE/ACM Trans Netw. 2018;26(6):2429–42. doi:10.1109/TNET.2018.2865125. [Google Scholar] [CrossRef]
16. Zhang Y, Li J, Lei Y, Yang T, Li Z, Zhang G, et al. On-off sketch: a fast and accurate sketch on persistence. Proc VLDB Endow. 2020;14(2):128–40. doi:10.14778/3425879.3425884. [Google Scholar] [CrossRef]
17. Cormode G, Muthukrishnan S. An improved data stream summary: the count-min sketch and its applications. J Algor. 2005;55(1):58–75. doi:10.1016/j.jalgor.2003.12.001. [Google Scholar] [CrossRef]
18. Yang T, Jiang J, Liu P, Huang Q, Gong J, Zhou Y, et al. Elastic sketch: adaptive and fast network-wide measurements. In: Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication; 2018 Aug 20–25; Budapest, Hungary. New York, NY, USA: ACM; 2018. p. 561–75. doi:10.1007/s11704-016-6053-x. [Google Scholar] [CrossRef]
19. Roy P, Khan A, Alonso G. Augmented sketch: faster and more accurate stream processing. In: SIGMOD ’16: Proceedings of the 2016 International Conference on Management of Data; 2016 Jun 26–Jul 1; San Francisco, CA, USA. New York, NY, USA: ACM; 2016. p. 1449–63. doi:10.1145/2882903.2882948. [Google Scholar] [CrossRef]
20. Estan C, Varghese G. New directions in traffic measurement and accounting. In: Proceedings of the 2002 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications; 2002 Aug 19–23; Pittsburgh, PA, USA. New York, NY, USA: ACM; 2002. p. 323–36. doi:10.1145/633025.633056. [Google Scholar] [CrossRef]
21. Zhou Y, Yang T, Jiang J, Cui B, Yu M, Li X, et al. Cold filter: a meta-framework for faster and more accurate stream processing. In: Proceedings of the 2018 International Conference on Management of Data; 2018 Jun 10–15; Houston, TX, USA. New York, NY, USA: ACM; 2018. p. 741–56. doi:10.1145/3183713.3183726. [Google Scholar] [CrossRef]
22. Yang T, Gong J, Zhang H, Zou L, Shi L, Li X. Heavyguardian: separate and guard hot items in data streams. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2018 Aug 19–23; London, UK. New York, NY, USA: ACM; 2018. p. 2584–93. doi:10.1145/3219819.3219978. [Google Scholar] [CrossRef]
23. Lahiri B, Chandrashekar J, Tirthapura S. Space-efficient tracking of persistent items in a massive data stream. In: Proceedings of the 5th ACM International Conference on Distributed Event-Based System; 2011 Jul 11–15; New York, NY, USA. New York, NY, USA: ACM; 2011. p. 255–66. doi:10.1145/2002259.2002294. [Google Scholar] [CrossRef]
24. Li W, Patras P. P-sketch: a fast and accurate sketch for persistent item lookup. IEEE/ACM Trans Netw. 2023;32(2):987–1002. doi:10.1109/TNET.2023.3306897. [Google Scholar] [CrossRef]
25. Sharma A, Gupta BB, Singh AK, Saraswat VK. Advanced persistent threats (APTevolution, anatomy, attribution and countermeasures. J Ambient Intellig Human Comput. 2023;14(7):9355–81. [Google Scholar]
Cite This Article
Copyright © 2026 The Author(s). Published by Tech Science Press.This work is licensed under a Creative Commons Attribution 4.0 International License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Submit a Paper
Propose a Special lssue
View Full Text
Download PDF
Downloads
Citation Tools