A Hybrid Self-Supervised Learning Framework for Advanced Persistent Threat Detection

Marwan Ali Albahar^*
Department of Computing, College of Engineering and Computing in Al-Lith, Umm Al-Qura University, Makkah, Saudi Arabia
* Corresponding Author: Marwan Ali Albahar. Email: email
(This article belongs to the Special Issue: Cyber Attack Detection in Cyber-Physical Systems)

Computers, Materials & Continua https://doi.org/10.32604/cmc.2026.079941

Received 31 January 2026; Accepted 13 April 2026; Published online 27 April 2026


Abstract

Advanced Persistent Threats (APTs) are stealthy cyberattacks that can evade detection in system-level audit logs. Provenance graphs encode these logs as interacting entities and events, exposing causal and dependency structure that is often obscured in linear representations. Prior provenance-based detectors typically apply anomaly detection over such graphs, yet they frequently incur high false-positive rates and produce coarse-grained alerts; moreover, approaches that depend heavily on node-specific identifiers (e.g., file paths) can learn spurious correlations, reducing robustness and limiting reliability across heterogeneous workloads. In this paper, we present the Self-Training Adaptive Graph Encoder (STAGE), a lightweight, self-supervised anomaly detection framework for provenance graphs that (i) trains without attack labels and (ii) enforces leakage-free model selection and thresholding with explicit control over false-alarm rates. STAGE uses learnable degree and node-type embeddings, processed by a compact two-layer Graph Convolutional Network (GCN) with residual connections and dual pooling. A memory-augmented attention module captures global benign prototypes, improving resilience to rare-but-legitimate behaviors and suppressing false alarms. Training combines contrastive learning over augmented graph views with a one-class Support Vector Data Description (SVDD) objective that learns a compact benign hypersphere in the embedding space. At inference, STAGE fuses neural embeddings with fixed-dimensional structural graph statistics and scores them using an ensemble of classical one-class detectors. STAGE attains strong ranking quality and practical operating points on two benchmarks, the StreamSpot and Wget datasets. On the StreamSpot dataset, STAGE achieves an AUC of 0.998, operating at 95% recall with a 0% false positive rate. On the Wget dataset, it attains an AUC of 0.998 and an average precision of 0.998, achieving 100% recall and 96% precision at a 4% false positive rate. Overall, STAGE demonstrates strong empirical separability for benign-only provenance-based detection and provides an explicit mechanism to trade off recall and false positive rate through predefined thresholding policies.
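
The thresholding idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify STAGE's actual thresholding policies, so the following is only an assumption: a threshold is calibrated on held-out benign anomaly scores so that the fraction of benign graphs flagged does not exceed a chosen false positive rate. The function names `threshold_for_fpr` and `flag`, and the convention that a higher score means more anomalous, are hypothetical and not taken from the paper.

```python
import math


def threshold_for_fpr(benign_scores, target_fpr):
    """Pick a threshold from held-out benign scores (hypothetical helper).

    Assumes higher score = more anomalous. At most `target_fpr` of the
    given benign scores will lie strictly above the returned threshold.
    """
    if not benign_scores:
        raise ValueError("need at least one benign validation score")
    if not 0.0 <= target_fpr < 1.0:
        raise ValueError("target_fpr must be in [0, 1)")
    ordered = sorted(benign_scores)
    n = len(ordered)
    # Number of benign graphs we tolerate above the threshold.
    allowed = math.floor(target_fpr * n)
    return ordered[n - allowed - 1]


def flag(scores, threshold):
    """Flag a graph as anomalous when its score is strictly above the threshold."""
    return [s > threshold for s in scores]


if __name__ == "__main__":
    # Synthetic benign validation scores, for illustration only.
    benign = [i / 100 for i in range(100)]
    t = threshold_for_fpr(benign, 0.04)
    print("threshold:", t)
    print("benign graphs flagged:", sum(flag(benign, t)))
```

Because the threshold is fixed using benign data only, no attack labels enter the calibration step, which is in the spirit of the leakage-free selection the abstract describes. Lowering `target_fpr` raises the threshold and typically reduces recall on attacks.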

Keywords

Provenance graphs; advanced persistent threats; benign-only anomaly detection; self-supervised learning; false positive control




