
Open Access

ARTICLE

AFI: Blackbox Backdoor Detection Method Based on Adaptive Feature Injection

Simin Tang1,2,3,4, Zhiyong Zhang1,2,3,4,*, Junyan Pan1,2,3,4, Gaoyuan Quan1,2,3,4, Weiguo Wang5, Junchang Jing6
1 Information Engineering College, Henan University of Science and Technology, Luoyang, 471023, China
2 Henan International Joint Laboratory of Cyberspace Security Applications, Henan University of Science and Technology, Luoyang, 471023, China
3 Henan Intelligent Manufacturing Big Data Development Innovation Laboratory, Henan University of Science and Technology, Luoyang, 471023, China
4 Institute of Artificial Intelligence Innovations, Henan University of Science and Technology, Luoyang, 471023, China
5 Education Technology Department, New H3C Technologies Co., Ltd., Beijing, 100102, China
6 College of Computer and Information Engineering, Henan Normal University, Xinxiang, 453007, China
* Corresponding Author: Zhiyong Zhang. Email: email
(This article belongs to the Special Issue: Artificial Intelligence Methods and Techniques to Cybersecurity)

Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.073798

Received 25 September 2025; Accepted 05 December 2025; Published online 31 December 2025

Abstract

At inference time, deep neural networks are susceptible to backdoor attacks, which produce attacker-controlled outputs whenever an input contains a carefully crafted trigger. Existing defense methods often target specific attack types or incur high costs, such as data cleaning or model fine-tuning. In contrast, we argue that effective and generalizable defense is achievable without removing triggers or incurring high model-cleaning costs. Taking the attacker's perspective and exploiting the anomalous activation characteristics of vulnerable neurons, we propose an Adaptive Feature Injection (AFI) method for black-box backdoor detection. AFI employs a pre-trained image encoder to extract multi-level deep features and constructs a dynamic weight fusion mechanism for precise identification and interception of poisoned samples. Specifically, we select control samples with the largest feature-space distance from the clean dataset via feature-space analysis, and generate blended sample pairs with the test sample using dynamic linear interpolation. The detection statistic is computed by measuring the divergence G(x) in the model's output responses. We systematically evaluate the effectiveness of AFI against representative backdoor attacks, including BadNets, Blend, WaNet, and IAB, on three benchmark datasets: MNIST, CIFAR-10, and ImageNet. Experimental results show that AFI effectively detects poisoned samples, achieving average detection rates of 95.20%, 94.15%, and 86.49% on these datasets, respectively. Compared with existing methods, AFI demonstrates strong cross-domain generalization ability and robustness to unknown attacks.
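The blending-and-divergence step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the authors' implementation: the function names (`divergence_statistic`, `kl_div`), the use of KL divergence as the divergence measure, and the fixed interpolation weights are all hypothetical stand-ins, since the paper's exact formulation of G(x) is not given on this page.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def kl_div(p, q, eps=1e-12):
    """KL divergence between two probability vectors."""
    p, q = np.clip(p, eps, 1.0), np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def divergence_statistic(model, x_test, x_control,
                         alphas=(0.2, 0.4, 0.6, 0.8)):
    """Blend the test sample with a control sample at several linear
    interpolation weights and average the divergence between the
    model's response to the original and to each blended sample."""
    p_ref = softmax(model(x_test))
    g = 0.0
    for a in alphas:
        x_blend = (1.0 - a) * x_test + a * x_control  # linear interpolation
        g += kl_div(p_ref, softmax(model(x_blend)))
    return g / len(alphas)
```

Under the intuition common to perturbation-based detectors, a trigger-dominated (poisoned) input tends to keep its prediction anomalously stable under blending, yielding a small G(x); the actual decision rule and threshold used by AFI are described in the full paper.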

Keywords

Deep learning; backdoor attacks; universal detection; feature fusion; backward reasoning