TY  - EJOU
AU  - Ding, Xiaolei 
AU  - Liu, Wenjian 

TI  - A Game-Theoretic Framework for Strategic Machine Unlearning in Backdoor Mitigation
T2  - Computers, Materials \& Continua

PY  - 
VL  - 
IS  - 
SN  - 1546-2226

AB  - Backdoor attacks pose a critical threat to the reliability and trustworthiness of machine learning models, as they allow adversaries to manipulate model behavior through the injection of malicious patterns during training. Existing defenses, such as data filtering, fine-tuning, and model pruning, often lack provable guarantees or require retraining from scratch, resulting in significant computational costs. In this work, we propose <i>GTMU</i> (Game-Theoretic Machine Unlearning), a novel backdoor removal framework that formulates the unlearning process as a repeated game between the defender and a virtual attacker. The defender aims to strategically remove poisoned contributions while preserving benign knowledge, whereas the virtual attacker attempts to maintain the backdoor’s effectiveness. We introduce a Stackelberg game formulation to determine optimal unlearning policies and integrate a Nash equilibrium-based update rule to balance model utility and security. Our method leverages influence function approximations to estimate per-sample contribution and employs a regret-minimization strategy to adaptively select unlearning candidates. Experimental evaluations on image classification benchmarks under various backdoor settings demonstrate that GTMU consistently achieves over 95% clean accuracy while reducing backdoor success rates to below 2%, outperforming state-of-the-art backdoor defense methods in both efficiency and robustness. The proposed approach offers a theoretically grounded and computationally efficient solution for secure model deployment in adversarial environments.
KW  - Machine learning; backdoor defense; game theory

DO  - 10.32604/cmc.2025.072458