
Open Access Article

Enhancing Detection of AI-Generated Text: A Retrieval-Augmented Dual-Driven Defense Mechanism

Xiaoyu Li1,2, Jie Zhang3, Wen Shi1,2,*
1 Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, 100190, China
2 Key Laboratory of Target Cognition and Application Technology (TCAT), Beijing, 100190, China
3 Department of Computer, North China Electric Power University, Baoding, 071003, China
* Corresponding Author: Wen Shi. Email: email
(This article belongs to the Special Issue: Advances in Large Models and Domain-specific Applications)

Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.074005

Received 30 September 2025; Accepted 18 November 2025; Published online 12 December 2025

Abstract

The emergence of large language models (LLMs) has brought revolutionary social value. However, concerns have arisen about LLMs generating deceptive content and their potential for misuse. This raises a crucial research question: how can we differentiate between AI-generated and human-authored text? Existing detectors face several challenges: they operate as black boxes, rely on supervised training, and are vulnerable to manipulation and misinformation. To tackle these challenges, we propose an unsupervised white-box detection method that uses a "dual-driven verification mechanism" to achieve high-performance detection even when the text content has been obfuscated by attacks. Specifically, we first apply the SpaceInfi strategy to increase the difficulty of detecting the text content. We then randomly select vulnerable spots in the text and perturb them using another pre-trained language model (e.g., T5). Finally, we apply a dual-driven defense mechanism (D3M) that validates whether the perturbed text was generated by a model or authored by a human, based on two dimensions: Information Transmission Quality and Information Transmission Density. Experimental validation shows that the proposed method achieves state-of-the-art (SOTA) performance under equivalent perturbation intensity across multiple benchmarks, demonstrating the effectiveness of our strategies.
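The perturb-and-compare step described in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the `perturb` function stands in for a T5-based span rewrite (here it simply drops the masked words), and `score_fn` is an assumed caller-supplied scoring function (e.g., a language model's average log-likelihood). The underlying idea, shared with perturbation-based detectors such as DetectGPT, is that machine-generated text tends to sit near a likelihood peak of the scoring model, so perturbations lower its score more than they lower the score of human-authored text.

```python
import random

def perturb(text, mask_frac=0.15, rng=None):
    """Randomly select a fraction of words ('vulnerable spots') and
    remove them. In the paper, a pre-trained model such as T5 would
    rewrite the masked spans; dropping them is a simple stand-in."""
    rng = rng or random.Random(0)
    words = text.split()
    n_mask = max(1, int(len(words) * mask_frac))
    masked = set(rng.sample(range(len(words)), n_mask))
    return " ".join(w for i, w in enumerate(words) if i not in masked)

def detection_gap(text, score_fn, n_perturb=10, rng=None):
    """Score the original text and the mean of several perturbed
    variants; a large positive gap suggests the text lies near a
    likelihood peak of the scoring model (machine-generated under
    the perturbation-based detection hypothesis)."""
    rng = rng or random.Random(0)
    original = score_fn(text)
    perturbed = [score_fn(perturb(text, rng=rng)) for _ in range(n_perturb)]
    return original - sum(perturbed) / n_perturb
```

In practice, `score_fn` would query a white-box language model, and the gap would be thresholded (or normalized by the perturbed scores' standard deviation) to produce a decision.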

Keywords

Large language models; machine-written; perturbation; detection; attacks