Home / Journals / CMC / Online First / doi:10.32604/cmc.2025.074081
Special Issues
Table of Content

Open Access

REVIEW

Prompt Injection Attacks on Large Language Models: A Survey of Attack Methods, Root Causes, and Defense Strategies

Tongcheng Geng1,#, Zhiyuan Xu2,#, Yubin Qu3,*, W. Eric Wong4
1 Department of Information and Network Security, The State Information Center, Beijing, 100032, China
2 Department of Mechanical Engineering, Hohai University, Changzhou, 213200, China
3 School of Information Engineering, Jiangsu College of Engineering and Technology, Nantong, 226001, China
4 Department of Computer Science, University of Texas at Dallas, Dallas, TX 75080, USA
* Corresponding Author: Yubin Qu. Email: email
# Tongcheng Geng and Zhiyuan Xu contributed equally to this work
(This article belongs to the Special Issue: Large Language Models in Password Authentication Security: Challenges, Solutions and Future Directions)

Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.074081

Received 01 October 2025; Accepted 25 November 2025; Published online 18 December 2025

Abstract

Large language models (LLMs) have revolutionized AI applications across diverse domains. However, their widespread deployment has introduced critical security vulnerabilities, particularly prompt injection attacks that manipulate model behavior through malicious instructions. Following Kitchenham’s guidelines, this systematic review synthesizes 128 peer-reviewed studies from 2022 to 2025 to provide a unified understanding of this rapidly evolving threat landscape. Our findings reveal a swift progression from simple direct injections to sophisticated multimodal attacks, achieving over 90% success rates against unprotected systems. In response, defense mechanisms show varying effectiveness: input preprocessing achieves 60%–80% detection rates and advanced architectural defenses demonstrate up to 95% protection against known patterns, though significant gaps persist against novel attack vectors. We identified 37 distinct defense approaches across three categories, but standardized evaluation frameworks remain limited. Our analysis attributes these vulnerabilities to fundamental LLM architectural limitations, such as the inability to distinguish instructions from data and attention mechanism vulnerabilities. This highlights critical research directions such as formal verification methods, standardized evaluation protocols, and architectural innovations for inherently secure LLM designs.

Keywords

Prompt injection attacks; large language models; defense mechanisms; security evaluation
  • 52

    View

  • 10

    Download

  • 0

    Like

Share Link