Prompt Injection Attacks on Large Language Models: A Survey of Attack Methods, Root Causes, and Defense Strategies

Tongcheng Geng^1,#, Zhiyuan Xu^2,#, Yubin Qu^3,*, W. Eric Wong⁴
1 Department of Information and Network Security, The State Information Center, Beijing, 100032, China
2 Department of Mechanical Engineering, Hohai University, Changzhou, 213200, China
3 School of Information Engineering, Jiangsu College of Engineering and Technology, Nantong, 226001, China
4 Department of Computer Science, University of Texas at Dallas, Dallas, TX 75080, USA
* Corresponding Author: Yubin Qu. Email: email
# Tongcheng Geng and Zhiyuan Xu contributed equally to this work
(This article belongs to the Special Issue: Large Language Models in Password Authentication Security: Challenges, Solutions and Future Directions)

Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.074081

Received 01 October 2025; Accepted 25 November 2025; Published online 18 December 2025

Download PDF

Abstract

Large language models (LLMs) have revolutionized AI applications across diverse domains. However, their widespread deployment has introduced critical security vulnerabilities, particularly prompt injection attacks that manipulate model behavior through malicious instructions. Following Kitchenham’s guidelines, this systematic review synthesizes 128 peer-reviewed studies from 2022 to 2025 to provide a unified understanding of this rapidly evolving threat landscape. Our findings reveal a swift progression from simple direct injections to sophisticated multimodal attacks, achieving over 90% success rates against unprotected systems. In response, defense mechanisms show varying effectiveness: input preprocessing achieves 60%–80% detection rates and advanced architectural defenses demonstrate up to 95% protection against known patterns, though significant gaps persist against novel attack vectors. We identified 37 distinct defense approaches across three categories, but standardized evaluation frameworks remain limited. Our analysis attributes these vulnerabilities to fundamental LLM architectural limitations, such as the inability to distinguish instructions from data and attention mechanism vulnerabilities. This highlights critical research directions such as formal verification methods, standardized evaluation protocols, and architectural innovations for inherently secure LLM designs.

Keywords

Prompt injection attacks; large language models; defense mechanisms; security evaluation

Downloads
- Full-Text PDF
Citation Tools
- BibTex
- EndNote
- RIS

5402

View
3075

Download
0

Like

TRUSED: A Trust-Based Security Evaluation Scheme for A Distributed Control System
Saqib Ali, Raja Waseem Anwar
A New Multi Chaos-Based Compression Sensing Image Encryption
Fadia Ali Khan, Jameel Ahmed,...
Cross-Domain Data Traceability Mechanism Based on Blockchain
Shoucai Zhao, Lifeng Cao, Jinhui...
VeriFace: Defending against Adversarial Attacks in Face Verification Systems
Awny Sayed, Sohair Kilany, Alaa...
Blockchain Security Threats and Collaborative Defense: A Literature Review
Xiulai Li, Jieren Cheng, Zhaoxin...

All issues

Online First

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

Prompt Injection Attacks on Large Language Models: A Survey of Attack Methods, Root Causes, and Defense Strategies

Abstract

Keywords

5402

3075

0

Further Information

Guidelines

Follow Us

Join Us

Contact Us

WhatsApp:

Share Link