Open Access iconOpen Access

ARTICLE

Addressing Prompt Injection in Large Language Models via In-Context Learning

Go Sato1, Shusaku Egami1,2, Yasuyuki Tahara1, Akihiko Ohsuga1, Yuichi Sei1,*

1 Department of Informatics, University of Electro-Communications, Tokyo, Japan
2 Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan

* Corresponding Author: Yuichi Sei. Email: email

(This article belongs to the Special Issue: Artificial Intelligence Methods and Techniques to Cybersecurity)

Computers, Materials & Continua 2026, 87(2), 99 https://doi.org/10.32604/cmc.2026.078188

Abstract

While Large Language Models (LLMs) possess the capability to perform a wide range of tasks, security attacks known as prompt injection and jailbreaking remain critical challenges. Existing defense approaches addressing this problem face challenges such as the over-refusal of prompts that contain harmful vocabulary but are semantically benign, and the limited accuracy improvement in machine learning-based approaches due to the ease of distinguishing benign prompts in existing datasets. Therefore, we propose a multi-LLM agent framework aimed at achieving both the accurate rejection of harmful prompts and appropriate responses to benign prompts. Distinct from prior studies, the proposed method adopts In-Context Learning (ICL) during the learning phase, presenting a novel approach that obviates the need for computationally expensive parameter updates required by conventional fine-tuning. To demonstrate the proposed method’s capability for rapid and easy deployment, this study targets LLMs with insufficient alignment. In the experiments, macro-averaged binary classification metrics were used to comprehensively evaluate harmfulness detection. Experimental results using three LLMs demonstrated that the proposed method achieved performance that surpassed four baselines across all evaluation metrics for the target LLMs, evidencing significant effectiveness with an average improvement of 16.6 points in F1-score compared to the vanilla models. The significance of this study lies in the proposal of a novel approach based on ICL that does not require parameter updates. This framework offers high sustainability in practical deployment, as it allows for the adaptive enhancement of detection performance against continuously evolving attack methods solely through the accumulation of logs, without the necessity of retraining the LLM itself. By mitigating the trade-off between safety and utility, this research contributes to the implementation of robust LLMs.

Keywords

Large language models (LLMs); prompt injection; in-context learning (ICL); multi-agent system

Cite This Article

APA Style
Sato, G., Egami, S., Tahara, Y., Ohsuga, A., Sei, Y. (2026). Addressing Prompt Injection in Large Language Models via In-Context Learning. Computers, Materials & Continua, 87(2), 99. https://doi.org/10.32604/cmc.2026.078188
Vancouver Style
Sato G, Egami S, Tahara Y, Ohsuga A, Sei Y. Addressing Prompt Injection in Large Language Models via In-Context Learning. Comput Mater Contin. 2026;87(2):99. https://doi.org/10.32604/cmc.2026.078188
IEEE Style
G. Sato, S. Egami, Y. Tahara, A. Ohsuga, and Y. Sei, “Addressing Prompt Injection in Large Language Models via In-Context Learning,” Comput. Mater. Contin., vol. 87, no. 2, pp. 99, 2026. https://doi.org/10.32604/cmc.2026.078188



cc Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
  • 270

    View

  • 56

    Download

  • 0

    Like

Share Link