Open Access

ARTICLE

LaRP-CLIP: Layer-Aware Refinement with Prototype Guidance for Zero-Shot Anomaly Detection

Xing Fang1, Yuanfang Chen1,2,*, Qiang Lin3, Kun Yang2,4, Gyu Myoung Lee5
1 School of Cyberspace, Hangzhou Dianzi University, Hangzhou, China
2 The State Key Laboratory of Blockchain and Data Security, Zhejiang University, Hangzhou, China
3 School of Computer Science and Technology, University of Science and Technology of China, Hefei, China
4 College of Computer Science and Technology, Zhejiang University, Hangzhou, China
5 School of Computer Science and Mathematics, Liverpool John Moores University, Liverpool, UK
* Corresponding Author: Yuanfang Chen. Email: email

Computers, Materials & Continua https://doi.org/10.32604/cmc.2026.084208

Received 18 April 2026; Accepted 22 May 2026; Published online 22 June 2026

Abstract

The deployment of supervised anomaly detection is typically limited by the high cost of annotation, privacy constraints, and the scarcity of anomalous samples. These constraints have motivated the use of pre-trained vision-language models for zero-shot anomaly detection. However, existing CLIP-based methods still face three limitations: a single shared set of prompts is applied to all feature layers, per-layer anomaly maps are fused by fixed strategies, and image-level anomaly scores are determined solely by global image-text similarity. Together, these reduce the accuracy of pixel-level localization and weaken the reliability of image-level anomaly prediction. To address them, this paper proposes LaRP-CLIP, which introduces three components: layer-aware prompt decoupling, to better match feature layers with different semantic characteristics; adaptive fusion with error-prior-guided local refinement, to produce cleaner and more precise anomaly maps; and a prototype branch, to improve image-level scoring. Experiments on four industrial datasets and seven medical datasets show that LaRP-CLIP achieves strong performance in both image-level detection and pixel-level localization.
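
For orientation, the baseline behaviour that the abstract criticizes can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the LaRP-CLIP method and not code from the paper: the two-dimensional "embeddings", the function names `anomaly_probability` and `fuse_maps_mean`, and the temperature value 0.07 are all hypothetical choices made for the example. It shows an image-level score computed only from global image-text similarity, and per-layer anomaly maps combined by a fixed (unweighted mean) rule.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def anomaly_probability(feature, normal_text, anomalous_text, temperature=0.07):
    """Two-way softmax over similarities to a 'normal' and an 'anomalous'
    text embedding; returns the probability assigned to 'anomalous'.
    The temperature is an assumed value for illustration only."""
    s_normal = cosine(feature, normal_text) / temperature
    s_anomalous = cosine(feature, anomalous_text) / temperature
    m = max(s_normal, s_anomalous)  # subtract the max for numerical stability
    e_normal = math.exp(s_normal - m)
    e_anomalous = math.exp(s_anomalous - m)
    return e_anomalous / (e_normal + e_anomalous)


def fuse_maps_mean(layer_maps):
    """Fixed fusion: unweighted mean of per-layer anomaly maps,
    each given as a flat list of patch scores of equal length."""
    n_layers = len(layer_maps)
    return [sum(scores) / n_layers for scores in zip(*layer_maps)]


# Toy text embeddings (hypothetical; real ones come from a text encoder).
normal_text = [1.0, 0.0]
anomalous_text = [0.0, 1.0]

# Image-level score from the global image embedding alone.
image_score = anomaly_probability([0.9, 0.1], normal_text, anomalous_text)

# Patch-level maps from two feature layers, scored against the SAME prompts,
# then fused with a fixed rule.
layer_1 = [anomaly_probability(p, normal_text, anomalous_text)
           for p in ([1.0, 0.0], [0.2, 0.8])]
layer_2 = [anomaly_probability(p, normal_text, anomalous_text)
           for p in ([0.9, 0.1], [0.1, 0.9])]
fused_map = fuse_maps_mean([layer_1, layer_2])
```

In this sketch every layer is compared against the same prompt embeddings, the fusion weights never adapt to the input, and the image-level score ignores the patch evidence entirely; these are the three points the abstract says LaRP-CLIP is designed to change.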

Keywords

Zero-shot anomaly detection; vision-language models; layer-aware prompts; local refinement; prototype branch