
Open Access

ARTICLE

LiRA-CLIP: Training-Free Posterior-Predictive Uncertainty for Few-Shot CLIP Classification

Mustafa Qaid Khamisi1, Zuping Zhang1,*, Mohammed Al-Habib1, Muhammad Asim2, Sajid Shah2
1 School of Computer Science and Engineering, Central South University, Changsha, China
2 EIAS Data Science Lab, College of Computer and Information Sciences, Prince Sultan University, Riyadh, Saudi Arabia
* Corresponding Author: Zuping Zhang

Computers, Materials & Continua https://doi.org/10.32604/cmc.2026.077556

Received 11 December 2025; Accepted 13 February 2026; Published online 02 April 2026

Abstract

Large vision-language models (VLMs) such as Contrastive Language-Image Pretraining (CLIP) have transformed open-world image recognition. Nevertheless, few-shot classification, particularly in the extremely low-shot regime, requires not only high accuracy but also reliably calibrated uncertainty for high-confidence decisions. Existing training-free CLIP adapters are designed primarily for accuracy and efficiency: they integrate the zero-shot text logits with few-shot feature caches but do not explicitly model predictive uncertainty, and therefore often exhibit considerable miscalibration and weak selective performance. Bayesian adapters move toward probabilistic modeling by placing priors over adapter parameters and employing task-specific variational training; however, this requires gradient-based optimization for every new task, increases computational cost, and becomes fragile when only one or two labeled examples per class are available. Motivated by these observations, we introduce a training-free posterior-predictive Likelihood Ratio Adapter (LiRA-CLIP) for few-shot CLIP classification that directly addresses probabilistic reliability under strict low-shot and deployment constraints. LiRA-CLIP extends the frozen CLIP head with a text-conditioned generative model in feature space that produces heavy-tailed posterior-predictive likelihood ratios, which are fused with the CLIP logits via a small, reliability-driven calibration layer. This layer is optimized to minimize the negative log-likelihood under an explicit accuracy side constraint, yielding calibrated probabilities and dependable selective decisions without any gradient-based task-specific training. Extensive experiments show that LiRA-CLIP matches or slightly surpasses strong CLIP adapters in top-1 accuracy while reducing calibration error by roughly 40%–50% and substantially increasing 95% and 99% reliable coverage in the low-shot regime, establishing a new state of the art in probabilistic reliability for training-free few-shot CLIP models.
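
To make the mechanism concrete, the following is a minimal, hypothetical sketch of the idea the abstract describes; it is not the authors' released implementation. It assumes a conjugate Normal-Inverse-Gamma model per class in CLIP feature space, with the class text embedding as the prior mean, so that the posterior predictive is a per-dimension Student-t (heavy-tailed by construction); a class-agnostic prior predictive serves as the likelihood-ratio background, and two scalars, `alpha` and `tau`, stand in for the paper's reliability-driven calibration layer, which would be fit by minimizing negative log-likelihood under an accuracy side constraint. All function names, hyperparameters, and the specific conjugate family are illustrative assumptions.

```python
import numpy as np
from scipy.stats import t as student_t

def lira_logits(x, text_emb, cache_x, cache_y, alpha=1.0, tau=100.0,
                kappa0=1.0, a0=2.0, b0=0.1):
    """Fuse frozen zero-shot CLIP logits with heavy-tailed posterior-predictive
    log-likelihood ratios computed from the few-shot feature cache.

    x        : (D,)   L2-normalised query image feature
    text_emb : (C, D) L2-normalised class text embeddings
    cache_x  : (N, D) cached few-shot image features (>= 1 shot per class)
    cache_y  : (N,)   integer class labels for the cache
    """
    C, D = text_emb.shape
    zero_shot = tau * text_emb @ x  # frozen CLIP head; tau plays the role of 1/T

    # Class-agnostic background: prior predictive centred on the global cache mean.
    bg_scale = np.sqrt(b0 * (kappa0 + 1.0) / (a0 * kappa0))
    ll_bg = student_t.logpdf(x, df=2.0 * a0, loc=cache_x.mean(0), scale=bg_scale).sum()

    llr = np.empty(C)
    for c in range(C):
        xs = cache_x[cache_y == c]                  # (n_c, D) support features
        n = len(xs)
        xbar = xs.mean(0)
        # Closed-form Normal-Inverse-Gamma update, prior mean = text embedding.
        mu_n = (kappa0 * text_emb[c] + n * xbar) / (kappa0 + n)
        a_n = a0 + 0.5 * n
        scatter = ((xs - xbar) ** 2).sum() / D      # per-dimension scatter, averaged
        shrink = (kappa0 * n / (kappa0 + n)) * ((xbar - text_emb[c]) ** 2).mean()
        b_n = b0 + 0.5 * (scatter + shrink)
        # Posterior predictive is a Student-t: heavy-tailed by construction.
        scale = np.sqrt(b_n * (kappa0 + n + 1.0) / (a_n * (kappa0 + n)))
        ll_c = student_t.logpdf(x, df=2.0 * a_n, loc=mu_n, scale=scale).sum()
        llr[c] = ll_c - ll_bg                       # likelihood ratio vs. background

    return zero_shot + alpha * llr  # alpha, tau supplied by the calibration layer
```

Note that everything here is closed form: the posterior updates require no gradient steps through CLIP or the adapter, which is what makes a posterior-predictive approach compatible with the training-free constraint the abstract emphasizes.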

Keywords

Vision–language models; few-shot learning; CLIP; training-free; uncertainty calibration; selective classification; posterior predictive modeling