Open Access

ARTICLE

Zero-Shot Image Captioning Method Based on the Hamiltonian Monte Carlo

Long Li, Hengyang Wu*, Na Wang

School of Computer and Information Engineering, Shanghai Polytechnic University, Shanghai, China

* Corresponding Author: Hengyang Wu. Email: email

(This article belongs to the Special Issue: Advances in Artificial Intelligence for Engineering and Sciences)

Journal on Artificial Intelligence 2026, 8, 169-182. https://doi.org/10.32604/jai.2026.077462

Abstract

Zero-shot learning, an emerging approach to image captioning, has attracted significant attention in recent years because it can accomplish tasks without category-specific training data. Existing zero-shot image captioning schemes largely rely on traditional language models, which exhibit low efficiency and suboptimal generation quality. To address this issue, this study proposes Hamiltonian Monte Carlo for Image Captioning (HMCIC). First, the method models image captioning as a probabilistic sampling problem in parameter space, integrating semantic matching and syntactic coherence into an energy function that guides generation toward high-quality captions. Second, it introduces momentum variables from Hamiltonian dynamics, which enable the sampling process to escape local optima and explore the parameter space more smoothly and efficiently, mitigating the “random walk” behavior common in traditional sampling. Finally, by iteratively optimizing the sampling trajectory, the generated descriptions achieve a better balance between semantic accuracy and linguistic fluency, enabling more efficient and accurate zero-shot image captioning without category-specific training. Experimental results on two public datasets show that, compared with other current zero-shot methods, the proposed approach achieves nearly 1.5 times faster average generation speed while also improving word generation accuracy, indicating its effectiveness.
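To make the sampling idea concrete, the following is a minimal sketch of generic Hamiltonian Monte Carlo with a leapfrog integrator. It is not the HMCIC implementation: the quadratic `energy` function is a toy stand-in for the paper's energy function (which combines semantic matching and syntactic coherence), and the names `leapfrog`, `hmc_sample`, `step_size`, and `n_steps` are illustrative assumptions.

```python
import math
import random


def energy(x):
    # Toy stand-in (assumption): energy of a standard Gaussian target.
    # HMCIC would use an energy built from caption-quality terms instead.
    return 0.5 * sum(v * v for v in x)


def grad_energy(x):
    # Gradient of the toy energy above.
    return list(x)


def hamiltonian(x, p):
    # Total energy: potential term plus kinetic term of the momentum.
    return energy(x) + 0.5 * sum(v * v for v in p)


def leapfrog(x, p, step_size, n_steps):
    # Simulate Hamiltonian dynamics with the leapfrog scheme:
    # half momentum step, alternating full steps, final half momentum step.
    x, p = list(x), list(p)
    g = grad_energy(x)
    p = [pi - 0.5 * step_size * gi for pi, gi in zip(p, g)]
    for i in range(n_steps):
        x = [xi + step_size * pi for xi, pi in zip(x, p)]
        g = grad_energy(x)
        scale = step_size if i < n_steps - 1 else 0.5 * step_size
        p = [pi - scale * gi for pi, gi in zip(p, g)]
    return x, p


def hmc_sample(x0, n_samples, step_size=0.2, n_steps=10, seed=0):
    rng = random.Random(seed)
    x = list(x0)
    samples, accepted = [], 0
    for _ in range(n_samples):
        # Fresh Gaussian momentum lets each proposal travel far from x,
        # which is what reduces random-walk behavior.
        p = [rng.gauss(0.0, 1.0) for _ in x]
        x_new, p_new = leapfrog(x, p, step_size, n_steps)
        # Metropolis correction for the integrator's discretization error.
        dh = hamiltonian(x_new, p_new) - hamiltonian(x, p)
        if dh <= 0 or rng.random() < math.exp(-dh):
            x = x_new
            accepted += 1
        samples.append(list(x))
    return samples, accepted / n_samples


if __name__ == "__main__":
    draws, rate = hmc_sample([3.0, -3.0], 2000)
    print("acceptance rate:", round(rate, 3))
```

In this sketch the momentum resampling and the multi-step leapfrog trajectory play the role the abstract attributes to the momentum variables: each proposal moves a long distance through the space while the Metropolis step keeps the samples faithful to the target distribution.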

Keywords

Zero-shot; Hamiltonian Monte Carlo; sampling algorithm; image captioning

Cite This Article

APA Style
Li, L., Wu, H., Wang, N. (2026). Zero-Shot Image Captioning Method Based on the Hamiltonian Monte Carlo. Journal on Artificial Intelligence, 8(1), 169–182. https://doi.org/10.32604/jai.2026.077462
Vancouver Style
Li L, Wu H, Wang N. Zero-Shot Image Captioning Method Based on the Hamiltonian Monte Carlo. J Artif Intell. 2026;8(1):169–182. https://doi.org/10.32604/jai.2026.077462
IEEE Style
L. Li, H. Wu, and N. Wang, “Zero-Shot Image Captioning Method Based on the Hamiltonian Monte Carlo,” J. Artif. Intell., vol. 8, no. 1, pp. 169–182, 2026. https://doi.org/10.32604/jai.2026.077462



Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.