Test Time Prompt Tuning by Optimal Transport for Machine Learning

Research output: Contribution to journal › Article › peer-review

Abstract

Recently, self-supervised learning has drawn considerable attention from researchers. CLIP is a vision-language model pre-trained with cross-modal contrastive learning. In this paper, we propose a novel prompt-tuning method based on optimal transport to improve the zero-shot generalization of the pre-trained CLIP model. Existing entropy-based approaches fail to consider the global structure of the output distribution and cannot align distributions effectively across domains. To resolve this issue, we develop Optimal Transport Test-Time Prompt Tuning, named OT-TPT. With the help of optimal transport, the method aligns distributions directly, providing a global regularization effect and thereby improving robustness against noise and distribution shifts. Moreover, a Sinkhorn regularization term is adopted to provide an efficient, smooth approximation that reduces distribution shift while improving zero-shot generalization. Experimental results show that the proposed OT-TPT achieves higher classification accuracy than existing state-of-the-art approaches.
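The Sinkhorn regularization mentioned in the abstract refers to entropy-regularized optimal transport, which replaces the exact (and costly) OT problem with a smooth, fast iterative approximation. Below is a minimal, hypothetical NumPy sketch of the standard Sinkhorn iteration applied to aligning a predicted class distribution with a target distribution; the paper's actual OT-TPT objective, cost function, and hyperparameters are not specified here, so all names and values are illustrative assumptions.

```python
import numpy as np

def sinkhorn(cost, a, b, eps=0.1, n_iters=200):
    """Entropy-regularized OT plan between histograms a and b.

    cost : (n, m) ground-cost matrix
    a, b : source / target marginals (each sums to 1)
    eps  : entropic regularization strength (illustrative value)
    """
    K = np.exp(-cost / eps)           # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)             # column scaling
        u = a / (K @ v)               # row scaling
    P = u[:, None] * K * v[None, :]   # transport plan
    return P, float(np.sum(P * cost))  # plan and approximate OT cost

# Toy example (hypothetical): align a model's predicted class
# distribution with a uniform target under a 0-1 ground cost.
pred = np.array([0.7, 0.2, 0.1])
target = np.ones(3) / 3
cost = 1.0 - np.eye(3)
P, ot_loss = sinkhorn(cost, pred, target)
print(ot_loss)
```

In a test-time tuning setting such a loss would be differentiated with respect to the prompt parameters; this sketch only shows the distribution-alignment computation itself.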

Original language: English (US)
Article number: 2551017
Journal: International Journal of Pattern Recognition and Artificial Intelligence
Volume: 39
Issue number: 14
DOIs
State: Published - Nov 1 2025

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence

Keywords

  • CLIP
  • Test time prompt tuning
  • Wasserstein barycenter
  • machine learning
  • optimal transport
  • self-supervised learning
