Review of large vision models and visual prompt engineering

  • Jiaqi Wang
  • Zhengliang Liu
  • Lin Zhao
  • Zihao Wu
  • Chong Ma
  • Sigang Yu
  • Haixing Dai
  • Qiushi Yang
  • Yiheng Liu
  • Songyao Zhang
  • Enze Shi
  • Yi Pan
  • Tuo Zhang
  • Dajiang Zhu
  • Xiang Li
  • Xi Jiang
  • Bao Ge
  • Yixuan Yuan
  • Dinggang Shen
  • Tianming Liu
  • Shu Zhang

Research output: Contribution to journal › Review article › peer-review

137 Scopus citations

Abstract

Visual prompt engineering is a fundamental methodology in artificial general intelligence for vision and imaging. As large vision models develop, the importance of prompt engineering becomes increasingly evident, and designing suitable prompts for specific visual tasks has emerged as a meaningful research direction. This review summarizes the methods used in computer vision for large vision models and visual prompt engineering, with attention to the latest advances. We present influential large models in the visual domain and a range of prompt engineering methods applied to them. We hope this review provides a comprehensive and systematic description of prompt engineering methods based on large vision models and offers useful insights to future researchers exploring this field.

Original language: English (US)
Article number: 100047
Journal: Meta-Radiology
Volume: 1
Issue number: 3
State: Published - Nov 2023
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Radiological and Ultrasound Technology
  • Radiology, Nuclear Medicine and Imaging
  • Computer Graphics and Computer-Aided Design

Keywords

  • Artificial general intelligence
  • Vision models
  • Visual prompt
