
Adapting Vision Foundation Models for Real-Time Ultrasound Image Segmentation

  • Xiaoran Zhang
  • Eric Z. Chen
  • Lin Zhao
  • Xiao Chen
  • Yikang Liu
  • Boris Maihe
  • James S. Duncan
  • Terrence Chen
  • Shanhui Sun

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We propose a novel approach that adapts hierarchical vision foundation models for real-time ultrasound image segmentation. Existing ultrasound segmentation methods often struggle with adaptability to new tasks, relying on costly manual annotations, while real-time approaches generally fail to match state-of-the-art performance. To overcome these limitations, we introduce an adaptive framework that leverages the vision foundation model Hiera to extract multi-scale features, interleaved with DINOv2 representations to enhance visual expressiveness. These enriched features are then decoded to produce precise and robust segmentation. We conduct extensive evaluations on six public datasets and one in-house dataset, covering both cardiac and thyroid ultrasound segmentation. Experiments show that our approach outperforms state-of-the-art methods across multiple datasets and excels with limited supervision, surpassing nnUNet by over 20% on average in the 1% and 10% data settings. Our method achieves ∼77 FPS inference speed with TensorRT on a single GPU, enabling real-time clinical applications.

Original language: English (US)
Title of host publication: Medical Image Computing and Computer Assisted Intervention, MICCAI 2025 - 28th International Conference, 2025, Proceedings
Editors: James C. Gee, Jaesung Hong, Carole H. Sudre, Polina Golland, Daniel C. Alexander, Juan Eugenio Iglesias, Archana Venkataraman, Jong Hyo Kim
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 24-34
Number of pages: 11
ISBN (Print): 9783032049704
State: Published - 2026
Externally published: Yes
Event: 28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025 - Daejeon, Korea, Republic of
Duration: Sep 23 2025 to Sep 27 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 15964 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025
Country/Territory: Korea, Republic of
City: Daejeon
Period: 9/23/25 to 9/27/25

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

Keywords

  • Real-time inference
  • Ultrasound image segmentation
  • Vision foundation model
