
Efficient Tuning of Vision Foundation Models with Neural Prompt Search

  • Yuanhan Zhang
  • Kaiyang Zhou
  • Ziwei Liu*

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding › Chapter › peer-review

Abstract

The size of vision models has grown exponentially in recent years, particularly with the rise of Vision Transformers. This rapid growth has driven the development of parameter-efficient tuning methods, such as learning adapter layers or low-rank adaptation layers, which enable fine-tuning of a small subset of model parameters while keeping the vast majority of pretrained parameters frozen. However, designing an effective tuning method is not straightforward: it often involves exploring numerous design choices, and each downstream dataset may require custom-tailored solutions. In this chapter, we introduce Neural prOmpt seArcH (NOAH), a novel approach that leverages a neural architecture search algorithm to automatically learn the optimal design of prompt modules for large vision models, tailored specifically for each downstream dataset.

Original language: English
Title of host publication: Large Vision-Language Models
Subtitle of host publication: Pre-training, Prompting, and Applications
Editors: Kaiyang Zhou, Ziwei Liu, Peng Gao
Place of Publication: Cham
Publisher: Springer Cham
Chapter: 8
Pages: 187-206
Number of pages: 20
ISBN (Electronic): 9783031949692
ISBN (Print): 9783031949685, 9783031949715
Publication status: Published - 30 Aug 2025

Publication series

Name: Advances in Computer Vision and Pattern Recognition
Volume: Part F886
ISSN (Print): 2191-6586
ISSN (Electronic): 2191-6594

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 9 - Industry, Innovation, and Infrastructure

User-Defined Keywords

  • Adapters
  • Fine-tuning
  • Foundation model
  • Image classification
  • Low-rank adaptation
  • Neural architecture search
  • Prompt learning
  • Transformers
