Interpret and Improve In-Context Learning via the Lens of Input-Label Mappings

  • Chenghao Sun
  • Zhen Huang
  • Yonggang Zhang
  • Le Lu
  • Houqiang Li
  • Xinmei Tian*
  • Xu Shen*
  • Jieping Ye

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding › Conference proceeding › peer-review

Abstract

Large language models (LLMs) excel at downstream NLP tasks through in-context learning (ICL) with a few demonstrations of input-label pairs. However, the internal mechanisms behind ICL remain under-explored, particularly the mappings between inputs and labels. In this work, we reverse-engineer ICL by examining input-label mappings: what they are within LLMs, where they function, and how LLMs utilize them. (1) What: We discover that input-label mappings are stored within a few specific layers in the form of principal components (PCs), which capture human-interpretable and task-related words. (2) Where: We propose a PC patching approach to identify the modules where input-label mappings function. Specifically, PC patching automatically crafts counterfactual representations using identified semantic PCs, rather than manually designing counterfactual text, to suppress the behavior related to LLM capability for ICL-related modules. Utilizing PC patching, we identify that LLMs apply input-label mappings in a small fraction of attention heads. (3) How: We observe and verify that the identified key heads utilize input-label mappings from demonstrations to generate target labels for new queries. Based on these discoveries, we further show that precisely fine-tuning key ICL-related modules leads to significant improvements across diverse tasks.
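
The abstract does not spell out how counterfactual representations are built from principal components, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that a counterfactual is formed by removing the projection of each hidden state onto a chosen set of top principal directions; the function names `top_principal_components` and `patch_out_components`, the choice of `k`, and the random stand-in data are all hypothetical.

```python
import numpy as np

def top_principal_components(hidden, k):
    """Return the mean and the top-k principal directions of `hidden`.

    hidden: array of shape (n_samples, d_model), one representation per row.
    The returned directions have shape (k, d_model) and are orthonormal.
    """
    mean = hidden.mean(axis=0)
    centered = hidden - mean
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]

def patch_out_components(hidden, mean, components):
    """Assumed patching rule: subtract the projection onto `components`."""
    centered = hidden - mean
    projection = centered @ components.T @ components
    return hidden - projection

# Random data standing in for layer activations (hypothetical, for illustration).
rng = np.random.default_rng(0)
hidden = rng.normal(size=(32, 8))

mean, pcs = top_principal_components(hidden, k=2)
patched = patch_out_components(hidden, mean, pcs)
```

In an actual experiment the patched representations would be substituted into a model's forward pass at a chosen module, and the change in ICL accuracy measured; that step depends on the model and tooling and is not shown here.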

Original language: English
Title of host publication: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Publisher: Association for Computational Linguistics (ACL)
Pages: 3873-3895
Number of pages: 23
ISBN (Electronic): 9798891762510
DOIs
Publication status: Published - Jul 2025
Event: 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025 - Austria Center Vienna, Vienna, Austria
Duration: 27 Jul 2025 to 1 Aug 2025
https://2025.aclweb.org/ (Conference Website)
https://docs.google.com/spreadsheets/d/1O-n3HPvv8vY0L_kjyP5AtRTcWWjqLk2deCYtrMgCGw4/edit?usp=drive_link (Conference Program)
https://aclanthology.org/events/acl-2025/ (Conference Proceedings)

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025
Country/Territory: Austria
City: Vienna
Period: 27/07/25 to 1/08/25
Internet address

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 4 - Quality Education
