A Lens into Interpretable Transformer Mistakes via Semantic Dependency

Ruo Jing Dong, Yu Yao, Bo Han, Tongliang Liu*
*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding › Conference proceeding › peer-review

Abstract

Semantic dependency refers to the relationship between words in a sentence where the meaning of one word depends on another, and it is important for natural language understanding. In this paper, we investigate the role of semantic dependencies in question answering with transformer models by analyzing how token values shift in response to changes in semantics. Through extensive experiments on models including the BERT series, GPT, and LLaMA, we uncover the following key findings: (1) most tokens primarily retain their original semantic information even as they propagate through multiple layers; (2) models can encode truthful semantic dependencies in tokens at the final layer; and (3) mistakes in model answers often stem from specific tokens encoded with incorrect semantic dependencies. Furthermore, we find that correcting these errors by directly adjusting parameters is challenging, because the same parameters can encode both correct and incorrect semantic dependencies depending on the context. Our findings provide insight into the causes of incorrect information generation in transformers and can inform the future development of robust and reliable models.
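A minimal sketch (not the authors' released code) of the probing idea described in the abstract: track how a token's hidden state shifts across layers when the word it semantically depends on changes, by comparing per-layer representations of the same token in two minimally different sentences. The model choice (bert-base-uncased) and the sentence pair are illustrative assumptions.

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

# Two inputs identical in the target token ("bank") but differing in the
# words that determine its meaning (hypothetical example pair).
sent_a = "The bank approved the loan."
sent_b = "The bank eroded after the flood."

def token_states(sentence, target):
    """Return the target token's hidden state at every layer."""
    enc = tokenizer(sentence, return_tensors="pt")
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    idx = tokens.index(target)
    with torch.no_grad():
        out = model(**enc)
    # out.hidden_states: tuple of (embedding layer + one tensor per layer),
    # each of shape (batch, seq_len, hidden_dim)
    return [h[0, idx] for h in out.hidden_states]

states_a = token_states(sent_a, "bank")
states_b = token_states(sent_b, "bank")

# Per-layer cosine similarity: values near 1 suggest the token still carries
# its original, context-independent semantics; drops mark layers where
# semantic-dependency information is written into the token.
for layer, (ha, hb) in enumerate(zip(states_a, states_b)):
    sim = torch.cosine_similarity(ha, hb, dim=0).item()
    print(f"layer {layer:2d}: cosine similarity = {sim:.3f}")

Under these assumptions, a similar comparison between a correctly and an incorrectly answered prompt would localize the specific tokens whose encoded dependencies diverge, which is the error-attribution step the abstract describes.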

Original language: English
Title of host publication: Proceedings of the 42nd International Conference on Machine Learning, ICML 2025
Publisher: ML Research Press
Pages: 14260-14284
Number of pages: 25
Publication status: Published - Jul 2025
Event: 42nd International Conference on Machine Learning, ICML 2025, Vancouver Convention Center, Vancouver, Canada
Duration: 13 Jul 2025 - 19 Jul 2025
https://icml.cc/Conferences/2025 (Conference Website)
https://icml.cc/virtual/2025/calendar (Conference Calendar)
https://proceedings.mlr.press/v267/ (Conference Proceedings)

Publication series

Name: Proceedings of Machine Learning Research
Publisher: ML Research Press
Volume: 267

Conference

Conference: 42nd International Conference on Machine Learning, ICML 2025
Country/Territory: Canada
City: Vancouver
Period: 13/07/25 - 19/07/25
