Adaptive Localization of Knowledge Negation for Continual LLM Unlearning

  • Abudukelimu Wuerkaixi
  • Qizhou Wang
  • Sen Cui*
  • Wutong Xu
  • Bo Han
  • Gang Niu
  • Masashi Sugiyama
  • Changshui Zhang*

  *Corresponding author for this work

Research output: Chapter in book/report/conference proceeding > Conference proceeding > peer-review

3 Citations (Scopus)

Abstract

With the growing deployment of large language models (LLMs) across diverse domains, concerns regarding their safety have grown substantially. LLM unlearning has emerged as a pivotal approach to removing harmful or unlawful content while maintaining utility. Despite increasing interest, the challenges of continual unlearning, which is common in real-world scenarios, remain underexplored. Successive unlearning tasks often lead to intensified utility degradation. To effectively unlearn targeted knowledge while preserving LLM utility, it is essential to minimize changes in model parameters by selectively updating those linked to the target knowledge, thereby ensuring other knowledge remains unaffected. Building on the task vector framework, we propose a new method named ALKN (Adaptive Localization of Knowledge Negation), which uses dynamic masking to sparsify training gradients and adaptively adjusts unlearning intensity based on inter-task relationships. Comprehensive experiments across three well-established LLM unlearning datasets demonstrate that our approach consistently outperforms baseline methods in both unlearning effectiveness and utility retention under continual unlearning settings.
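
The abstract's core idea, negating a task vector while touching only a small subset of parameters, can be illustrated with a toy sketch. This is not the ALKN algorithm: the function names (`task_vector`, `topk_mask`, `negate`), the static top-k magnitude mask, and the fixed scalar `alpha` are all assumptions made for illustration. The abstract describes a dynamic mask applied to training gradients and an intensity that adapts to inter-task relationships, but gives no details of either, so neither is modeled here.

```python
from typing import Dict, List

# Toy "model": parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def task_vector(base: Params, finetuned: Params) -> Params:
    """Task vector = weights fine-tuned on the forget data minus base weights."""
    return {k: [f - b for f, b in zip(finetuned[k], base[k])] for k in base}


def topk_mask(vec: Params, keep_ratio: float) -> Params:
    """Binary mask keeping the largest-magnitude entries of the task vector.

    Assumption: a static magnitude criterion stands in for the paper's
    dynamic gradient mask. Ties at the threshold are all kept.
    """
    flat = sorted((abs(v) for vals in vec.values() for v in vals), reverse=True)
    k = max(1, int(len(flat) * keep_ratio))
    threshold = flat[k - 1]
    return {
        name: [1.0 if abs(v) >= threshold else 0.0 for v in vals]
        for name, vals in vec.items()
    }


def negate(base: Params, vec: Params, mask: Params, alpha: float) -> Params:
    """Subtract the masked task vector from the base weights.

    alpha is a hypothetical fixed unlearning intensity; masked-out
    parameters are left exactly at their base values.
    """
    return {
        k: [b - alpha * m * t for b, m, t in zip(base[k], mask[k], vec[k])]
        for k in base
    }


if __name__ == "__main__":
    base = {"w": [1.0, 2.0, 3.0, 4.0]}
    forget_tuned = {"w": [1.5, 2.0, 2.0, 4.1]}  # hypothetical fine-tuned weights
    tau = task_vector(base, forget_tuned)
    mask = topk_mask(tau, keep_ratio=0.5)
    print(negate(base, tau, mask, alpha=1.0))
```

In this toy setup only the two entries with the largest change are moved away from the forget-tuned direction, which mirrors the stated goal of minimizing parameter changes so that unrelated knowledge is left unaffected.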

Original language: English
Title of host publication: Proceedings of the 42nd International Conference on Machine Learning, ICML 2025
Publisher: ML Research Press
Pages: 68094-68117
Number of pages: 24
Publication status: Published - Jul 2025
Event: 42nd International Conference on Machine Learning, ICML 2025 - Vancouver Convention Center, Vancouver, Canada
Duration: 13 Jul 2025 → 19 Jul 2025
https://icml.cc/Conferences/2025 (Conference Website)
https://icml.cc/virtual/2025/calendar (Conference Calendar)
https://proceedings.mlr.press/v267/ (Conference Proceedings)

Publication series

Name: Proceedings of Machine Learning Research
Publisher: ML Research Press
Volume: 267

Conference

Conference: 42nd International Conference on Machine Learning, ICML 2025
Country/Territory: Canada
City: Vancouver
Period: 13/07/25 → 19/07/25