Steering Protein Language Models

Long-Kai Huang*, Rongyi Zhu, Bing He, Jianhua Yao*

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding; Conference proceeding; peer-review

Abstract

Protein Language Models (PLMs), pre-trained on extensive evolutionary data from natural proteins, have emerged as indispensable tools for protein design. While powerful, PLMs often struggle to produce proteins with precisely specified functionalities or properties due to inherent challenges in controlling their outputs. In this work, we investigate the potential of Activation Steering, a technique originally developed for controlling text generation in Large Language Models (LLMs), to direct PLMs toward generating protein sequences with targeted properties. We propose a simple yet effective method that employs activation editing to steer PLM outputs, and extend this approach to protein optimization through a novel editing site identification module. Through comprehensive experiments on lysozyme-like sequence generation and optimization, we demonstrate that our methods can be seamlessly integrated into both auto-encoding and autoregressive PLMs without requiring additional training. These results highlight a promising direction for precise protein engineering using foundation models. Code is available at https://github.com/Long-Kai/Steering-PLMs.
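
As a rough illustration of the general idea behind activation steering, the sketch below computes a steering vector as the difference between mean activations of two example sets and adds a scaled copy of it to each hidden state. This is a generic mean-difference formulation and is not taken from the paper or its repository. The function names (`mean_vector`, `steering_vector`, `steer`), the scaling factor `alpha`, and the toy two-dimensional vectors standing in for PLM hidden states are all assumptions made for illustration.

```python
from typing import List

Vector = List[float]


def mean_vector(rows: List[Vector]) -> Vector:
    """Average a list of equal-length activation vectors, dimension by dimension."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def steering_vector(pos: List[Vector], neg: List[Vector]) -> Vector:
    """Hypothetical steering direction: mean(pos activations) - mean(neg activations)."""
    mean_pos, mean_neg = mean_vector(pos), mean_vector(neg)
    return [p - q for p, q in zip(mean_pos, mean_neg)]


def steer(hidden: List[Vector], v: Vector, alpha: float = 1.0) -> List[Vector]:
    """Add alpha * v to the hidden state at every sequence position."""
    return [[h + alpha * d for h, d in zip(row, v)] for row in hidden]


# Toy activations standing in for hidden states of sequences
# with (pos) and without (neg) the target property.
pos = [[1.0, 2.0], [3.0, 4.0]]
neg = [[0.0, 0.0], [2.0, 2.0]]
v = steering_vector(pos, neg)

# Toy hidden states for a two-position sequence, steered with alpha = 0.5.
steered = steer([[0.0, 0.0], [1.0, 1.0]], v, alpha=0.5)
```

In a real model the same addition would typically be applied to the output of a chosen layer during the forward pass, so no weights are updated; which layer and what value of `alpha` to use are choices this sketch does not address.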
Original language: English
Title of host publication: Proceedings of the 42nd International Conference on Machine Learning, ICML 2025
Editors: Aarti Singh, Maryam Fazel, Daniel Hsu, Simon Lacoste-Julien, Felix Berkenkamp, Tegan Maharaj, Kiri Wagstaff, Jerry Zhu
Publisher: ML Research Press
Number of pages: 14
Publication status: Published - 13 Jul 2025
Event: 42nd International Conference on Machine Learning - Vancouver Convention Center, Vancouver, Canada
Duration: 13 Jul 2025 - 19 Jul 2025
https://icml.cc/Conferences/2025 (Conference website)
https://icml.cc/virtual/2025/calendar (Conference calendar)

Publication series

Name: Proceedings of the International Conference on Machine Learning
Name: Proceedings of Machine Learning Research
Volume: 267
ISSN (Print): 2640-3498

Conference

Conference: 42nd International Conference on Machine Learning
Abbreviated title: ICML 2025
Country/Territory: Canada
City: Vancouver
Period: 13/07/25 - 19/07/25

User-Defined Keywords

  • Protein Language Model
  • Steering
  • Protein Engineering
