Shaping pre-trained language models for task-specific embedding generation via consistency calibration

Jianqi Gao, Hang Yu, Yiu Ming Cheung, Jian Cao*, Raymond Chi Wing Wong, Yonggang Zhang*

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

Abstract

Pre-trained language models (PLMs) have shown significant success in various downstream tasks by providing initial parameters for task-specific fine-tuning. An inherent challenge of this approach is that adapting solely to downstream tasks may lead to the forgetting of pre-trained knowledge, resulting in limited fine-tuning performance on downstream tasks. To tackle this challenge, we propose a novel approach called EGO-PLM, in which PLMs serve as task-specific embedding generators. The underlying insight of EGO-PLM is to align the fine-tuning tasks for PLMs with those utilized during the pre-training phase. Within this framework, we design a task-agnostic pre-defined task that resembles the pre-training objective, together with a task-specific embedding generator that adapts to specific tasks, enabling the specific task to be trained jointly with the pre-defined task. To alleviate conflicts between the pre-defined and task-specific tasks and to ensure that the generated embeddings are task-specific, we propose consistency calibration (CoCa), which aligns the pre-defined objectives with the task-specific ones. Specifically, CoCa identifies inconsistencies between the pre-defined and task-specific objectives in an adversarial manner, and subsequently calibrates these disparities through adversarial training. We validate the effectiveness of EGO-PLM on 8 datasets across 6 task categories, demonstrating consistent and substantial improvements over state-of-the-art baselines.
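The abstract describes joint training of a pre-training-like (pre-defined) objective with a task-specific embedding generator, plus an adversarial consistency-calibration term. The following is a minimal, hypothetical PyTorch sketch of that idea; the module names, the inconsistency measure, the perturbation step, and all loss weights are illustrative assumptions, not the authors' implementation.

# Hypothetical sketch of joint pre-defined + task-specific training with an
# adversarial consistency term (illustrative only; not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyEncoder(nn.Module):
    """Stand-in for a pre-trained language model backbone."""
    def __init__(self, vocab_size=1000, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)

    def forward(self, ids):
        return self.layer(self.emb(ids))              # (batch, seq, dim)

class EGOPLMSketch(nn.Module):
    def __init__(self, vocab_size=1000, dim=64, num_classes=2):
        super().__init__()
        self.encoder = TinyEncoder(vocab_size, dim)
        self.mlm_head = nn.Linear(dim, vocab_size)    # pre-defined, pre-training-like task
        self.task_head = nn.Linear(dim, num_classes)  # task-specific embedding generator

    def forward(self, ids):
        h = self.encoder(ids)
        return self.mlm_head(h), self.task_head(h.mean(dim=1)), h

def consistency_gap(mlm_logits, task_logits):
    """Illustrative inconsistency measure between the two objectives
    (an assumption for this sketch): compare the entropies of the two heads."""
    p_mlm = F.softmax(mlm_logits.mean(dim=1), dim=-1)
    p_task = F.softmax(task_logits, dim=-1)
    ent_mlm = -(p_mlm * p_mlm.clamp_min(1e-8).log()).sum(-1)
    ent_task = -(p_task * p_task.clamp_min(1e-8).log()).sum(-1)
    return (ent_mlm - ent_task).abs().mean()

model = EGOPLMSketch()
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

ids = torch.randint(0, 1000, (8, 16))          # toy token ids
mlm_targets = torch.randint(0, 1000, (8, 16))  # toy pre-defined-task targets
labels = torch.randint(0, 2, (8,))             # toy task-specific labels

for _ in range(3):
    mlm_logits, task_logits, h = model(ids)
    loss_pre = F.cross_entropy(mlm_logits.transpose(1, 2), mlm_targets)
    loss_task = F.cross_entropy(task_logits, labels)

    # CoCa-style step (sketch): perturb the embeddings in the direction that
    # maximizes the inconsistency, then penalize the perturbed inconsistency.
    gap = consistency_gap(mlm_logits, task_logits)
    grad = torch.autograd.grad(gap, h, retain_graph=True)[0]
    delta = 0.01 * grad.sign()
    mlm_adv = model.mlm_head(h + delta)
    task_adv = model.task_head((h + delta).mean(dim=1))
    loss_coca = consistency_gap(mlm_adv, task_adv)

    loss = loss_pre + loss_task + 0.1 * loss_coca
    opt.zero_grad()
    loss.backward()
    opt.step()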
Original language: English
Article number: 107754
Number of pages: 11
Journal: Neural Networks
Volume: 191
Early online date: 21 Jun 2025
DOIs
Publication status: E-pub ahead of print - 21 Jun 2025

User-Defined Keywords

  • Adversarial training
  • Consistency calibration (CoCa)
  • Forgetting of pre-trained knowledge
  • Pre-trained language models
  • Task-specific embedding generator
  • Task-specific fine-tuning
