Abstract
Pre-trained language models (PLMs) have shown significant success in various downstream tasks by providing initial parameters for task-specific fine-tuning. An inherent challenge of this approach is that adapting solely to downstream tasks may cause the model to forget pre-trained knowledge, which limits fine-tuning performance. To tackle this challenge, we propose a novel approach called EGO-PLM, in which PLMs serve as task-specific embedding generators. The underlying insight of EGO-PLM is to align the fine-tuning tasks for PLMs with those used during the pre-training phase. Within this framework, we design a task-agnostic pre-defined task that is similar to the tasks used in pre-training, together with a task-specific embedding generator that adapts to specific tasks, so that the specific task can be trained jointly with the pre-defined task. To alleviate conflicts between the pre-defined and task-specific tasks and to ensure that the generated embeddings are task-specific, we propose consistency calibration (CoCa), which aligns the pre-defined objectives with the task-specific ones. Specifically, CoCa identifies inconsistencies between the pre-defined and task-specific objectives in an adversarial manner and then calibrates these disparities through adversarial training. We validate the effectiveness of EGO-PLM on 8 datasets across 6 task categories, demonstrating consistent and substantial improvements over state-of-the-art baselines.
| Original language | English |
|---|---|
| Article number | 107754 |
| Number of pages | 11 |
| Journal | Neural Networks |
| Volume | 191 |
| Early online date | 21 Jun 2025 |
| Publication status | Published - Nov 2025 |
User-Defined Keywords
- Adversarial training
- Consistency calibration (CoCa)
- Forgetting of pre-trained knowledge
- Pre-trained language models
- Task-specific embedding generator
- Task-specific fine-tuning
Fingerprint
Dive into the research topics of 'Shaping pre-trained language models for task-specific embedding generation via consistency calibration'. Together they form a unique fingerprint.