TY - JOUR
T1 - Shaping pre-trained language models for task-specific embedding generation via consistency calibration
AU - Gao, Jianqi
AU - Yu, Hang
AU - Cheung, Yiu-ming
AU - Cao, Jian
AU - Wong, Raymond Chi-Wing
AU - Zhang, Yonggang
N1 - Funding information:
This work is supported by the Interdisciplinary Program of Shanghai Jiao Tong University (Grant No. YG2024QNB05).
Publisher copyright:
© 2025 Elsevier Ltd.
PY - 2025/6/21
Y1 - 2025/6/21
AB - Pre-trained language models (PLMs) have shown significant success in various downstream tasks by providing initial parameters for task-specific fine-tuning. An inherent challenge of this approach is that adapting solely to downstream tasks may lead to the forgetting of pre-trained knowledge, resulting in limited fine-tuning performance. To tackle this challenge, we propose a novel approach called EGO-PLM, where PLMs serve as task-specific embedding generators. The underlying insight of EGO-PLM is to align the fine-tuning tasks for PLMs with those utilized during the pre-training phase. Within this framework, we design a task-agnostic pre-defined task that resembles those of the pre-training phase and a task-specific embedding generator that adapts to specific tasks, enabling the specific task to be trained jointly with the pre-defined task. To alleviate conflicts between the pre-defined and task-specific tasks and to ensure that the generated embeddings are task-specific, we propose consistency calibration (CoCa), which aligns the pre-defined objectives with the task-specific ones. Specifically, CoCa identifies inconsistencies between the pre-defined and task-specific objectives in an adversarial manner and subsequently calibrates these disparities through adversarial training. We validate the effectiveness of EGO-PLM on 8 datasets across 6 task categories, demonstrating consistent and substantial improvements over state-of-the-art baselines.
KW - Adversarial training
KW - Consistency calibration (CoCa)
KW - Forgetting of pre-trained knowledge
KW - Pre-trained language models
KW - Task-specific embedding generator
KW - Task-specific fine-tuning
UR - http://www.scopus.com/inward/record.url?scp=105008827343&partnerID=8YFLogxK
U2 - 10.1016/j.neunet.2025.107754
DO - 10.1016/j.neunet.2025.107754
M3 - Journal article
SN - 0893-6080
VL - 191
JO - Neural Networks
JF - Neural Networks
M1 - 107754
ER -