TY - JOUR
T1 - SENA: Leveraging set-level consistency adversarial learning for robust pre-trained language model adaptation
T2 - Knowledge-Based Systems
AU - Gao, Jianqi
AU - Cao, Jian
AU - Yu, Hang
AU - Zhang, Yonggang
AU - Fang, Zhen
N1 - This work is partially supported by the Program of Technology Innovation of the Science and Technology Commission of Shanghai Municipality, China (Grant Nos. 21511104700 and 22DZ1100103), as well as the Shanghai Committee of Science and Technology, China (Grant No. 23ZR1423500).
Publisher Copyright:
© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
PY - 2025/6/9
Y1 - 2025/6/9
N2 - Using pre-trained language models (PLMs) to generate embeddings for downstream tasks has achieved great success in recent years. The pre-trained embeddings can be adapted to downstream tasks by encouraging embedding similarity among samples of the same class through auxiliary tasks with contrastive learning (CL) objectives. However, existing methods face two issues: (i) class imbalance and over-representation caused by instance sampling bias in CL, and (ii) gradient conflicts between auxiliary and downstream tasks. To address these issues, we propose a novel approach called set-level consistency adversarial learning (SENA). Specifically, SENA leverages two techniques, i.e., an instance-to-set function and consistency adversarial learning, to yield task-specific embeddings. To mitigate instance sampling bias in CL, SENA incorporates set-level discriminative features into individual instance embeddings through an instance-to-set function; the resulting embeddings are then employed as prototypes for each category in contrastive learning. Additionally, to tackle gradient conflicts between CL and downstream tasks, SENA first identifies the most inconsistent cases and then eliminates the inconsistency in an adversarial learning manner. SENA is validated on the GLUE benchmark and three intent classification datasets. Comprehensive experiments demonstrate the effectiveness of SENA on various tasks.
AB - Using pre-trained language models (PLMs) to generate embeddings for downstream tasks has achieved great success in recent years. The pre-trained embeddings can be adapted to downstream tasks by encouraging embedding similarity among samples of the same class through auxiliary tasks with contrastive learning (CL) objectives. However, existing methods face two issues: (i) class imbalance and over-representation caused by instance sampling bias in CL, and (ii) gradient conflicts between auxiliary and downstream tasks. To address these issues, we propose a novel approach called set-level consistency adversarial learning (SENA). Specifically, SENA leverages two techniques, i.e., an instance-to-set function and consistency adversarial learning, to yield task-specific embeddings. To mitigate instance sampling bias in CL, SENA incorporates set-level discriminative features into individual instance embeddings through an instance-to-set function; the resulting embeddings are then employed as prototypes for each category in contrastive learning. Additionally, to tackle gradient conflicts between CL and downstream tasks, SENA first identifies the most inconsistent cases and then eliminates the inconsistency in an adversarial learning manner. SENA is validated on the GLUE benchmark and three intent classification datasets. Comprehensive experiments demonstrate the effectiveness of SENA on various tasks.
KW - Contrastive learning
KW - Gradient conflicts
KW - Instance-to-set function
KW - Pre-trained language models
KW - Set-level consistency adversarial learning
UR - http://www.scopus.com/inward/record.url?scp=105008228316&partnerID=8YFLogxK
U2 - 10.1016/j.knosys.2025.113831
DO - 10.1016/j.knosys.2025.113831
M3 - Journal article
SN - 0950-7051
VL - 324
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
M1 - 113831
ER -