CoTea: Collaborative teaching for low-resource named entity recognition with a divide-and-conquer strategy

Zhiwei Yang, Jing Ma, Kang Yang, Huiru Lin*, Hechang Chen, Ruichao Yang, Yi Chang

*Corresponding author for this work

Research output: Contribution to journal; Journal article; peer-review

Abstract

Low-resource named entity recognition (NER) aims to identify entity mentions when training data is scarce. Recent approaches resort to distant data with manual dictionaries for improvement, but such dictionaries are not always available for the target domain and have limited coverage of entities, which may introduce noise. In this paper, we propose a novel Collaborative Teaching (CoTea) framework for low-resource NER with a few supporting labeled examples, which can automatically augment training data and reduce label noise. Specifically, CoTea utilizes the entities in the supporting labeled examples to retrieve entity-related unlabeled data heuristically and then generates accurate distant labels with a novel mining-refining iterative mechanism. For optimizing distant labels, the mechanism mines potential entities from non-entity tokens with a recognition teacher and then refines entity labels with another prompt-based discrimination teacher in a divide-and-conquer manner. Experimental results on two benchmark datasets demonstrate that CoTea outperforms state-of-the-art baselines in low-resource settings and achieves 85% and 65% performance levels of the best high-resource baseline methods by merely utilizing about 2% of labeled data.
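The retrieve, mine, and refine steps described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`retrieve`, `distant_labels`, `mine_and_refine`), the single-token entities, and both rule-based "teachers" are assumptions made purely for illustration. In the paper the teachers are learned models, and the discrimination teacher is prompt-based.

```python
from typing import Callable, Dict, List

Tokens = List[str]
Labels = List[str]  # "O" marks a non-entity token; anything else is an entity type


def retrieve(support: Dict[str, str], unlabeled: List[Tokens]) -> List[Tokens]:
    """Keep unlabeled sentences that mention at least one support entity (hypothetical heuristic)."""
    return [sent for sent in unlabeled if any(tok in support for tok in sent)]


def distant_labels(tokens: Tokens, support: Dict[str, str]) -> Labels:
    """Initial distant labels: exact match against the support entities."""
    return [support.get(tok, "O") for tok in tokens]


def mine_and_refine(
    tokens: Tokens,
    labels: Labels,
    recognizer: Callable[[Tokens], Labels],
    discriminator: Callable[[Tokens, int, str], bool],
    rounds: int = 2,
) -> Labels:
    """Alternate mining (on non-entity tokens) and refining (on entity tokens)."""
    labels = list(labels)
    for _ in range(rounds):
        # Mining: the recognition teacher may promote tokens currently labeled "O".
        proposed = recognizer(tokens)
        for i, guess in enumerate(proposed):
            if labels[i] == "O" and guess != "O":
                labels[i] = guess
        # Refining: each entity label is checked on its own (divide and conquer).
        for i, lab in enumerate(labels):
            if lab != "O" and not discriminator(tokens, i, lab):
                labels[i] = "O"
    return labels


# Toy stand-ins for the two teachers (assumptions, not learned models).
def toy_recognizer(tokens: Tokens) -> Labels:
    # Over-eager rule: every capitalized token is proposed as a location.
    return ["LOC" if tok[:1].isupper() else "O" for tok in tokens]


def toy_discriminator(tokens: Tokens, index: int, label: str) -> bool:
    # Accepts a label only if the token is in a small hypothetical type list.
    known = {"LOC": {"Paris", "Berlin"}}
    return tokens[index] in known.get(label, set())


support = {"Paris": "LOC"}
unlabeled = [["Alice", "visited", "Paris", "and", "Berlin"], ["it", "rained"]]

retrieved = retrieve(support, unlabeled)
sentence = retrieved[0]
initial = distant_labels(sentence, support)
final = mine_and_refine(sentence, initial, toy_recognizer, toy_discriminator)
print(initial)
print(final)
```

In this toy run the mining step adds "Alice" and "Berlin" as candidate locations, and the refining step rejects "Alice", so the distant labels gain one entity without keeping the noisy one. Retraining the teachers between rounds is omitted here.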

Original language: English
Article number: 103657
Number of pages: 17
Journal: Information Processing and Management
Volume: 61
Issue number: 3
Publication status: Published - May 2024

Scopus Subject Areas

  • Information Systems
  • Media Technology
  • Computer Science Applications
  • Management Science and Operations Research
  • Library and Information Sciences

User-Defined Keywords

  • Collaborative teaching
  • Divide-and-conquer
  • Low resource
  • Named entity recognition
