Granular-ball representation-based two-stage deep learning model for text classification

  • Wenbin Qian*
  • Ying He
  • Xingxing Cai
  • Jintao Huang

  *Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

Abstract

Text classification, the automatic assignment of texts to specific categories, has a broad range of real-world applications. However, many existing approaches rely on pre-trained language models that, while efficient at learning global linguistic patterns, may struggle to map abstract labels to textual data. Concerns have also been raised about the robustness of these models and the lack of transparency in their decision-making. To address these issues, this paper introduces a novel two-stage learning model for text classification based on granular-ball representation (TSM-GBR). In the first stage, texts are transformed into embedding vectors, and granular-balls are generated from these vectors. A hierarchical strategy based on three-way decision is then devised to compute the semantic information of labels, and the concept of text confidence is introduced to handle samples that the granular-ball model cannot classify effectively. In the second stage, the semantic representation of word embeddings is refined using the actual semantics of the labels, and texts with low confidence are classified further. Because deep learning models that process semantic information at a single granularity are limited, a dual channel pooling model is designed that uses max-pooling and mean-pooling to extract multi-granularity information from the text. Compared with the baseline methods, the proposed model achieves competitive accuracy and F1-score across various datasets. Extensive comparative experiments confirm that the comprehensive integration of label information significantly enhances text classification. The source code is available at https://gitee.com/TomisHy/tsm-gbr/tree/master/.
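
The dual channel pooling idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes that the two channels are computed over the token axis of a text's embedding matrix and then concatenated; the concatenation step and the function name `dual_channel_pool` are assumptions made for illustration only.

```python
from typing import List, Sequence


def dual_channel_pool(token_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Pool a (tokens x dim) embedding matrix through two channels.

    Assumption: the max-pooled and mean-pooled vectors are concatenated
    into one feature vector of length 2 * dim. The abstract does not say
    how the two channels are combined.
    """
    if not token_embeddings:
        raise ValueError("need at least one token embedding")
    dim = len(token_embeddings[0])
    if any(len(vec) != dim for vec in token_embeddings):
        raise ValueError("all token embeddings must share one dimension")

    # Transpose so that each column holds one embedding dimension across tokens.
    columns = list(zip(*token_embeddings))
    max_channel = [max(col) for col in columns]              # strongest signal per dimension
    mean_channel = [sum(col) / len(col) for col in columns]  # average signal per dimension
    return max_channel + mean_channel


# Toy input: three tokens with two-dimensional embeddings.
tokens = [[1.0, -2.0], [3.0, 0.0], [2.0, 2.0]]
pooled = dual_channel_pool(tokens)
print(pooled)  # [3.0, 2.0, 2.0, 0.0]
```

Max-pooling keeps the most salient value in each dimension, while mean-pooling summarizes the text as a whole, so the combined vector carries information at two granularities.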

Original language: English
Article number: 1129
Number of pages: 15
Journal: Applied Intelligence
Volume: 55
Issue number: 18
Early online date: 3 Dec 2025
DOIs
Publication status: Published - Dec 2025

User-Defined Keywords

  • Granular-ball computing
  • Label information
  • Neural network
  • Text classification
  • Three-way decision
