TY - GEN
T1 - LSCALE: Latent Space Clustering-Based Active Learning for Node Classification
AU - Liu, Juncheng
AU - Wang, Yiwei
AU - Hooi, Bryan
AU - Yang, Renchi
AU - Xiao, Xiaokui
N1 - Funding Information:
This paper is supported by the Ministry of Education, Singapore (Grant Number MOE2018-T2-2-091) and A*STAR, Singapore (Number A19E3b0099).
Publisher Copyright:
© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2023/3/17
Y1 - 2023/3/17
N2 - Node classification on graphs is an important task in many practical domains. It usually requires labels for training, which can be difficult or expensive to obtain in practice. Given a budget for labelling, active learning aims to improve performance by carefully choosing which nodes to label. Previous graph active learning methods learn representations using labelled nodes and select some unlabelled nodes for label acquisition. However, they do not fully utilize the representation power present in unlabelled nodes. We argue that the representation power in unlabelled nodes can be useful for active learning and for further improving performance of active learning for node classification. In this paper, we propose a latent space clustering-based active learning framework for node classification (LSCALE), where we fully utilize the representation power in both labelled and unlabelled nodes. Specifically, to select nodes for labelling, our framework uses the K-Medoids clustering algorithm on a latent space based on a dynamic combination of both unsupervised features and supervised features. In addition, we design an incremental clustering module to avoid redundancy between nodes selected at different steps. Extensive experiments on five datasets show that our proposed framework LSCALE consistently and significantly outperforms the state-of-the-art approaches by a large margin.
UR - http://www.scopus.com/inward/record.url?scp=85151045342&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-26387-3_4
DO - 10.1007/978-3-031-26387-3_4
M3 - Conference proceeding
AN - SCOPUS:85151045342
SN - 9783031263866
T3 - Lecture Notes in Computer Science
SP - 55
EP - 70
BT - Machine Learning and Knowledge Discovery in Databases
A2 - Amini, Massih-Reza
A2 - Canu, Stéphane
A2 - Fischer, Asja
A2 - Guns, Tias
A2 - Kralj Novak, Petra
A2 - Tsoumakas, Grigorios
PB - Springer Cham
T2 - 22nd Joint European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2022
Y2 - 19 September 2022 through 23 September 2022
ER -