TY - JOUR
T1 - Federated Semi-Supervised Learning with Annotation Heterogeneity
AU - Shang, Xinyi
AU - Huang, Gang
AU - Lu, Yang
AU - Lou, Jian
AU - Han, Bo
AU - Cheung, Yiu-ming
AU - Wang, Hanzi
N1 - This study was supported in part by the National Natural Science Foundation of China under Grants 62376233, 62431004, 62206207; in part by the Zhejiang Provincial Natural Science Foundation of China under Grant LZ25E070002; in part by the Guangdong and Hong Kong Universities "1+1+1" Cross-Campus Research Collaboration Scheme under Grant 2025A0505000004; in part by the Initiation Grant for Faculty Niche Research Areas of Hong Kong Baptist University under Grant RC-FNRA-IG/23-24/SCI/02; in part by the Natural Science Foundation of Fujian Province under Grant 2024J09001; and in part by the Xiaomi Young Talents Program.
Publisher Copyright:
© 2025 IEEE
PY - 2025/12/19
Y1 - 2025/12/19
N2 - Federated Semi-Supervised Learning (FSSL) aims to learn a global model from different clients in an environment with both labeled and unlabeled data. Most existing FSSL work assumes that both types of data are available on each client. In this paper, we study a more general problem setup of FSSL with annotation heterogeneity, where each client can hold an arbitrary percentage (0%-100%) of labeled data. To this end, we propose a novel FSSL framework called Heterogeneously Annotated Semi-Supervised LEarning (HASSLE). Specifically, it employs a dual-model approach: two models with the same architecture are trained separately, one on labeled data only and the other on unlabeled data. This design enables the framework to be applied to clients with arbitrary labeling percentages. Furthermore, a mutual learning strategy called Supervised-Unsupervised Mutual Alignment (SUMA), with global residual alignment and model proximity alignment, is proposed for the dual models. Consequently, the dual models can implicitly learn from both types of data across different clients, although each model is trained locally on only a single type of data. Experiments verify that the dual models in HASSLE trained with SUMA can effectively learn from each other, thereby utilizing the information from both types of data across different clients.
AB - Federated Semi-Supervised Learning (FSSL) aims to learn a global model from different clients in an environment with both labeled and unlabeled data. Most existing FSSL work assumes that both types of data are available on each client. In this paper, we study a more general problem setup of FSSL with annotation heterogeneity, where each client can hold an arbitrary percentage (0%-100%) of labeled data. To this end, we propose a novel FSSL framework called Heterogeneously Annotated Semi-Supervised LEarning (HASSLE). Specifically, it employs a dual-model approach: two models with the same architecture are trained separately, one on labeled data only and the other on unlabeled data. This design enables the framework to be applied to clients with arbitrary labeling percentages. Furthermore, a mutual learning strategy called Supervised-Unsupervised Mutual Alignment (SUMA), with global residual alignment and model proximity alignment, is proposed for the dual models. Consequently, the dual models can implicitly learn from both types of data across different clients, although each model is trained locally on only a single type of data. Experiments verify that the dual models in HASSLE trained with SUMA can effectively learn from each other, thereby utilizing the information from both types of data across different clients.
KW - Annotation Heterogeneity
KW - Data Heterogeneity
KW - Federated Learning
KW - Semi-Supervised Learning
UR - https://www.scopus.com/pages/publications/105025696919
U2 - 10.1109/TAI.2025.3643401
DO - 10.1109/TAI.2025.3643401
M3 - Journal article
SN - 2691-4581
JO - IEEE Transactions on Artificial Intelligence
JF - IEEE Transactions on Artificial Intelligence
ER -