TY - JOUR
T1 - Federated Principal Component Analysis for Vertically Partitioned Data
AU - Cheung, Yiu-ming
AU - Zhang, Yonggang
AU - Jiang, Juyong
AU - Yu, Feng
AU - Lou, Jian
N1 - This work was supported in part by the grant for Faculty Niche Research Areas (FNRA) of Hong Kong Baptist University with the grant: RC-FNRA-IG/23-24/SCI/02, and the RGC Senior Research Fellow Scheme with the grant: SRFS2324-2S02.
PY - 2025/11/12
Y1 - 2025/11/12
N2 - Principal Component Analysis (PCA) remains a fundamental technique for unsupervised dimensionality reduction. However, its traditional centralized implementation poses challenges in the context of increasing data privacy concerns, particularly in scenarios involving vertically partitioned data across multiple clients. To address these challenges, we introduce VFedPCA and VFedAKPCA, pioneering federated algorithms designed for linear and nonlinear PCA in such distributed environments. These algorithms facilitate collaborative PCA computations across distributed clients while preserving data privacy by avoiding raw data exchanges. VFedPCA leverages a local power iteration strategy enhanced by a warm-start mechanism to accelerate convergence. Meanwhile, VFedAKPCA innovatively extends this approach to kernel spaces, employing a novel weight-scaling technique for effective nonlinear feature extraction. We validate the efficacy of our proposed methods through extensive experiments on five real-world datasets, evaluating both server-client and peer-to-peer architectures. Our results indicate that these federated approaches achieve performance on par with traditional centralized PCA methods. The implementation code is publicly accessible at https://github.com/juyongjiang/VFedAKPCA, facilitating further research and application.
AB - Principal Component Analysis (PCA) remains a fundamental technique for unsupervised dimensionality reduction. However, its traditional centralized implementation poses challenges in the context of increasing data privacy concerns, particularly in scenarios involving vertically partitioned data across multiple clients. To address these challenges, we introduce VFedPCA and VFedAKPCA, pioneering federated algorithms designed for linear and nonlinear PCA in such distributed environments. These algorithms facilitate collaborative PCA computations across distributed clients while preserving data privacy by avoiding raw data exchanges. VFedPCA leverages a local power iteration strategy enhanced by a warm-start mechanism to accelerate convergence. Meanwhile, VFedAKPCA innovatively extends this approach to kernel spaces, employing a novel weight-scaling technique for effective nonlinear feature extraction. We validate the efficacy of our proposed methods through extensive experiments on five real-world datasets, evaluating both server-client and peer-to-peer architectures. Our results indicate that these federated approaches achieve performance on par with traditional centralized PCA methods. The implementation code is publicly accessible at https://github.com/juyongjiang/VFedAKPCA, facilitating further research and application.
KW - Advanced Kernel PCA
KW - Feature-wise Distributed Data
KW - Federated Learning
KW - Kernel PCA
KW - PCA
UR - https://www.scopus.com/pages/publications/105021530175
U2 - 10.1109/TCDS.2025.3631744
DO - 10.1109/TCDS.2025.3631744
M3 - Journal article
SN - 2379-8920
SP - 1
EP - 14
JO - IEEE Transactions on Cognitive and Developmental Systems
JF - IEEE Transactions on Cognitive and Developmental Systems
ER -