TY - JOUR
T1 - Privacy preservation in federated learning
T2 - An insightful survey from the GDPR perspective
AU - Truong, Nguyen
AU - Sun, Kai
AU - Wang, Siyao
AU - Guitton, Florian
AU - Guo, Yi Ke
N1 - Funding Information:
This research was supported by the HNA Research Centre for Future Data Ecosystems at Imperial College London and the Innovative Medicines Initiative 2 IDEA-FAST project under grant agreement No 853981.
PY - 2021/11
Y1 - 2021/11
N2 - In recent years, along with the blooming of Machine Learning (ML)-based applications and services, ensuring data privacy and security has become a critical obligation. ML-based service providers confront not only difficulties in collecting and managing data across heterogeneous sources but also challenges in complying with rigorous data protection regulations such as the EU/UK General Data Protection Regulation (GDPR). Furthermore, conventional centralised ML approaches have always carried long-standing privacy risks of personal data leakage, misuse, and abuse. Federated learning (FL) has emerged as a prospective solution that facilitates distributed collaborative learning without disclosing original training data. Unfortunately, retaining data and computation on-device as in FL is not sufficient for a privacy guarantee because the model parameters exchanged among participants conceal sensitive information that can be exploited in privacy attacks. Consequently, FL-based systems are not naturally compliant with the GDPR. This article is dedicated to surveying state-of-the-art privacy-preservation techniques in FL in relation to GDPR requirements. Furthermore, insights into the existing challenges are examined along with the prospective approaches, following the GDPR regulatory guidelines, that FL-based systems shall implement to fully comply with the GDPR.
AB - In recent years, along with the blooming of Machine Learning (ML)-based applications and services, ensuring data privacy and security has become a critical obligation. ML-based service providers confront not only difficulties in collecting and managing data across heterogeneous sources but also challenges in complying with rigorous data protection regulations such as the EU/UK General Data Protection Regulation (GDPR). Furthermore, conventional centralised ML approaches have always carried long-standing privacy risks of personal data leakage, misuse, and abuse. Federated learning (FL) has emerged as a prospective solution that facilitates distributed collaborative learning without disclosing original training data. Unfortunately, retaining data and computation on-device as in FL is not sufficient for a privacy guarantee because the model parameters exchanged among participants conceal sensitive information that can be exploited in privacy attacks. Consequently, FL-based systems are not naturally compliant with the GDPR. This article is dedicated to surveying state-of-the-art privacy-preservation techniques in FL in relation to GDPR requirements. Furthermore, insights into the existing challenges are examined along with the prospective approaches, following the GDPR regulatory guidelines, that FL-based systems shall implement to fully comply with the GDPR.
KW - Data protection regulation
KW - Federated learning
KW - GDPR
KW - Personal data
KW - Privacy
KW - Privacy preservation
UR - http://www.scopus.com/inward/record.url?scp=85111509883&partnerID=8YFLogxK
U2 - 10.1016/j.cose.2021.102402
DO - 10.1016/j.cose.2021.102402
M3 - Journal article
AN - SCOPUS:85111509883
SN - 0167-4048
VL - 110
JO - Computers & Security
JF - Computers & Security
M1 - 102402
ER -