TY - GEN
T1 - Generating synthetic graphs for large sensitive and correlated social networks
AU - Ju, Xin
AU - Zhang, Xiaofeng
AU - CHEUNG, Kwok Wai
N1 - Funding Information:
This paper is partially supported by the National Science Foundation of China under Grant No. 61872108, and the Shenzhen Science and Technology Program under Grant No. JCYJ20170811153507788.
PY - 2019/4
Y1 - 2019/4
N2 - With the fast development of social networks, there exists a huge amount of user information as well as users' social ties. Such information generally contains sensitive and correlated personal data. How to accurately analyze this large and correlated social graph data while protecting users' privacy has become a challenging research issue. In the literature, various research efforts have been made on this topic. Unfortunately, correlation-based privacy protection techniques for social graph data have seldom been investigated. To the best of our knowledge, this paper is the first attempt to resolve this research issue. In particular, this paper first protects users' raw data via a local differential privacy technique, and then a correlation-based privacy protection approach is designed. Last, a K-means algorithm is applied to the perturbed local data, and the clustering results are used to generate synthetic graphs for further data analytics. Experiments are conducted on two real-world datasets, i.e., the Facebook dataset and the Enron dataset, and the promising experimental results demonstrate that the proposed approach is superior to the state-of-the-art LDPGen and the baseline method, e.g., DGG, with respect to the accuracy and utility evaluation criteria.
AB - With the fast development of social networks, there exists a huge amount of user information as well as users' social ties. Such information generally contains sensitive and correlated personal data. How to accurately analyze this large and correlated social graph data while protecting users' privacy has become a challenging research issue. In the literature, various research efforts have been made on this topic. Unfortunately, correlation-based privacy protection techniques for social graph data have seldom been investigated. To the best of our knowledge, this paper is the first attempt to resolve this research issue. In particular, this paper first protects users' raw data via a local differential privacy technique, and then a correlation-based privacy protection approach is designed. Last, a K-means algorithm is applied to the perturbed local data, and the clustering results are used to generate synthetic graphs for further data analytics. Experiments are conducted on two real-world datasets, i.e., the Facebook dataset and the Enron dataset, and the promising experimental results demonstrate that the proposed approach is superior to the state-of-the-art LDPGen and the baseline method, e.g., DGG, with respect to the accuracy and utility evaluation criteria.
KW - Local Differential Privacy
KW - Social Graph
KW - Social Network Analysis
UR - http://www.scopus.com/inward/record.url?scp=85069217992&partnerID=8YFLogxK
U2 - 10.1109/ICDEW.2019.00007
DO - 10.1109/ICDEW.2019.00007
M3 - Conference proceeding
AN - SCOPUS:85069217992
T3 - Proceedings - 2019 IEEE 35th International Conference on Data Engineering Workshops, ICDEW 2019
SP - 286
EP - 293
BT - Proceedings - 2019 IEEE 35th International Conference on Data Engineering Workshops, ICDEW 2019
PB - IEEE
T2 - 35th IEEE International Conference on Data Engineering Workshops, ICDEW 2019
Y2 - 8 April 2019 through 12 April 2019
ER -