TY - GEN

T1 - Differentially private network data release via structural inference

AU - Xiao, Qian

AU - CHEN, Rui

AU - Tan, Kian Lee

N1 - Copyright:
Copyright 2014 Elsevier B.V., All rights reserved.

PY - 2014

Y1 - 2014

N2 - Information networks, such as social media and email networks, often contain sensitive information. Releasing such network data could seriously jeopardize individual privacy. Therefore, we need to sanitize network data before the release. In this paper, we present a novel data sanitization solution that infers a network's structure in a differentially private manner. We observe that, by estimating the connection probabilities between vertices instead of considering the observed edges directly, the noise scale enforced by differential privacy can be greatly reduced. Our proposed method infers the network structure by using a statistical hierarchical random graph (HRG) model. The guarantee of differential privacy is achieved by sampling possible HRG structures in the model space via Markov chain Monte Carlo (MCMC). We theoretically prove that the sensitivity of such inference is only O(log n), where n is the number of vertices in a network. This bound implies less noise to be injected than those of existing works. We experimentally evaluate our approach on four real-life network datasets and show that our solution effectively preserves essential network structural properties like degree distribution, shortest path length distribution and influential nodes.

AB - Information networks, such as social media and email networks, often contain sensitive information. Releasing such network data could seriously jeopardize individual privacy. Therefore, we need to sanitize network data before the release. In this paper, we present a novel data sanitization solution that infers a network's structure in a differentially private manner. We observe that, by estimating the connection probabilities between vertices instead of considering the observed edges directly, the noise scale enforced by differential privacy can be greatly reduced. Our proposed method infers the network structure by using a statistical hierarchical random graph (HRG) model. The guarantee of differential privacy is achieved by sampling possible HRG structures in the model space via Markov chain Monte Carlo (MCMC). We theoretically prove that the sensitivity of such inference is only O(log n), where n is the number of vertices in a network. This bound implies less noise to be injected than those of existing works. We experimentally evaluate our approach on four real-life network datasets and show that our solution effectively preserves essential network structural properties like degree distribution, shortest path length distribution and influential nodes.

KW - differential privacy

KW - network data

KW - structural inference

UR - http://www.scopus.com/inward/record.url?scp=84907033666&partnerID=8YFLogxK

U2 - 10.1145/2623330.2623642

DO - 10.1145/2623330.2623642

M3 - Conference proceeding

AN - SCOPUS:84907033666

SN - 9781450329569

T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

SP - 911

EP - 920

BT - KDD 2014 - Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

PB - Association for Computing Machinery (ACM)

T2 - 20th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2014

Y2 - 24 August 2014 through 27 August 2014

ER -