Abstract
With the increasing prevalence of information networks, research on privacy-preserving network data publishing has received substantial attention recently. There are two streams of relevant research, targeting different privacy requirements. A large body of existing works focus on preventing node re-identification against adversaries with structural background knowledge, while some other studies aim to thwart edge disclosure. In general, the line of research on preventing edge disclosure is less fruitful, largely due to lack of a formal privacy model. The recent emergence of differential privacy has shown great promise for rigorous prevention of edge disclosure. Yet recent research indicates that differential privacy is vulnerable to data correlation, which hinders its application to network data that may be inherently correlated. In this paper, we show that differential privacy could be tuned to provide provable privacy guarantees even in the correlated setting by introducing an extra parameter, which measures the extent of correlation. We subsequently provide a holistic solution for non-interactive network data publication. First, we generate a private vertex labeling for a given network dataset to make the corresponding adjacency matrix form dense clusters. Next, we adaptively identify dense regions of the adjacency matrix by a data-dependent partitioning process. Finally, we reconstruct a noisy adjacency matrix by a novel use of the exponential mechanism. To our best knowledge, this is the first work providing a practical solution for publishing real-life network data via differential privacy. Extensive experiments demonstrate that our approach performs well on different types of real-life network datasets.
Original language | English |
---|---|
Pages (from-to) | 653-676 |
Number of pages | 24 |
Journal | VLDB Journal |
Volume | 23 |
Issue number | 4 |
DOIs | |
Publication status | Published - Aug 2014 |
Scopus Subject Areas
- Information Systems
- Hardware and Architecture
User-Defined Keywords
- Data correlation
- Differential privacy
- Network data
- Non-interactive publication