TY - JOUR
T1 - Differentially private SGD with non-smooth losses
AU - Wang, Puyu
AU - Lei, Yunwen
AU - Ying, Yiming
AU - Zhang, Hai
N1 - Funding Information:
This work was done while Puyu Wang was a visiting student at SUNY Albany. The corresponding author is Yiming Ying, whose work is supported by the National Science Foundation (NSF) under grants DMS-2110836, IIS-2110546, IIS-1816227, and IIS-2008532. The work of Hai Zhang is supported by the National Science Foundation of China (NSFC) under grant U1811461.
Publisher Copyright:
© 2021 Elsevier Inc.
PY - 2022/1
Y1 - 2022/1
N2 - In this paper, we study differentially private stochastic gradient descent (SGD) algorithms in the setting of stochastic convex optimization (SCO). Most of the existing work requires the loss to be Lipschitz continuous and strongly smooth, and the model parameter to be uniformly bounded. However, these assumptions are restrictive, as many popular losses violate them, including the hinge loss for SVM, the absolute loss in robust regression, and even the least squares loss in an unbounded domain. We significantly relax these assumptions and establish privacy and generalization (utility) guarantees for private SGD algorithms using output and gradient perturbations associated with non-smooth convex losses. Specifically, the loss function is only required to have an α-Hölder continuous gradient (referred to as α-Hölder smoothness), which includes Lipschitz continuity (α=0) and strong smoothness (α=1) as special cases. We prove that noisy SGD with α-Hölder smooth losses using gradient perturbation can guarantee (ϵ,δ)-differential privacy (DP) and attain optimal excess population risk [Formula presented], up to logarithmic terms, with the gradient complexity [Formula presented]. This shows an important trade-off between the α-Hölder smoothness of the loss and the computational complexity of private SGD with statistically optimal performance. In particular, our results indicate that α-Hölder smoothness with α≥1/2 is sufficient to guarantee (ϵ,δ)-DP of noisy SGD algorithms while achieving optimal excess risk with a linear gradient complexity O(n).
KW - Algorithmic stability
KW - Differential privacy
KW - Generalization
KW - Stochastic gradient descent
UR - http://www.scopus.com/inward/record.url?scp=85116046185&partnerID=8YFLogxK
U2 - 10.1016/j.acha.2021.09.001
DO - 10.1016/j.acha.2021.09.001
M3 - Journal article
AN - SCOPUS:85116046185
SN - 1063-5203
VL - 56
SP - 306
EP - 336
JO - Applied and Computational Harmonic Analysis
JF - Applied and Computational Harmonic Analysis
ER -