TY - JOUR
T1 - High-probability generalization bounds for pointwise uniformly stable algorithms
AU - Fan, Jun
AU - Lei, Yunwen
N1 - The work by Jun Fan is partially supported by the Research Grants Council of Hong Kong [Project No. HKBU 12302819] and [Project No. HKBU 12303220], and Hong Kong Baptist University [Project No. RC-FNRA-IG/22-23/SCI/02]. The work by Yunwen Lei is partially supported by the Research Grants Council of Hong Kong [Project No. 22303723] and URC Seed Fund for Basic Research for New Staff 2023-24.
Publisher Copyright:
© 2024 Elsevier Inc.
PY - 2024/5
Y1 - 2024/5
N2 - Algorithmic stability is a fundamental concept in statistical learning theory for understanding the generalization behavior of optimization algorithms. Existing high-probability bounds are developed for the generalization gap as measured by function values and require the algorithm to be uniformly stable. In this paper, we introduce a novel stability measure, called pointwise uniform stability, by considering the sensitivity of the algorithm with respect to the perturbation of each individual training example. We show that this weaker pointwise uniform stability guarantees almost optimal bounds and yields the first high-probability bound for the generalization gap as measured by gradients. Sharper bounds are given for strongly convex and smooth problems. We further apply our general results to derive improved generalization bounds for stochastic gradient descent. As a byproduct, we develop concentration inequalities for a sum of weakly dependent vector-valued random variables.
AB - Algorithmic stability is a fundamental concept in statistical learning theory for understanding the generalization behavior of optimization algorithms. Existing high-probability bounds are developed for the generalization gap as measured by function values and require the algorithm to be uniformly stable. In this paper, we introduce a novel stability measure, called pointwise uniform stability, by considering the sensitivity of the algorithm with respect to the perturbation of each individual training example. We show that this weaker pointwise uniform stability guarantees almost optimal bounds and yields the first high-probability bound for the generalization gap as measured by gradients. Sharper bounds are given for strongly convex and smooth problems. We further apply our general results to derive improved generalization bounds for stochastic gradient descent. As a byproduct, we develop concentration inequalities for a sum of weakly dependent vector-valued random variables.
KW - Algorithmic stability
KW - Generalization analysis
KW - Learning theory
KW - Stochastic gradient descent
UR - http://www.scopus.com/inward/record.url?scp=85183943603&partnerID=8YFLogxK
U2 - 10.1016/j.acha.2024.101632
DO - 10.1016/j.acha.2024.101632
M3 - Journal article
AN - SCOPUS:85183943603
SN - 1063-5203
VL - 70
JO - Applied and Computational Harmonic Analysis
JF - Applied and Computational Harmonic Analysis
M1 - 101632
ER -