Fine-Grained analysis of stability and generalization for stochastic gradient descent

Yunwen Lei, Yiming Ying*

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding › Conference proceeding › peer-review

26 Citations (Scopus)

Abstract

Recently there are a considerable amount of work devoted to the study of the algorithmic stability and generalization for stochastic gradient descent (SGD). However, the existing stability analysis requires to impose restrictive assumptions on the boundedness of gradients, smoothness and convexity of loss functions. In this paper, we provide a fine-grained analysis of stability and generalization for SGD by substantially relaxing these assumptions. Firstly, we establish stability and generalization for SGD by removing the existing bounded gradient assumptions. The key idea is the introduction of a new stability measure called on-average model stability, for which we develop novel bounds controlled by the risks of SGD iterates. This yields generalization bounds depending on the behavior of the best model, and leads to the first-ever-known fast bounds in the low-noise setting using stability approach. Secondly, the smoothness assumption is relaxed by considering loss functions with Hölder continuous (sub)gradients for which we show that optimal bounds are still achieved by balancing computation and stability. To our best knowledge, this gives the first-ever-known stability and generalization bounds for SGD with non-smooth loss functions (e.g., hinge loss). Finally, we study learning problems with (strongly) convex objectives but non-convex loss functions.

Original language: English
Title of host publication: Proceedings of the 37th International Conference on Machine Learning, ICML 2020
Editors: Hal Daumé III, Aarti Singh
Publisher: ML Research Press
Pages: 5809-5819
Number of pages: 11
ISBN (Electronic): 9781713821120
Publication status: Published - Jul 2020
Event: 37th International Conference on Machine Learning, ICML 2020 - Virtual, Online
Duration: 13 Jul 2020 to 18 Jul 2020
https://proceedings.mlr.press/v119/

Publication series

Name: Proceedings of Machine Learning Research
Volume: 119
ISSN (Print): 2640-3498

Conference

Conference: 37th International Conference on Machine Learning, ICML 2020
Period: 13/07/20 to 18/07/20
Internet address

Scopus Subject Areas

  • Computational Theory and Mathematics
  • Human-Computer Interaction
  • Software
