Abstract
We study stochastic composite mirror descent, a class of scalable algorithms that exploit the geometry and composite structure of a problem. We consider both convex and strongly convex objectives with non-smooth loss functions, and for each we establish high-probability convergence rates that are optimal up to a logarithmic factor. We apply the derived computational error bounds to study the generalization performance of multi-pass stochastic gradient descent (SGD) in a non-parametric setting. Our high-probability generalization bounds enjoy a logarithmic dependency on the number of passes provided that the step size sequence is square-summable; this improves on existing in-expectation bounds with a polynomial dependency, and therefore gives a strong justification for the ability of multi-pass SGD to overcome overfitting. Our analysis removes the boundedness assumptions on subgradients often imposed in the literature. Numerical results are reported to support our theoretical findings.
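To make the two ingredients named in the abstract concrete, the following is a minimal sketch (not the paper's algorithm) of multi-pass stochastic composite mirror descent under the simplest assumptions: the Euclidean mirror map ½‖w‖² (so the mirror step reduces to proximal SGD), a non-smooth hinge loss, an ℓ1 composite term, and square-summable step sizes η_t = η₀·t^(−θ) with θ > 1/2. All function and parameter names here are illustrative.

```python
import numpy as np

def soft_threshold(w, tau):
    """Proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(w) * np.maximum(np.abs(w) - tau, 0.0)

def multipass_composite_sgd(X, y, lam=1e-3, eta0=0.1, theta=0.75,
                            n_passes=5, seed=0):
    """Illustrative multi-pass stochastic composite mirror descent with
    the Euclidean mirror map (i.e., proximal SGD) on hinge loss + l1.
    Step sizes eta_t = eta0 * t**(-theta) are square-summable for
    theta > 1/2, the regime the abstract's generalization bound assumes."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    t = 0
    for _ in range(n_passes):            # multiple passes over the data
        for i in rng.permutation(n):     # one stochastic sample per step
            t += 1
            eta = eta0 * t ** (-theta)
            # Subgradient of the non-smooth hinge loss at sample i.
            margin = y[i] * (X[i] @ w)
            g = -y[i] * X[i] if margin < 1.0 else np.zeros(d)
            # Gradient step on the loss, then the proximal step
            # handling the composite l1 term.
            w = soft_threshold(w - eta * g, eta * lam)
    return w

# Toy usage on synthetic sparse classification data.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 10))
w_star = np.zeros(10)
w_star[:3] = 1.0
y = np.sign(X @ w_star + 0.1 * rng.standard_normal(200))
print(multipass_composite_sgd(X, y))
```

With a non-Euclidean mirror map (e.g., the entropic map on the simplex) the last update would instead be carried out in the dual space; the Euclidean case above is chosen only because its proximal step has the familiar closed form.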
Original language | English |
---|---|
Title of host publication | Advances in Neural Information Processing Systems 31 |
Editors | S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, R. Garnett |
Pages | 1519-1529 |
Number of pages | 11 |
Volume | 31 |
ISBN (Electronic) | 9781510884472 |
Publication status | Published - Dec 2018 |
Event | 32nd Conference on Neural Information Processing Systems, NeurIPS 2018 - Montreal, Canada |

Conference
Conference | 32nd Conference on Neural Information Processing Systems, NeurIPS 2018 |
---|---|
Country/Territory | Canada |
City | Montreal |
Period | 2 Dec 2018 → 8 Dec 2018 |
Internet address | https://neurips.cc/Conferences/2018, https://proceedings.neurips.cc/paper/2018 |
Scopus Subject Areas
- Computer Networks and Communications
- Information Systems
- Signal Processing