Abstract
We study stochastic composite mirror descent, a class of scalable algorithms that exploit the geometry and composite structure of a problem. We consider both convex and strongly convex objectives with non-smooth loss functions, and for each we establish high-probability convergence rates that are optimal up to a logarithmic factor. We apply the derived computational error bounds to study the generalization performance of multi-pass stochastic gradient descent (SGD) in a non-parametric setting. Our high-probability generalization bounds enjoy a logarithmic dependence on the number of passes provided that the step size sequence is square-summable, which improves on existing in-expectation bounds with a polynomial dependence and therefore strongly justifies the ability of multi-pass SGD to overcome overfitting. Our analysis removes the boundedness assumptions on subgradients often imposed in the literature. Numerical results are reported to support our theoretical findings.
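As a rough illustration of the setting (not the paper's exact algorithm or step-size schedule), the Euclidean special case of stochastic composite mirror descent is multi-pass proximal SGD: take a stochastic gradient of the smooth loss, then apply the proximal operator of the non-smooth regularizer. The sketch below runs it on a lasso-type objective with the square-summable step sizes η_t = η₀/t; the function names `composite_sgd` and `soft_threshold` are illustrative, not from the paper.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1: handles the composite (l1) term in closed form.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def composite_sgd(A, b, lam=0.1, passes=5, eta0=0.5, seed=0):
    """Multi-pass proximal SGD on (1/2n)||Ax - b||^2 + lam * ||x||_1,
    the Euclidean instance of stochastic composite mirror descent."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    t = 0
    for _ in range(passes):
        for i in rng.permutation(n):  # one pass = one shuffled sweep over the data
            t += 1
            eta = eta0 / t  # square-summable step sizes: sum of eta_t^2 is finite
            g = (A[i] @ x - b[i]) * A[i]  # stochastic gradient of the smooth part
            x = soft_threshold(x - eta * g, eta * lam)  # gradient step + prox step
    return x
```

A non-Euclidean mirror map (e.g. the negative entropy over the simplex) would replace the gradient/prox step with a Bregman projection, which is where the "geometry" of the problem enters.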
| Original language | English |
|---|---|
| Title of host publication | Advances in Neural Information Processing Systems 31 |
| Editors | S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, R. Garnett |
| Pages | 1519-1529 |
| Number of pages | 11 |
| Volume | 31 |
| ISBN (Electronic) | 9781510884472 |
| Publication status | Published - Dec 2018 |
| Event | 32nd Conference on Neural Information Processing Systems, NeurIPS 2018 - Montreal, Canada<br>Duration: 2 Dec 2018 → 8 Dec 2018<br>https://neurips.cc/Conferences/2018<br>https://proceedings.neurips.cc/paper/2018 |
Conference
| Conference | 32nd Conference on Neural Information Processing Systems, NeurIPS 2018 |
|---|---|
| Country/Territory | Canada |
| City | Montreal |
| Period | 2/12/18 → 8/12/18 |
| Internet address | https://neurips.cc/Conferences/2018 |