Abstract
In this paper we study the convergence of online gradient descent algorithms in reproducing kernel Hilbert spaces (RKHSs) without regularization. We establish a sufficient condition and a necessary condition for the convergence of excess generalization errors in expectation. A sufficient condition for almost sure convergence is also given. We provide explicit convergence rates of the excess generalization errors that hold with high probability, for both averaged iterates and the last iterate, which in turn imply convergence rates with probability one. To the best of our knowledge, this is the first high-probability convergence rate for the last iterate of online gradient descent algorithms in the general convex setting. Our results require no boundedness assumptions on the iterates. They are derived by a novel use of two measures of the algorithm's one-step progress, one in terms of generalization errors and the other in terms of distances in RKHSs, where the variances of the involved martingales are cancelled out by the descent property of the algorithm.
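To make the setting concrete, the following is a minimal sketch of unregularized online gradient descent in an RKHS. The abstract does not specify the details, so several choices here are assumptions made only for illustration: the least-squares loss, the Gaussian kernel, the polynomially decaying step size `eta1 * t ** (-theta)`, and the uniform average of the iterates. The function names are hypothetical as well.

```python
import math
import random


def gaussian_kernel(x, z, width=0.2):
    # Assumed kernel for illustration; note K(x, x) = 1.
    return math.exp(-((x - z) ** 2) / (2.0 * width ** 2))


def evaluate(centers, coeffs, x, kernel=gaussian_kernel):
    # Evaluate f(x) = sum_i coeffs[i] * K(centers[i], x).
    return sum(c * kernel(z, x) for z, c in zip(centers, coeffs))


def online_kernel_gd(stream, eta1=0.5, theta=0.5, kernel=gaussian_kernel):
    """Unregularized online gradient descent with least-squares loss.

    Update (assumed form): f_{t+1} = f_t - eta_t * (f_t(x_t) - y_t) * K(x_t, .),
    starting from f_1 = 0, with eta_t = eta1 * t ** (-theta).
    Returns the kernel centers, the coefficients of the last iterate,
    and the coefficients of the uniform average of all iterates.
    """
    centers, coeffs = [], []
    for t, (x, y) in enumerate(stream, start=1):
        prediction = evaluate(centers, coeffs, x, kernel)
        eta = eta1 * t ** (-theta)
        # Each sample adds one kernel section; no shrinkage term is applied
        # because there is no regularization.
        centers.append(x)
        coeffs.append(-eta * (prediction - y))

    # Uniform average of f_1, ..., f_{T+1}: the i-th center (0-indexed)
    # appears in T - i of the T + 1 iterates.
    T = len(coeffs)
    avg_coeffs = [c * (T - i) / (T + 1) for i, c in enumerate(coeffs)]
    return centers, coeffs, avg_coeffs


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical data: noisy samples of sin(2 * pi * x) on [0, 1].
    data = []
    for _ in range(200):
        x = random.random()
        data.append((x, math.sin(2 * math.pi * x) + random.gauss(0.0, 0.1)))
    centers, last, avg = online_kernel_gd(data)
    print(evaluate(centers, last, 0.25), evaluate(centers, avg, 0.25))
```

The sketch stores one coefficient per observed sample, so evaluating the iterate after `t` samples costs `t` kernel evaluations. Both the last iterate and an averaged iterate are returned because the abstract gives rates for each; the uniform weighting used here is only one possible averaging scheme.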
Field | Value |
---|---|
Original language | English |
Number of pages | 33 |
Journal | Journal of Machine Learning Research |
Volume | 18 |
Publication status | Published - Apr 2018 |
Scopus Subject Areas
- Software
- Control and Systems Engineering
- Statistics and Probability
- Artificial Intelligence
User-Defined Keywords
- Convergence analysis
- Learning theory
- Online learning
- Reproducing kernel Hilbert space