Abstract
Most existing statistical theories of deep neural networks have sample complexities cursed by the data dimension, and therefore cannot adequately explain the empirical success of deep learning on high-dimensional data. To bridge this gap, we propose to exploit the low-dimensional structures of real-world datasets and establish theoretical guarantees for convolutional residual networks (ConvResNets) in terms of function approximation and statistical recovery for binary classification problems. Specifically, given data lying on a $d$-dimensional manifold isometrically embedded in $\mathbb{R}^D$, we prove that if the network architecture is properly chosen, ConvResNets can (1) approximate Besov functions on manifolds with arbitrary accuracy, and (2) learn a classifier by minimizing the empirical logistic risk, which gives an excess risk of order $n^{-\frac{s}{2s+2(s\vee d)}}$, where $n$ is the sample size and $s$ is a smoothness parameter. This implies that the sample complexity depends on the intrinsic dimension $d$ instead of the data dimension $D$. Our results demonstrate that ConvResNets are adaptive to low-dimensional structures of datasets.
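To make the stated rate concrete, the following minimal sketch evaluates the exponent s/(2s+2(s∨d)) and the sample size it implies for a target excess risk. The function names `excess_risk_exponent` and `samples_for_target` are hypothetical, and the calculation assumes that constants and logarithmic factors are ignored, so it illustrates scaling only and is not a result taken from the paper.

```python
def excess_risk_exponent(s: float, d: float) -> float:
    """Exponent a in the rate n^(-a), with a = s / (2s + 2*max(s, d)).

    s: smoothness parameter, d: intrinsic manifold dimension.
    The ambient data dimension D does not appear in the exponent.
    """
    if s <= 0 or d <= 0:
        raise ValueError("s and d must be positive")
    return s / (2.0 * s + 2.0 * max(s, d))


def samples_for_target(eps: float, s: float, d: float) -> float:
    """Sample size n solving n^(-a) = eps, i.e. n = eps^(-1/a).

    Assumption: constants and log factors are dropped, so this only
    shows how n scales with eps, s and d.
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie in (0, 1)")
    return eps ** (-1.0 / excess_risk_exponent(s, d))


if __name__ == "__main__":
    # Smoothness s = 2 on manifolds of growing intrinsic dimension d.
    for d in (2, 4, 8, 16):
        a = excess_risk_exponent(2, d)
        n = samples_for_target(0.1, 2, d)
        print(f"d={d:2d}  exponent={a:.4f}  n~{n:.3g}")
```

Under these assumptions the exponent equals 1/4 whenever s ≥ d and shrinks as d grows beyond s, while remaining unchanged for any value of D.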
Original language | English |
---|---|
Title of host publication | Proceedings of the 38th International Conference on Machine Learning (ICML 2021) |
Editors | Marina Meila, Tong Zhang |
Publisher | ML Research Press |
Pages | 6770-6780 |
Number of pages | 11 |
Publication status | Published - 18 Jul 2021 |
Event | 38th International Conference on Machine Learning, ICML 2021 (virtual), 18 Jul 2021 → 24 Jul 2021; https://icml.cc/virtual/2021/index.html, https://icml.cc/Conferences/2021, https://proceedings.mlr.press/v139/ |
Publication series
Name | Proceedings of Machine Learning Research |
---|---|
Volume | 139 |
ISSN (Print) | 2640-3498 |
Conference
Conference | 38th International Conference on Machine Learning, ICML 2021 |
---|---|
Period | 18/07/21 → 24/07/21 |