TY - JOUR
T1 - Deep learning theory of distribution regression with CNNs
AU - Yu, Zhan
AU - Zhou, Ding-Xuan
N1 - Funding information:
The first version of the paper was written when the second author worked at City University of Hong Kong, supported partially by the Laboratory for AI-Powered Financial Technologies, the Research Grants Council of Hong Kong [Projects #C1013-21GF and #11308121], the Germany/Hong Kong Joint Research Scheme [Project No. G-CityU101/20], the CityU Strategic Interdisciplinary Research Grant [Project No. 7020010], the National Science Foundation of China [Project No. 12061160462], and the Hong Kong Institute for Data Science. The first author would like to thank Zhongjie Shi for helpful communications and discussions on related topics. The authors would like to thank the anonymous referee for their careful review, which helped improve the quality of the paper.
Publisher copyright:
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023
PY - 2023/8
Y1 - 2023/8
N2 - We establish a deep learning theory for distribution regression with deep convolutional neural networks (DCNNs). Deep learning based on structured deep neural networks has proved powerful in practical applications, and generalization analysis for regression with DCNNs has been carried out very recently. However, for the distribution regression problem, in which the input variables are probability measures, no mathematical model or theoretical analysis of DCNN-based learning has been available. One difficulty is that the classical neural network structure requires the input variable to be a Euclidean vector, so when the input samples are probability distributions, the traditional network structure cannot be used directly; a well-defined DCNN framework for distribution regression is therefore desirable. In this paper, we overcome this difficulty and establish a novel DCNN-based learning theory for a two-stage distribution regression model. First, we develop an approximation theory for functionals defined on the set of Borel probability measures within the proposed DCNN framework. We then show that the hypothesis space is well defined by rigorously proving its compactness. Furthermore, in the hypothesis space induced by the general DCNN framework with distribution inputs, using a two-stage error decomposition technique, we derive a novel DCNN-based two-stage oracle inequality and optimal learning rates (up to a logarithmic factor) for the proposed distribution regression algorithm.
AB - We establish a deep learning theory for distribution regression with deep convolutional neural networks (DCNNs). Deep learning based on structured deep neural networks has proved powerful in practical applications, and generalization analysis for regression with DCNNs has been carried out very recently. However, for the distribution regression problem, in which the input variables are probability measures, no mathematical model or theoretical analysis of DCNN-based learning has been available. One difficulty is that the classical neural network structure requires the input variable to be a Euclidean vector, so when the input samples are probability distributions, the traditional network structure cannot be used directly; a well-defined DCNN framework for distribution regression is therefore desirable. In this paper, we overcome this difficulty and establish a novel DCNN-based learning theory for a two-stage distribution regression model. First, we develop an approximation theory for functionals defined on the set of Borel probability measures within the proposed DCNN framework. We then show that the hypothesis space is well defined by rigorously proving its compactness. Furthermore, in the hypothesis space induced by the general DCNN framework with distribution inputs, using a two-stage error decomposition technique, we derive a novel DCNN-based two-stage oracle inequality and optimal learning rates (up to a logarithmic factor) for the proposed distribution regression algorithm.
KW - Deep CNN
KW - Deep learning
KW - Distribution regression
KW - Learning theory
KW - Oracle inequality
KW - ReLU
UR - http://www.scopus.com/inward/record.url?scp=85164177608&partnerID=8YFLogxK
U2 - 10.1007/s10444-023-10054-y
DO - 10.1007/s10444-023-10054-y
M3 - Journal article
SN - 1019-7168
VL - 49
JO - Advances in Computational Mathematics
JF - Advances in Computational Mathematics
IS - 4
M1 - 51
ER -