Project Details
Description
Modern applications arising from science, society and industry often involve large-scale data that create challenges for various learning tasks. With the advent of data science, an urgent need has emerged to develop new mathematical theory that better explains the performance of modern learning schemes such as distributed learning, online learning, deep learning, and many others. The goal of this project is to investigate theoretical properties of several kernel-based learning and testing schemes for large-scale data with the help of learning theory.

We shall first present a general framework for double penalized kernel-based algorithms that simultaneously perform robust learning and outlier detection for contaminated data. Under the so-called mean-shift outlier model, we shall derive minimax optimal learning rates as well as outlier detection consistency for regression, pairwise learning and classification. We shall also carry out error analysis for gradient descent algorithms associated with truncated loss functions and derive fast learning rates by choosing an appropriate scaling parameter that balances robustness and predictive performance.

We shall then study spectral regularization algorithms for regression function tests in reproducing kernel Hilbert spaces. A Wald-type test statistic will be studied and optimal testing rates will be derived by employing the second order decomposition of operator differences. Regression function tests with a varying Gaussian kernel will also be considered, together with the corresponding error analysis. We shall further study kernel methods for goodness-of-fit tests based on the maximum mean discrepancy and spectral regularization algorithms. Finally, distributed schemes combined with the above-mentioned learning and testing algorithms will be considered in order to handle large-scale data.
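For orientation, a minimal sketch of the mean-shift outlier model and a double penalized kernel estimator of the kind described above, written for least-squares regression; the loss, the penalties (here an RKHS norm and an $\ell_1$ penalty) and the tuning parameters $\lambda, \mu$ are illustrative choices, not necessarily the exact scheme studied in the project.

```latex
% Mean-shift outlier model: observations are clean except for a sparse
% shift vector gamma^* whose nonzero entries mark the outliers.
\[
  y_i = f^*(x_i) + \gamma_i^* + \varepsilon_i, \qquad i = 1, \dots, n .
\]
% A double penalized kernel estimator over an RKHS (H_K, ||.||_K):
% the sparsity penalty on gamma flags outliers, while the RKHS penalty
% controls the complexity of f.
\[
  (\hat f, \hat \gamma) \in \operatorname*{arg\,min}_{f \in \mathcal{H}_K,\ \gamma \in \mathbb{R}^n}
  \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - f(x_i) - \gamma_i \bigr)^2
  + \lambda \, \| f \|_K^2 + \mu \, \| \gamma \|_1 .
\]
```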
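The truncated-loss gradient descent idea can likewise be sketched concretely. Below is a hypothetical Python illustration, assuming the capped squared loss $\min(r^2, \sigma^2)$ as the truncated loss: samples whose residual exceeds the scale parameter $\sigma$ contribute no gradient, so $\sigma$ trades robustness against predictive accuracy. The function names, loss and step size are assumptions for illustration only.

```python
import numpy as np

def gaussian_kernel(A, B, width=1.0):
    """Gaussian (RBF) Gram matrix between row-sample arrays A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * width ** 2))

def truncated_ls_kernel_gd(X, y, kernel, sigma=1.5, step=0.2, n_iter=500):
    """Functional gradient descent in the RKHS for the truncated
    least-squares loss min(r^2, sigma^2): residuals larger than the
    scaling parameter sigma carry zero gradient, capping the influence
    of gross outliers.  Illustrative sketch only."""
    n = len(y)
    K = kernel(X, X)                          # n x n Gram matrix
    alpha = np.zeros(n)                       # f = sum_j alpha_j K(x_j, .)
    for _ in range(n_iter):
        r = K @ alpha - y                     # residuals f(x_i) - y_i
        mask = np.abs(r) <= sigma             # only "inliers" carry gradient
        alpha -= step * (2.0 / n) * (mask * r)  # functional gradient step
    return alpha

# Toy usage: a sine curve with a few gross outliers injected.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(80, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.normal(size=80)
y[:5] += 5.0                                  # contaminate five samples
alpha = truncated_ls_kernel_gd(X, y, gaussian_kernel)
f_train = gaussian_kernel(X, X) @ alpha       # fitted values on the sample
```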
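The spectral regularization family referred to above admits a standard formulation, sketched here under the usual learning-theory conventions; the Wald-type statistics studied in the project would be built on estimators of this form, though their precise construction is not specified here.

```latex
% Spectral regularization in an RKHS: with the empirical covariance operator
%   \hat T = (1/n) \sum_i K_{x_i} \otimes K_{x_i}
% and \hat h = (1/n) \sum_i y_i K_{x_i},
% a filter function g_\lambda defines the estimator
\[
  \hat f_\lambda = g_\lambda(\hat T)\, \hat h,
  \qquad \text{e.g.}\quad
  g_\lambda(\sigma) = \frac{1}{\sigma + \lambda}
  \ \ \text{(Tikhonov, i.e.\ kernel ridge regression)};
\]
% other admissible filters include spectral cut-off and Landweber
% (gradient-descent) iterations.
```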
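For the goodness-of-fit part, the maximum mean discrepancy has a textbook definition, recalled below; this is background rather than the project's specific test construction.

```latex
% Maximum mean discrepancy between distributions P and Q with kernel k:
\[
  \mathrm{MMD}^2(P, Q)
  = \mathbb{E}_{x, x' \sim P}\, k(x, x')
  - 2\, \mathbb{E}_{x \sim P,\ y \sim Q}\, k(x, y)
  + \mathbb{E}_{y, y' \sim Q}\, k(y, y'),
\]
% equivalently \| \mu_P - \mu_Q \|_{\mathcal{H}_K}^2 for the kernel mean
% embeddings; a goodness-of-fit test rejects when an empirical estimate
% of MMD^2 is large.
```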
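Finally, a common distributed scheme in this literature is divide-and-conquer averaging, sketched below; the distributed variants considered in the project may differ.

```latex
% Divide-and-conquer: split the n samples into m disjoint subsets,
% run the base learning or testing algorithm on each subset to get
% local estimators \hat f_j, then average:
\[
  \bar f = \frac{1}{m} \sum_{j=1}^{m} \hat f_j .
\]
```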
| Status | Finished |
| --- | --- |
| Effective start/end date | 1/01/20 → 30/06/23 |