Learning Theory of Distributed Mirror Descent on Some Typical Banach Spaces and Related Topics

Project: Research project

Project Details

Description

In recent years, distributed learning (DL) has achieved remarkable success in domains including statistical learning theory, computational optimization, networked control systems, and data mining. Many classical algorithms, such as stochastic gradient descent (SGD), were originally designed for single-node computation. In modern learning problems, however, the sheer scale of training data and the complexity of neural network architectures make single-processor algorithms prohibitively inefficient. As a result, DL algorithms running over multi-agent (multi-processor) networks have gradually become mainstream in learning theory.

The distributed mirror descent (DMD) algorithm is one of the most advanced methods in modern distributed computation. Its flexibility stems from the mirror map and the associated Bregman divergence, which allow it to capture geometric information and exploit data structures such as sparsity. DMD-type methods over multi-agent networks have developed rapidly. However, existing DMD methods are still formulated on a conventional Euclidean underlying space, and their convergence analysis relies heavily on that Euclidean structure. In contrast, learning theory often requires data-based approximation in underlying function classes that are potentially infinite-dimensional, such as reproducing kernel Hilbert spaces (RKHS) induced by Mercer kernels or measure spaces equipped with the total variation (TV) norm. Unfortunately, no form of DMD has been established in learning theory with these typical Banach spaces as underlying spaces. Moreover, little is known about DMD-based supervised learning (SL) in RKHS. It is therefore desirable to fully unlock the potential of DMD in learning theory.
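As background for the role of the mirror map (standard material, not a contribution of this project), the classical single-agent mirror descent update for minimizing a convex function f over a convex set X reads

\[
x^{t+1} = \arg\min_{x \in X} \Big\{ \eta_t \langle \nabla f(x^t), x \rangle + D_\psi(x, x^t) \Big\},
\qquad
D_\psi(x, y) = \psi(x) - \psi(y) - \langle \nabla \psi(y), x - y \rangle,
\]

where \psi is the mirror map and D_\psi its Bregman divergence. Choosing \psi(x) = \tfrac{1}{2}\|x\|_2^2 recovers projected gradient descent, while the negative entropy \psi(x) = \sum_i x_i \log x_i on the probability simplex yields the exponentiated-gradient update, which is well suited to sparse or simplex-constrained problems.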

In this project, we first consider a composite functional minimization (CFM) model on some typical Banach spaces. We establish a distributed pseudo-mirror-descent method based on local functional pseudo-gradients, which solves the well-known positive function learning problems related to CFM. We further propose a kernel-based DMD method for solving a regularized CFM problem, as well as a novel DMD approach for learning theory on measure spaces built on local directional derivatives. We also address the SL problem by proposing a decentralized DMD kernel SL algorithm. To better handle heavy-tailed and non-Gaussian impulsive noise in the data, we propose a decentralized robust DMD-SL algorithm, and we additionally extend our methods to distributional data. Finally, current decentralized kernel learning theory relies heavily on the doubly stochastic communication assumption; we fundamentally relax this assumption by establishing a novel distributed kernel learning algorithm with general directed information transmission among local processors. Rigorous convergence and approximation analysis will be conducted for all of these methods.
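For orientation only, one common prototype of a decentralized mirror descent iteration over a network of m agents with mixing matrix W = (w_{ij}) is (the exact form varies across the literature and across the Banach-space settings studied in this project):

\[
y_i^t = \sum_{j=1}^{m} w_{ij}\, x_j^t,
\qquad
x_i^{t+1} = \arg\min_{x \in X} \Big\{ \eta_t \langle g_i^t, x \rangle + D_\psi(x, y_i^t) \Big\},
\]

where g_i^t is a (sub)gradient of agent i's local objective, and the consensus step may equivalently be carried out on the dual images \nabla\psi(x_j^t). The algorithms proposed in this project replace these Euclidean, finite-dimensional ingredients with functional pseudo-gradients, kernel representations in RKHS, and directional derivatives on measure spaces, and relax the doubly stochastic assumption on W.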
Status: Active
Effective start/end date: 1/01/25 – 31/12/27
