Analyzing Deep PAC-Bayesian Learning with Neural Tangent Kernel: Convergence, Analytic Generalization Bound, and Efficient Hyperparameter Selection

Wei Huang, Chunrui Liu, Yilan Chen, Richard Yi Da Xu, Miao Zhang, Tsui Wei Weng

Research output: Contribution to journalJournal articlepeer-review

6 Citations (Scopus)

Abstract

PAC-Bayes is a well-established framework for analyzing generalization performance in machine learning models. This framework provides a bound on the expected population error by considering the sum of training error and the divergence between posterior and prior distributions. In addition to being a successful generalization bound analysis tool, the PAC-Bayesian bound can also be incorporated into an objective function for training probabilistic neural networks, which we refer to simply as Deep PAC-Bayesian Learning. Deep PAC-Bayesian learning has been shown to achieve competitive expected test set error and provide a tight generalization bound in practice at the same time through gradient descent training. Despite its empirical success, theoretical analysis of deep PAC-Bayesian learning for neural networks is rarely explored. To this end, this paper proposes a theoretical convergence and generalization analysis for Deep PAC-Bayesian learning. For a deep and wide probabilistic neural network, our analysis shows that PAC-Bayesian learning corresponds to solving a kernel ridge regression when the probabilistic neural tangent kernel (PNTK) is used as the kernel. We utilize this outcome in conjunction with the PAC-Bayes C-bound, enabling us to derive an analytical and guaranteed PAC-Bayesian generalization bound for the first time. Finally, drawing insight from our theoretical results, we propose a proxy measure for efficient hyperparameter selection, which is proven to be time-saving on various benchmarks. Our work not only provides a better understanding of the theoretical underpinnings of Deep PAC-Bayesian learning, but also offers practical tools for improving the training and generalization performance of these models.
Original languageEnglish
Number of pages30
JournalTransactions on Machine Learning Research
Publication statusPublished - May 2023

Fingerprint

Dive into the research topics of 'Analyzing Deep PAC-Bayesian Learning with Neural Tangent Kernel: Convergence, Analytic Generalization Bound, and Efficient Hyperparameter Selection'. Together they form a unique fingerprint.

Cite this