Robust estimation in regression and classification methods for large dimensional data

Chunming Zhang*, Lixing Zhu, Yanbo Shen

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

1 Citation (Scopus)

Abstract

Statistical data analysis and machine learning rely heavily on error measures for regression, classification, and forecasting. Bregman divergence (BD) is a widely used family of error measures, but it is not robust to outlying observations or high leverage points in large- and high-dimensional datasets. In this paper, we propose a new family of robust Bregman divergences, called “robust-BD”, that are less sensitive to data outliers. We explore their suitability for sparse large-dimensional regression models with incompletely specified response variable distributions and propose a new estimate, the “penalized robust-BD estimate”, which achieves the same oracle property as ordinary non-robust penalized least-squares and penalized-likelihood estimates. We conduct extensive numerical experiments to evaluate the performance of the proposed penalized robust-BD estimate and compare it with classical approaches, showing that it improves on existing methods. Finally, we analyze a real dataset to illustrate the practicality of the proposed method. Our findings suggest that the proposed method can be a useful tool for robust statistical data analysis and machine learning in the presence of outliers and large-dimensional data.
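For context, a minimal sketch of the error measure the abstract refers to, under one standard convention (a differentiable convex generating function \(\phi\)); the article itself may use a different parameterization, and its robust-BD construction is not reproduced here:

  \[ D_\phi(y, \mu) \;=\; \phi(y) - \phi(\mu) - \phi'(\mu)\,(y - \mu). \]

For example, \(\phi(t) = t^2\) recovers the squared error loss \((y - \mu)^2\), while \(\phi(t) = t \log t\) yields the deviance-type loss \(y \log(y/\mu) - y + \mu\). The paper's robust-BD family modifies such divergences so that outlying observations exert bounded influence on the resulting penalized estimates.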

Original language: English
Pages (from-to): 3361-3411
Number of pages: 51
Journal: Machine Learning
Volume: 112
Issue number: 9
Early online date: 5 Jul 2023
DOIs
Publication status: Published - Sept 2023

Scopus Subject Areas

  • Software
  • Artificial Intelligence

User-Defined Keywords

  • Coordinate descent
  • Hypothesis testing
  • Loss functions
  • Penalization
  • Quasi-likelihood

