Sparse PCA via matrix (2,1)-norm regularization with an application to feature selection

  • Li Wang
  • Jiawei Wang
  • Ren Cang Li*
  • *Corresponding author for this work

Research output: Contribution to journal, Journal article, peer-review

Abstract

This paper is concerned with sparse PCA via the matrix (2,1)-norm regularization (PCA2,1). It can produce a row-sparse projection, a useful tool in machine learning for tasks such as feature selection, which aims to choose the most relevant features. Mathematically, PCA2,1 is a non-smooth optimization problem on the Stiefel manifold. For a suitably chosen regularization parameter, the optimal projection matrix has many negligible rows. A practical NEPv approach (nonlinear eigenvalue problem with eigenvector dependency) is proposed to iteratively compute the optimal projection matrix. It is shown that the approach is globally convergent in the sense that the objective is monotonically increasing during the iterative process and any accumulation point of the iterates is a stationary point of the optimization problem. Extensive numerical experiments, with an application to feature selection, have been conducted to demonstrate the performance of the practical NEPv approach, in comparison with existing feature selection methods in terms of classification accuracy. The numerical results demonstrate that PCA2,1 is highly effective and often produces better classification results than existing feature selection methods.
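The abstract does not state the objective function or the exact form of the NEPv, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes the objective tr(PᵀAP) − λ‖P‖₂,₁ over matrices P with orthonormal columns, where A is a sample covariance matrix and ‖P‖₂,₁ is the sum of the Euclidean norms of the rows of P. The smoothing parameter `eps`, the matrix H(P), and the function names `norm21`, `smoothed_objective`, `scf_sparse_pca`, and `select_features` are all hypothetical choices made for illustration.

```python
import numpy as np


def norm21(P):
    """Matrix (2,1)-norm: sum of the Euclidean norms of the rows of P."""
    return float(np.sum(np.linalg.norm(P, axis=1)))


def smoothed_objective(A, P, lam, eps):
    """Assumed objective: tr(P^T A P) - lam * sum_i sqrt(||p_i||^2 + eps^2).

    The eps term is an assumption that keeps the penalty differentiable
    at zero rows; it is not taken from the paper.
    """
    row = np.sqrt(np.sum(P * P, axis=1) + eps**2)
    return float(np.trace(P.T @ A @ P) - lam * np.sum(row))


def scf_sparse_pca(A, k, lam, eps=1e-6, max_iter=200, tol=1e-10):
    """SCF-style iteration for the assumed NEPv  H(P) P = P Lambda,
    with H(P) = A - (lam / 2) * diag(1 / sqrt(||p_i||^2 + eps^2)).
    """
    # Start from ordinary PCA: eigenvectors of the k largest eigenvalues of A.
    _, V = np.linalg.eigh(A)
    P = V[:, -k:]
    history = [smoothed_objective(A, P, lam, eps)]
    for _ in range(max_iter):
        # Freeze the row weights at the current iterate ...
        row = np.sqrt(np.sum(P * P, axis=1) + eps**2)
        H = A - 0.5 * lam * np.diag(1.0 / row)
        # ... and solve the resulting linear symmetric eigenvalue problem.
        _, V = np.linalg.eigh(H)
        P = V[:, -k:]
        history.append(smoothed_objective(A, P, lam, eps))
        # Stop once the objective has stalled.
        if abs(history[-1] - history[-2]) <= tol * max(1.0, abs(history[-1])):
            break
    return P, history


def select_features(P, m):
    """Indices of the m features whose rows of P have the largest norms."""
    row_norms = np.linalg.norm(P, axis=1)
    return np.argsort(-row_norms)[:m]
```

In this sketch each step maximizes a quadratic lower bound of the smoothed objective (the bound follows from the concavity of the square root), so the recorded objective values should not decrease, which mirrors the monotonicity property described in the abstract. Rows of the returned P with negligible norm correspond to features that `select_features` would rank last.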

Original language: English
Article number: 100676
Number of pages: 16
Journal: Results in Applied Mathematics
Volume: 28
Publication status: Published - Nov 2025

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 4 - Quality Education

User-Defined Keywords

  • Feature selection
  • Matrix (2,1)-norm regularization
  • NEPv
  • PCA
  • Row-sparse projection
  • SCF
