A structure noise-aware tensor dictionary learning method for high-dimensional data clustering

Jing Hua Yang, Chuan Chen*, Hong Ning Dai*, Le Le Fu, Zibin Zheng

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

7 Citations (Scopus)

Abstract

With the development of data acquisition technology, high-dimensional data clustering has become an important yet challenging task in data mining. Despite the advances achieved by current clustering methods, there is room for improvement. First, many of them unfold the high-dimensional data into a large matrix, which destroys its intrinsic structure. Second, some methods assume that the noise in the dataset follows a predefined distribution (e.g., the Gaussian or Laplacian distribution), an assumption that often does not hold in real-world applications and degrades clustering performance. To address these issues, we propose a novel tensor dictionary learning method for clustering high-dimensional data in the presence of structure noise. We adopt tensors, the natural generalization of vectors and matrices, to represent high-dimensional data. To model the noise accurately, we decompose the observed data into clean data, structure noise, and Gaussian noise. We then use low-rank tensor modeling to capture the inherent correlations of the clean data and adopt tensor dictionary learning to describe the structure noise adaptively rather than through a predefined distribution. We design a proximal alternating minimization algorithm to solve the proposed model, with a theoretical convergence guarantee. Experimental results on both simulated and real datasets show that the proposed method outperforms the compared methods for high-dimensional data clustering.
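The two ideas the abstract leans on, unfolding a tensor into a matrix and splitting the observation into clean data, structure noise, and Gaussian noise, can be illustrated with a toy NumPy sketch. This is not the authors' model or algorithm. The function names (`make_observation`, `unfold`), the CP-style low-rank construction, and the choice to simulate structure noise as a few fully corrupted lateral slices are all assumptions made only for illustration.

```python
import numpy as np


def make_observation(shape=(8, 6, 5), rank=2, noise_std=0.01, seed=0):
    """Build a synthetic 3-way tensor: observed = clean + structured + gaussian.

    Hypothetical toy generator; the paper's actual data model may differ.
    """
    rng = np.random.default_rng(seed)
    n1, n2, n3 = shape

    # Clean part: sum of `rank` outer products, so it is low-rank by construction.
    a = rng.standard_normal((n1, rank))
    b = rng.standard_normal((n2, rank))
    c = rng.standard_normal((n3, rank))
    clean = np.einsum("ir,jr,kr->ijk", a, b, c)

    # Structure noise (assumed form): two whole lateral slices are corrupted,
    # so the noise is sparse across slices but dense inside each one.
    structured = np.zeros(shape)
    bad = rng.choice(n2, size=2, replace=False)
    structured[:, bad, :] = rng.standard_normal((n1, 2, n3))

    # Small dense Gaussian noise on every entry.
    gaussian = noise_std * rng.standard_normal(shape)

    observed = clean + structured + gaussian
    return observed, clean, structured, gaussian


def unfold(tensor, mode):
    """Flatten a tensor into a matrix whose rows index the chosen mode."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


observed, clean, structured, gaussian = make_observation()
print(unfold(observed, 0).shape)                   # matrix view of the 3-way data
print(np.linalg.matrix_rank(unfold(clean, 0)))     # clean part stays low-rank
```

In the unfolded matrix each corrupted lateral slice is scattered over several non-adjacent columns, which is one way to see why a matrix view can obscure noise structure that is plain in the tensor.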

Original language: English
Pages (from-to): 87-106
Number of pages: 20
Journal: Information Sciences
Volume: 612
Publication status: Published - Oct 2022

Scopus Subject Areas

  • Theoretical Computer Science
  • Software
  • Control and Systems Engineering
  • Computer Science Applications
  • Information Systems and Management
  • Artificial Intelligence

User-Defined Keywords

  • High-dimensional data clustering
  • Proximal alternating minimization
  • Structural sparsity
  • Structure noise
  • Tensor dictionary learning
  • Tensor low-rank representation
