Learning latent variable models from distributed and abstracted data

Xiaofeng Zhang*, Kwok Wai CHEUNG, Chun Hung Li

*Corresponding author for this work

Research output: Contribution to journal > Journal article > peer-review

9 Citations (Scopus)

Abstract

Discovering global knowledge from distributed data sources is challenging for two main reasons: the ever-increasing volume of data held at highly distributed sources, and the general concern over data privacy. Abstracting the distributed data into a compact representation that retains sufficient local detail for global knowledge discovery can, in principle, address both the scalability and the privacy challenges. This calls for formal methodologies that support knowledge discovery on abstracted data. In this paper, we propose to abstract distributed data as Gaussian mixture models and to learn a family of generative models from the abstracted data using a modified EM algorithm. To demonstrate the effectiveness of the proposed approach, we applied it to learn (a) data cluster models and (b) data manifold models, and evaluated their performance on both synthetic and benchmark data sets, with promising results in terms of both effectiveness and scalability. We also demonstrate that the proposed approach is robust to heterogeneous data distributions across the distributed sources.
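
The idea of sharing compact Gaussian summaries instead of raw records can be illustrated with a minimal sketch. It is not the modified EM algorithm of the paper, whose details the abstract does not give. It assumes one-dimensional data and a single Gaussian per source (a one-component mixture), and it only shows that global moments can be recovered from the pooled summaries. The names LocalSummary, abstract_source, pool_weights and mixture_moments, and the sample data, are hypothetical.

```python
from dataclasses import dataclass
from statistics import fmean, pvariance


@dataclass
class LocalSummary:
    """Hypothetical abstraction a source shares in place of its raw data."""
    n: int        # number of local data points
    mean: float   # local mean
    var: float    # local population variance


def abstract_source(data):
    """Summarise one source as a single Gaussian (illustrative assumption)."""
    return LocalSummary(n=len(data), mean=fmean(data), var=pvariance(data))


def pool_weights(summaries):
    """Weight each local Gaussian by its share of the total data volume."""
    total = sum(s.n for s in summaries)
    return [s.n / total for s in summaries]


def mixture_moments(weights, summaries):
    """Mean and variance of the pooled mixture (law of total variance)."""
    mean = sum(w * s.mean for w, s in zip(weights, summaries))
    second = sum(w * (s.var + s.mean ** 2) for w, s in zip(weights, summaries))
    return mean, second - mean ** 2


# Three hypothetical sources with deliberately different distributions.
sources = [[1.0, 2.0, 3.0], [10.0, 12.0], [5.0, 5.5, 6.0, 6.5]]

# Only the summaries leave each source; the raw lists stay local.
summaries = [abstract_source(d) for d in sources]
weights = pool_weights(summaries)
global_mean, global_var = mixture_moments(weights, summaries)
```

In this sketch the global site sees three numbers per source rather than the records themselves, yet the pooled mean and variance match those of the combined data. Learning cluster or manifold models from such summaries, as the paper proposes, requires an EM procedure adapted to Gaussian components as input, which this example does not attempt.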

Original language: English
Pages (from-to): 2964-2988
Number of pages: 25
Journal: Information Sciences
Volume: 181
Issue number: 14
Publication status: Published - 15 Jul 2011

Scopus Subject Areas

  • Software
  • Control and Systems Engineering
  • Theoretical Computer Science
  • Computer Science Applications
  • Information Systems and Management
  • Artificial Intelligence

User-Defined Keywords

  • Data abstraction
  • Distributed data mining
  • Gaussian mixture model
  • Generative topographic mapping
  • Model-based methods
