Abstract
In this paper, we propose a generative model, the author-topic-community (ATC) model, for representing a corpus of linked documents. The ATC model associates each author with a topic distribution and a community distribution as its model parameters. We derive a learning algorithm based on variational inference for estimating the model parameters, in which the two distributions reinforce each other during estimation. We compare the performance of the ATC model with that of two related generative models, first on synthetic data sets and then on real data sets, which include a research community data set, a blog data set, a news-sharing data set, and a microblogging data set. The empirical results confirm that the proposed ATC model outperforms the existing models on tasks such as author interest profiling and author community discovery. We also demonstrate how the inferred ATC model can be used to characterize the roles of users/authors in online communities.
Original language | English |
---|---|
Pages (from-to) | 359-383 |
Number of pages | 25 |
Journal | Knowledge and Information Systems |
Volume | 44 |
Issue number | 2 |
Publication status | Published - 22 Aug 2015 |
Scopus Subject Areas
- Software
- Information Systems
- Human-Computer Interaction
- Hardware and Architecture
- Artificial Intelligence
User-Defined Keywords
- Author community discovery
- Author interest profiling
- Graphical models
- Variational inference