Abstract
Automatic Web page organization and visualization is an effective way to forage for information in a Web structure. Web pages contain both text (content) and links (structure), so content and structure analysis techniques should be adopted and properly integrated. In this paper, we take a probabilistic model-based approach and extend a topography-preserving model known as the Generative Topography Map (GTM). The extended GTM provides a principled way to integrate Web pages and hyperlinks and project them into a low-dimensional latent space (2D in our case) for visualization. The proposed extension has been applied to the WebKB dataset. Based on the preliminary results obtained, we propose several directions for future research.
Original language | English |
---|---|
Title of host publication | SRL2004: ICML 2004 Workshop on Statistical Relational Learning and its Connections to Other Fields. Accepted Papers |
Pages | 126-131 |
Number of pages | 6 |
Publication status | Published - Jul 2004 |
Event | ICML 2004 Workshop on Statistical Relational Learning and its Connections to Other Fields (SRL 2004) - Banff, Canada. Duration: 4 Jul 2004 → 8 Jul 2004 |
Conference
Conference | ICML 2004 Workshop on Statistical Relational Learning and its Connections to Other Fields (SRL 2004) |
---|---|
Period | 4/07/04 → 8/07/04 |
User-Defined Keywords
- Web page organization and visualization
- Web content and structure analysis
- Generative Topography Map