A cube model and cluster analysis for web access sessions

Joshua Zhexue Huang, Michael Ng, Wai Ki Ching, Joe Ng, David Cheung

Research output: Chapter in book/report/conference proceeding › Conference proceeding › peer-review

11 Citations (Scopus)

Abstract

Identifying the navigational patterns of casual visitors is an important step in online recommendation, which aims to convert casual visitors into customers in e-commerce. Clustering and sequential analysis are two primary techniques for mining navigational patterns from Web and application server logs. The characteristics of the log data and of the mining tasks require new data representation methods and analysis algorithms to be tested in the e-commerce environment. In this paper we present a cube model to represent Web access sessions for data mining. The cube model organizes session data into three dimensions. The COMPONENT dimension represents a session as a set of ordered components {c1, c2, ..., cP}, in which each component ci indexes the ith visited page in the session. Each component is associated with a set of attributes describing the page it indexes, such as the page ID, the category and the time spent viewing the page. The attributes associated with each component are defined in the ATTRIBUTE dimension. The SESSION dimension indexes individual sessions. In the model, irregular sessions are converted to a regular data structure to which existing data mining algorithms can be applied, while the order of the page sequences is maintained. A rich set of page attributes is embedded in the model for different analysis purposes. We also present some experimental results of using partitional clustering algorithms to cluster sessions. Because the sessions are essentially sequences of categories, two methods are used to cluster categorical sequences: the k-modes algorithm, designed for clustering categorical data, and a clustering method based on the Markov transition frequency (or probability) matrix.
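The three dimensions described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation. The abstract does not say how irregular sessions are made regular, so padding short sessions with an empty value and truncating long ones is an assumption made here. The names `build_cube`, `attribute_slice`, `matching_dissimilarity` and `transition_counts` are hypothetical, as are the sample pages. The simple matching count is the dissimilarity commonly used with k-modes, and the transition counts give one session's Markov transition frequency matrix.

```python
from typing import List, Sequence, Tuple

# Attribute names follow the abstract (page ID, category, view time);
# the identifiers themselves are illustrative assumptions.
ATTRIBUTES = ("page_id", "category", "view_time")
PAD = None  # assumed filler for components past the end of a short session

Page = Tuple[str, str, float]   # one visited page: (page_id, category, view_time)
Cube = List[List[List[object]]]  # indexed as cube[session][component][attribute]


def build_cube(sessions: Sequence[Sequence[Page]], num_components: int) -> Cube:
    """Convert variable-length sessions into a regular SESSION x COMPONENT x ATTRIBUTE structure."""
    cube: Cube = []
    for session in sessions:
        # Keep page order; truncate sessions longer than num_components (assumption).
        rows = [list(page) for page in session[:num_components]]
        # Pad sessions shorter than num_components (assumption).
        while len(rows) < num_components:
            rows.append([PAD] * len(ATTRIBUTES))
        cube.append(rows)
    return cube


def attribute_slice(cube: Cube, attribute: str) -> List[List[object]]:
    """Extract one attribute for every component of every session."""
    a = ATTRIBUTES.index(attribute)
    return [[component[a] for component in session] for session in cube]


def matching_dissimilarity(x: Sequence[object], y: Sequence[object]) -> int:
    """Simple matching dissimilarity: number of positions where two sequences differ."""
    return sum(1 for a, b in zip(x, y) if a != b)


def transition_counts(categories: Sequence[object], states: Sequence[str]) -> List[List[int]]:
    """Count category-to-category transitions in one session, skipping padded components."""
    index = {state: i for i, state in enumerate(states)}
    counts = [[0] * len(states) for _ in states]
    for prev, nxt in zip(categories, categories[1:]):
        if prev is not PAD and nxt is not PAD:
            counts[index[prev]][index[nxt]] += 1
    return counts


# Made-up sessions, for illustration only.
sessions = [
    [("p1", "home", 5.0), ("p7", "books", 30.0), ("p9", "books", 12.0)],
    [("p1", "home", 3.0), ("p4", "music", 40.0)],
]
STATES = ["home", "books", "music"]

cube = build_cube(sessions, num_components=3)
cats = attribute_slice(cube, "category")
print(cats)
print(matching_dissimilarity(cats[0], cats[1]))
print(transition_counts(cats[0], STATES))
```

Slicing the cube along the ATTRIBUTE dimension yields a regular table of category sequences, which is the kind of input a categorical clustering method expects. Treating the padding value as an ordinary category in the matching count is a further simplification of this sketch.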

Original language: English
Title of host publication: WEBKDD 2001 - Mining Web Log Data Across All Customers Touch Points
Subtitle of host publication: Third International Workshop, San Francisco, CA, USA, August 26, 2001, Revised Papers
Editors: Ron Kohavi, Brij M. Masand, Myra Spiliopoulou, Jaideep Srivastava
Publisher: Springer Berlin Heidelberg
Pages: 48-67
Number of pages: 20
Edition: 1st
ISBN (Electronic): 9783540456407
ISBN (Print): 3540439692, 9783540439691
Publication status: Published - 19 Jul 2002
Event: 3rd International Workshop on Mining Web Log Data, WEBKDD, 2001 - San Francisco, United States
Duration: 26 Aug 2001 → 26 Aug 2001
https://link.springer.com/book/10.1007/3-540-45640-6

Publication series

Name: Lecture Notes in Computer Science
Volume: 2356
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349
Name: Lecture Notes in Artificial Intelligence
ISSN (Print): 2945-9133
ISSN (Electronic): 2945-9141
Name: WebKDD: International Workshop on Knowledge Discovery on the Web

Conference

Conference: 3rd International Workshop on Mining Web Log Data, WEBKDD, 2001
Country/Territory: United States
City: San Francisco
Period: 26/08/01 → 26/08/01

Scopus Subject Areas

  • Theoretical Computer Science
  • General Computer Science
