Exploiting order information embedded in ordered categories for ordinal data clustering

Yiqun Zhang, Yiu Ming CHEUNG*

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding; Conference contribution; peer-review

2 Citations (Scopus)

Abstract

Ordinal data, a major type of categorical data, have attributes whose possible values (also called categories) are naturally ordered. To the best of our knowledge, none of the existing distance metrics proposed for categorical data takes the underlying order information into account when measuring distance. This makes the resulting distances inaccurate and in turn affects the results of ordinal data clustering. We therefore propose a specially designed distance metric that exploits the order information embedded in the ordered categories. It quantifies the distance between two ordinal categories by accumulating the sub-entropies of all the categories ordered between them. Because the proposed metric takes the order information into account, the distances it produces are more reasonable than those of other metrics proposed for categorical data. Moreover, it is parameter-free and can be easily applied to different ordinal data clustering tasks. Experimental results show the promising advantages of the proposed distance metric.
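
As a rough illustration of the idea of accumulating sub-entropies, the following Python sketch may help. It is not the authors' implementation: the abstract does not give the formula, so the sketch assumes that the sub-entropy of a category with empirical probability p is -p log p, that the sum runs over every category from the lower to the higher one with both endpoints included, and that the distance from a category to itself is zero. The function names `sub_entropies` and `ordinal_distance` are hypothetical.

```python
import math
from collections import Counter


def sub_entropies(values, order):
    """Map each category to -p * log(p), with p its empirical frequency.

    Assumption: this is what "sub-entropy" means; the abstract does not
    define it. Categories that never occur contribute 0.
    """
    if not values:
        raise ValueError("values must be non-empty")
    counts = Counter(values)
    n = len(values)
    result = {}
    for category in order:
        p = counts[category] / n
        result[category] = -p * math.log(p) if p > 0 else 0.0
    return result


def ordinal_distance(a, b, order, ent):
    """Sum the sub-entropies of the categories ordered from a to b.

    Assumption: both endpoints are included, and identical categories
    are at distance 0.
    """
    if a == b:
        return 0.0
    i, j = sorted((order.index(a), order.index(b)))
    return sum(ent[c] for c in order[i:j + 1])


# Toy attribute with three ordered categories (made-up data).
order = ["low", "medium", "high"]
values = ["low", "low", "medium", "high"]
ent = sub_entropies(values, order)

d_near = ordinal_distance("low", "medium", order, ent)
d_far = ordinal_distance("low", "high", order, ent)
print(d_near < d_far)  # categories further apart in the order are further away
```

Under these assumptions the distance grows with the number of categories lying between the two values, which is the order sensitivity the abstract describes, and no tunable parameter is involved.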

Original language: English
Title of host publication: Foundations of Intelligent Systems - 24th International Symposium, ISMIS 2018, Proceedings
Editors: Nathalie Japkowicz, George A. Papadopoulos, Michelangelo Ceci, Zbigniew W. Ras, Jiming Liu
Publisher: Springer Verlag
Pages: 247-257
Number of pages: 11
ISBN (Print): 9783030018504
Publication status: Published - 2018
Event: 24th International Symposium on Methodologies for Intelligent Systems, ISMIS 2018 - Limassol, Cyprus
Duration: 29 Oct 2018 to 31 Oct 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11177 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 24th International Symposium on Methodologies for Intelligent Systems, ISMIS 2018
Country/Territory: Cyprus
City: Limassol
Period: 29/10/18 to 31/10/18

Scopus Subject Areas

  • Theoretical Computer Science
  • Computer Science (all)

User-Defined Keywords

  • Categories
  • Clustering analysis
  • Distance metric
  • Entropy
  • Order information
  • Ordinal data
