Exploiting order information embedded in ordered categories for ordinal data clustering

Yiqun Zhang, Yiu Ming Cheung*

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding › Conference proceeding › peer-review

10 Citations (Scopus)

Abstract

As a major type of categorical data, ordinal data are those whose attributes take possible values (also called categories) that are naturally ordered. To the best of our knowledge, none of the existing distance metrics proposed for categorical data takes the underlying order information into account during distance measurement. This makes the produced distances inaccurate and, in turn, affects the results of ordinal data clustering. We therefore propose a specially designed distance metric that exploits the order information embedded in the ordered categories for distance measurement. It quantifies the distance between two ordinal categories by accumulating the sub-entropies of all the categories ordered between them. Because the proposed distance metric takes the order information into account, the distances it produces are more reasonable than those of the other metrics proposed for categorical data. Moreover, it is parameter-free and can easily be applied to different ordinal data clustering tasks. Experimental results show the promising advantages of the proposed distance metric.
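The accumulation of sub-entropies described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formula, so the sketch assumes that the sub-entropy of a category is -p log p, where p is the category's relative frequency in the attribute, and that the sum runs over both endpoint categories and everything ordered between them, with no normalization. The function names `sub_entropies` and `ordinal_distance` are hypothetical.

```python
import math
from collections import Counter


def sub_entropies(values, order):
    """Return -p * log(p) for each category in `order` (assumed definition).

    `values` is the observed column of one ordinal attribute and `order`
    lists its categories from lowest to highest.
    """
    counts = Counter(values)
    total = len(values)
    result = []
    for category in order:
        p = counts[category] / total
        # A category that never occurs contributes nothing.
        result.append(-p * math.log(p) if p > 0 else 0.0)
    return result


def ordinal_distance(a, b, values, order):
    """Sum the sub-entropies of the categories from `a` to `b`, endpoints included.

    Including both endpoints is an assumption of this sketch.
    """
    if a == b:
        return 0.0
    low, high = sorted((order.index(a), order.index(b)))
    return sum(sub_entropies(values, order)[low:high + 1])


# Hypothetical toy attribute with three ordered categories.
order = ["low", "medium", "high"]
values = ["low", "low", "medium", "high"]

d_near = ordinal_distance("low", "medium", values, order)
d_far = ordinal_distance("low", "high", values, order)
```

Under these assumptions the distance is symmetric, zero for identical categories, and grows as more categories lie between the two being compared, which is the order-aware behaviour the abstract argues for. No parameters need tuning, since the only inputs are the observed values and their order.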

Original language: English
Title of host publication: Foundations of Intelligent Systems
Subtitle of host publication: 24th International Symposium, ISMIS 2018, Limassol, Cyprus, October 29–31, 2018, Proceedings
Editors: Michelangelo Ceci, Nathalie Japkowicz, Jiming Liu, George A. Papadopoulos, Zbigniew W. Raś
Place of Publication: Cham
Publisher: Springer
Pages: 247-257
Number of pages: 11
Edition: 1st
ISBN (Electronic): 9783030018511
ISBN (Print): 9783030018504
Publication status: Published - 7 Oct 2018
Event: 24th International Symposium on Methodologies for Intelligent Systems, ISMIS 2018 - Limassol, Cyprus
Duration: 29 Oct 2018 – 31 Oct 2018
https://link.springer.com/book/10.1007/978-3-030-01851-1

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 11177
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349
Name: Lecture Notes in Artificial Intelligence
ISSN (Print): 2945-9133
ISSN (Electronic): 2945-9141
Name: ISMIS: International Symposium on Methodologies for Intelligent Systems

Conference

Conference: 24th International Symposium on Methodologies for Intelligent Systems, ISMIS 2018
Country/Territory: Cyprus
City: Limassol
Period: 29/10/18 – 31/10/18
Scopus Subject Areas

  • Theoretical Computer Science
  • Computer Science (all)

User-Defined Keywords

  • Categories
  • Clustering analysis
  • Distance metric
  • Entropy
  • Order information
  • Ordinal data
