IPS: Instance Profile for Shapelet Discovery for Time Series Classification

Guozhong Li*, Byron Koon Kau Choi, Jianliang Xu, Sourav S. Bhowmick, Daphne Ngar-yin Mah, Grace Lai Hung Wong

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding › Conference contribution › peer-review

Abstract

Time series classification (TSC) has been one of the most fundamental problems for time series data. Time series shapelets (or simply, shapelets) are discriminative subsequences that have recently been found both effective and interpretable for solving TSC. However, shapelet discovery is known to be computationally costly. Meanwhile, the matrix profile has recently been proposed for efficient motif discovery and anomaly detection. Our preliminary experiment shows that directly adopting the matrix profile for TSC does not yield superior classification accuracy. We have identified two main issues with such an adoption: 1) discords as “shapelets”, and 2) lack of shapelet diversity. In response to these issues, we propose the instance profile for shapelets, called IPS, for shapelet discovery for TSC. The main challenge is to utilize the instance profile (IP) to capture the characteristics of shapelets in a robust manner and then to discover high-quality shapelets efficiently. First, we use our IP to generate abundant shapelet candidates. We next efficiently prune candidates that do not align with the definition of shapelets using a novel distribution-aware bloom filter (DABF). Three utility functions are proposed to measure the shapelet candidates, and DABF is used to compute these functions efficiently. We have conducted comprehensive experiments comparing IPS with 12 competitive state-of-the-art methods using UCR Archive datasets. IPS is on average 25 times faster than BSPCOVER (the current state-of-the-art method). The accuracy of IPS is comparable to or higher than that of existing work. Furthermore, we present one case study to illustrate the interpretability of the shapelets.
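As general background on what "discriminative subsequence" means in practice, the sketch below computes the standard shapelet-to-series distance: the smallest Euclidean distance between a shapelet and any equal-length window of a series. It is not the IPS method, the instance profile, or DABF; the function name `shapelet_distance` and the toy data are assumptions made up for illustration.

```python
import math


def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between `shapelet` and any
    equal-length window of `series` (hypothetical helper)."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    # Slide a window of length m over the series and keep the closest match.
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((w - s) ** 2 for w, s in zip(window, shapelet)))
        best = min(best, dist)
    return best


# Toy data (made up): one series contains a spike, the other is flat.
spike = [0.0, 1.0, 0.0]
series_a = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
series_b = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

dist_a = shapelet_distance(series_a, spike)  # spike occurs in series_a
dist_b = shapelet_distance(series_b, spike)  # spike is absent from series_b
```

A candidate whose distances separate the classes well (small for one class, large for the other) is the kind of subsequence a shapelet-based classifier looks for; a brute-force search over all candidates with this distance is what makes shapelet discovery costly.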
Original language: English
Title of host publication: Proceedings - 2022 IEEE 38th International Conference on Data Engineering, ICDE 2022
Publisher: IEEE
Pages: 1781-1793
Number of pages: 13
ISBN (Electronic): 9781665408837
ISBN (Print): 9781665408844
DOIs
Publication status: Published - May 2022
Event: 38th IEEE International Conference on Data Engineering, ICDE 2022 - Virtual, Kuala Lumpur, Malaysia
Duration: 9 May 2022 to 12 May 2022
https://icde2022.ieeecomputer.my/
https://ieeexplore.ieee.org/xpl/conhome/9835153/proceeding

Publication series

Name: Proceedings of IEEE International Conference on Data Engineering (ICDE)
ISSN (Print): 1063-6382
ISSN (Electronic): 2375-026X

Conference

Conference: 38th IEEE International Conference on Data Engineering, ICDE 2022
Country/Territory: Malaysia
City: Kuala Lumpur
Period: 9/05/22 to 12/05/22
Internet address

User-Defined Keywords

  • Time series classification
  • Instance profile
  • Distribution-aware bloom filter
  • Efficiency
  • Accuracy
