Dynamic time warping for music retrieval using time series modeling of musical emotions

James J. Deng*, Clement H.C. Leung

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

36 Citations (Scopus)


Musical signals have rich temporal information not only at the physical level but also at the emotion level. Listeners may wish to find music excerpts whose sequence patterns of musical emotions are similar to those of given excerpts. Most state-of-the-art systems for emotion-based music retrieval concentrate on static analysis of musical emotions and ignore dynamic analysis and modeling of musical emotions over time. This paper presents a novel approach to music retrieval based on time-varying musical emotion dynamics. A three-dimensional musical emotion model, Resonance-Arousal-Valence (RAV), is used, and the emotions of a piece of music are represented as a time series of musical emotion dynamics. A multiple dynamic textures (MDT) model is proposed to model music and emotion dynamics over time, and the expectation maximization (EM) algorithm, along with Kalman filtering and smoothing, is used to estimate model parameters. Two smoothing methods, Rauch-Tung-Striebel (RTS) and minimum-variance smoothing (MVS), are investigated and compared to make the model more robust and to find an optimal solution that enhances prediction. To find similar sequence patterns of musical emotions, subsequence dynamic time warping (DTW) for emotion dynamics matching is presented. Experimental results demonstrate the benefits of MDT for predicting time-varying musical emotions, and the proposed method for music retrieval based on emotion dynamics outperforms retrieval methods based on acoustic features.
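The matching step named in the abstract, subsequence DTW, can be illustrated with a minimal sketch. This is not the authors' implementation: the Euclidean local cost, the three-way step pattern, and the names `subsequence_dtw`, `query`, and `track` are assumptions chosen for illustration. Each point is taken to be a three-dimensional emotion vector in the spirit of the RAV model. Unlike full DTW, the query may start and end anywhere inside the longer sequence.

```python
import math

def subsequence_dtw(query, track):
    """Find the span of `track` that best matches `query` under DTW.

    query, track: non-empty lists of equal-dimension points, e.g.
    (resonance, arousal, valence) tuples (an illustrative assumption).
    Returns (cost, start, end); start and end are inclusive indices into track.
    """
    n, m = len(query), len(track)
    if n == 0 or m == 0:
        raise ValueError("query and track must be non-empty")
    # acc[i][j]: lowest accumulated cost of aligning query[0..i] with a span
    # of track that ends at index j. start[i][j]: where that span begins.
    acc = [[0.0] * m for _ in range(n)]
    start = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(query[i], track[j])  # Euclidean local cost (assumed)
            if i == 0:
                # Free start: the first query point may match any track index.
                acc[i][j], start[i][j] = d, j
            elif j == 0:
                # Only a vertical step can reach the first track column.
                acc[i][j], start[i][j] = acc[i - 1][0] + d, 0
            else:
                # Diagonal, vertical, and horizontal predecessors; on a cost
                # tie, tuple comparison prefers the earlier start index.
                prev_cost, prev_start = min(
                    (acc[i - 1][j - 1], start[i - 1][j - 1]),
                    (acc[i - 1][j], start[i - 1][j]),
                    (acc[i][j - 1], start[i][j - 1]),
                )
                acc[i][j], start[i][j] = prev_cost + d, prev_start
    # Free end: take the cheapest cell in the last query row.
    end = min(range(m), key=lambda j: acc[n - 1][j])
    return acc[n - 1][end], start[n - 1][end], end

# Toy emotion trajectories (made-up values, not data from the paper).
track = [(0, 0, 0), (0, 0, 0), (1, 1, 1), (2, 2, 2), (3, 3, 3), (0, 0, 0)]
query = [(1, 1, 1), (2, 2, 2), (3, 3, 3)]
print(subsequence_dtw(query, track))  # (0.0, 2, 4)
```

In a retrieval setting, one plausible use is to run this against the emotion trajectory of every candidate piece and rank candidates by the returned cost; the sketch runs in O(nm) time and memory per pair, which may need pruning or windowing for large collections.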

Original language: English
Article number: 7042773
Pages (from-to): 137-151
Number of pages: 15
Journal: IEEE Transactions on Affective Computing
Issue number: 2
Publication status: Published - 1 Apr 2015

Scopus Subject Areas

  • Software
  • Human-Computer Interaction

User-Defined Keywords

  • dynamic time warping
  • EM algorithm
  • Kalman filter and smoother
  • multiple dynamic textures
  • Musical emotion

