Locally learning heterogeneous manifolds for phonetic classification

Heyun Huang, Yang Liu, Louis Ten Bosch*, Bert Cranen, Lou Boves

*Corresponding author for this work

Research output: Contribution to journal / Journal article / peer-review

6 Citations (Scopus)

Abstract

Most state-of-the-art phone classifiers use the same features and decision criteria for all phones, despite the fact that different broad classes are characterized by different manners and places of articulation that result in different acoustic features. This paper uses manifold learning to address structure in the acoustic space. Previous approaches to dimensionality reduction based on manifold learning assumed that the acoustic space can be characterized by a uniform manifold structure. In this paper we relax this assumption by learning different manifold structures for broad phonetic classes. Because all known classifiers make confusions between broad classes, we designed a two-level classifier in which the top level consists of a number of partially overlapping broad classes. Since the resulting classifiers are not statistically independent, we propose a new method for fusing the classifiers. Experimental results show that our two-level classifier obtained slightly better results when broad-class-specific manifolds were learned, compared to a uniform manifold. However, the accuracy is still considerably lower than what could be obtained with oracle knowledge about broad class membership. From this we infer that phones do not form compact clusters in acoustic space.

Original language: English
Pages (from-to): 28-45
Number of pages: 18
Journal: Computer Speech and Language
Volume: 38
Publication status: Published - 1 Jul 2016

Scopus Subject Areas

  • Software
  • Theoretical Computer Science
  • Human-Computer Interaction

User-Defined Keywords

  • Classifier fusion
  • Dimensionality reduction
  • Manifold learning
  • Partial classification
  • Phone classification
  • TIMIT
