Exploiting Low-Rank Tensor-Train Deep Neural Networks Based on Riemannian Gradient Descent With Illustrations of Speech Processing

Jun QI*, Chao-Han Huck Yang, Pin-Yu Chen

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

8 Citations (Scopus)

Abstract

This work focuses on designing low-complexity hybrid tensor networks by considering trade-offs between model complexity and practical performance. First, we exploit a low-rank tensor-train deep neural network (TT-DNN) to build an end-to-end deep learning pipeline, namely LR-TT-DNN. Second, a hybrid model combining the LR-TT-DNN with a convolutional neural network (CNN), denoted CNN+(LR-TT-DNN), is built to boost performance. Instead of randomly assigning large TT-ranks to the TT-DNN, we leverage Riemannian gradient descent to determine a TT-DNN with small TT-ranks. Furthermore, CNN+(LR-TT-DNN) consists of convolutional layers at the bottom for feature extraction and several TT layers at the top for solving regression and classification problems. We assess the LR-TT-DNN and CNN+(LR-TT-DNN) models separately on speech enhancement and spoken command recognition tasks. Our empirical evidence demonstrates that the LR-TT-DNN and CNN+(LR-TT-DNN) models with fewer model parameters can outperform their TT-DNN and CNN+(TT-DNN) counterparts.
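
To make the idea concrete, below is a minimal NumPy sketch of a single tensor-train (TT) layer: a fully connected weight matrix stored as a chain of low-rank TT cores. All names, shapes, and ranks are illustrative assumptions of ours, not the paper's code. For clarity the sketch reconstructs the dense weight matrix from the cores; an efficient implementation would contract the input with the cores directly, and the paper's Riemannian gradient descent (which searches the manifold of fixed-TT-rank tensors for small ranks) is not reproduced here.

import numpy as np

# Illustrative shapes only (our assumption, not taken from the paper):
# a 256 x 512 weight matrix factored as (4*8*8) x (8*8*8) with TT-ranks (1, 3, 3, 1).
in_modes, out_modes = [4, 8, 8], [8, 8, 8]
rank = 3
ranks = [1, rank, rank, 1]

rng = np.random.default_rng(0)
# each TT core has shape (r_{k-1}, m_k, n_k, r_k)
cores = [rng.standard_normal((ranks[k], in_modes[k], out_modes[k], ranks[k + 1])) * 0.1
         for k in range(len(in_modes))]

def tt_to_matrix(cores):
    """Contract TT cores (r_{k-1}, m_k, n_k, r_k) back into the dense weight matrix."""
    full = cores[0][0]                    # drop the leading rank-1 axis -> (m_1, n_1, r_1)
    for core in cores[1:]:
        _, m_k, n_k, r_next = core.shape
        M, N = full.shape[0], full.shape[1]
        # attach the next input/output mode pair and carry the new rank axis
        full = np.einsum('abr,rmns->ambns', full, core).reshape(M * m_k, N * n_k, r_next)
    return full[..., 0]                   # drop the trailing rank-1 axis -> (prod m, prod n)

W = tt_to_matrix(cores)                   # 256 x 512 dense weight
x = rng.standard_normal((5, W.shape[0])) # a batch of 5 feature vectors
y = x @ W                                 # forward pass of one TT layer

print(f"dense: {W.size} params, TT (rank {rank}): {sum(c.size for c in cores)} params")
# -> dense: 131072 params, TT (rank 3): 864 params

The parameter saving is the point of the construction: the cores hold far fewer entries than the dense matrix they encode, and the smaller the TT-ranks found by Riemannian gradient descent, the fewer parameters each TT layer needs.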
Original language: English
Pages (from-to): 633-642
Number of pages: 10
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume: 31
Publication status: Published - 4 Jan 2023

Scopus Subject Areas

  • Computer Science (all)

User-Defined Keywords

  • Tensor-train network
  • Speech enhancement
  • Spoken command recognition
  • Riemannian gradient descent
  • Low-rank tensor-train decomposition
  • Tensor-train deep neural network
