TY - JOUR
T1 - Exploiting Low-Rank Tensor-Train Deep Neural Networks Based on Riemannian Gradient Descent With Illustrations of Speech Processing
AU - Qi, Jun
AU - Yang, Chao-Han Huck
AU - Chen, Pin-Yu
N1 - Publisher Copyright:
© 2014 IEEE.
PY - 2023/1/4
Y1 - 2023/1/4
AB - This work focuses on designing low-complexity hybrid tensor networks by considering trade-offs between model complexity and practical performance. First, we exploit a low-rank tensor-train deep neural network (TT-DNN) to build an end-to-end deep learning pipeline, namely the LR-TT-DNN. Second, a hybrid model combining the LR-TT-DNN with a convolutional neural network (CNN), denoted CNN+(LR-TT-DNN), is set up to boost performance. Instead of randomly assigning large TT-ranks to the TT-DNN, we leverage Riemannian gradient descent to determine a TT-DNN with small TT-ranks. Furthermore, CNN+(LR-TT-DNN) consists of convolutional layers at the bottom for feature extraction and several TT layers at the top to solve regression and classification problems. We separately assess the LR-TT-DNN and CNN+(LR-TT-DNN) models on speech enhancement and spoken command recognition tasks. Our empirical evidence demonstrates that the LR-TT-DNN and CNN+(LR-TT-DNN) models, despite having fewer parameters, can outperform their TT-DNN and CNN+(TT-DNN) counterparts.
KW - tensor-train network
KW - speech enhancement
KW - spoken command recognition
KW - Riemannian gradient descent
KW - low-rank tensor-train decomposition
KW - tensor-train deep neural network
UR - http://www.scopus.com/inward/record.url?scp=85147143215&partnerID=8YFLogxK
U2 - 10.1109/TASLP.2022.3231714
DO - 10.1109/TASLP.2022.3231714
M3 - Journal article
SN - 2329-9290
VL - 31
SP - 633
EP - 642
JO - IEEE/ACM Transactions on Audio, Speech, and Language Processing
JF - IEEE/ACM Transactions on Audio, Speech, and Language Processing
ER -