Abstract
In this paper, we investigate DCTNet for audio signal classification. Its output features are related to Cohen's class of time-frequency distributions. We introduce an adaptive DCTNet (A-DCTNet) for audio signal feature extraction. The A-DCTNet applies the idea of the constant-Q transform, with the center frequencies of its filterbanks geometrically spaced. The A-DCTNet adapts to different acoustic scales, and it captures low-frequency acoustic information, to which human auditory perception is sensitive, better than features such as Mel-frequency spectral coefficients (MFSC). We use the features extracted by the A-DCTNet as input to classifiers. Experimental results show that the A-DCTNet combined with recurrent neural networks (RNN) achieves a state-of-the-art bird song classification rate and improves artist identification accuracy on music data, demonstrating the A-DCTNet's applicability to signal processing problems.
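To make the constant-Q idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation: it generates geometrically spaced center frequencies and a windowed cosine (DCT-style) filterbank. All function names, parameter values (`f_min`, `bins_per_octave`, `sample_rate`, `filter_length`), and the Hann-window choice are illustrative assumptions.

```python
# Sketch only: geometrically spaced (constant-Q style) center frequencies
# feeding a cosine filterbank, in the spirit of the A-DCTNet front end.
import numpy as np

def constant_q_center_frequencies(f_min=32.7, f_max=8000.0, bins_per_octave=12):
    """Center frequencies spaced geometrically: f_k = f_min * 2**(k / bins_per_octave)."""
    n_bins = int(np.floor(bins_per_octave * np.log2(f_max / f_min))) + 1
    k = np.arange(n_bins)
    return f_min * 2.0 ** (k / bins_per_octave)

def cosine_filterbank(center_freqs, sample_rate=16000, filter_length=512):
    """One windowed cosine filter per center frequency (assumed design choice)."""
    n = np.arange(filter_length)
    window = np.hanning(filter_length)
    return np.stack([
        window * np.cos(2.0 * np.pi * f * n / sample_rate)
        for f in center_freqs
    ])

if __name__ == "__main__":
    freqs = constant_q_center_frequencies()
    bank = cosine_filterbank(freqs)
    # Convolving audio frames with each filter yields a time-frequency feature map
    # whose frequency axis is denser at low frequencies, per the constant-Q spacing.
    print(bank.shape)  # (number of filters, filter length)
```

Because the spacing is geometric, low frequencies receive proportionally more filters, which is consistent with the abstract's claim that the A-DCTNet better captures perceptually important low-frequency information.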
Original language | English |
---|---|
Title of host publication | 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
Publisher | IEEE |
Pages | 3999-4003 |
Number of pages | 5 |
ISBN (Electronic) | 9781509041176, 9781509041169 |
ISBN (Print) | 9781509041183 |
DOIs | |
Publication status | Published - Mar 2017 |
Event | 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017 - New Orleans, LA, United States. Duration: 5 Mar 2017 → 9 Mar 2017. https://ieeexplore.ieee.org/xpl/conhome/7943262/proceeding |
Publication series
Name | International Conference on Acoustics, Speech, and Signal Processing (ICASSP) |
---|---|
ISSN (Electronic) | 2379-190X |
Conference
Conference | 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017 |
---|---|
Country/Territory | United States |
City | New Orleans, LA |
Period | 5/03/17 → 9/03/17 |
Internet address | https://ieeexplore.ieee.org/xpl/conhome/7943262/proceeding |
User-Defined Keywords
- Adaptive DCTNet
- audio signals
- time-frequency analysis
- RNN
- feature extraction