Automatic Music Transcription Leveraging Generalized Cepstral Features and Deep Learning

Yu Te Wu, Berlin Chen, Li Su

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

Spectral features are limited in modeling musical signals with multiple concurrent pitches because it is difficult to suppress the interference between the harmonic peaks of one pitch and those of another. In this paper, we show that using multiple features represented in both the frequency and time domains, together with deep learning modeling, can reduce such interference. These features are derived systematically from conventional pitch detection functions that relate to one another through the discrete Fourier transform and a nonlinear scaling function. Neural networks trained on these features outperform state-of-the-art methods while using less training data.
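The abstract does not give the formulas, so the following is only a minimal sketch of one common reading of "features related through the discrete Fourier transform and a nonlinear scaling function": a power-scaled magnitude spectrum (frequency domain) and its inverse DFT, rectified and power-scaled again (time/lag domain). The function names, the exponents `gamma_spec` and `gamma_ceps`, the Hann window, and the test tone are all assumptions chosen for illustration, not values taken from the paper.

```python
import numpy as np


def power_scale(x, gamma):
    """Rectify x, then compress it with exponent gamma (assumed 0 < gamma <= 1)."""
    return np.maximum(x, 0.0) ** gamma


def spectrum_and_generalized_cepstrum(frame, gamma_spec=0.5, gamma_ceps=0.5):
    """Return (frequency-domain feature, lag-domain feature) for one frame.

    Hypothetical helper: the exponents are illustrative, not from the paper.
    """
    windowed = frame * np.hanning(len(frame))
    # Frequency-domain feature: power-scaled magnitude spectrum.
    spec = power_scale(np.abs(np.fft.fft(windowed)), gamma_spec)
    # Lag-domain feature: inverse DFT of the scaled spectrum,
    # rectified and power-scaled again.
    ceps = power_scale(np.real(np.fft.ifft(spec)), gamma_ceps)
    return spec, ceps


# Synthetic harmonic tone: 200 Hz fundamental with ten decaying harmonics.
fs, f0, n = 8000, 200.0, 2048
t = np.arange(n) / fs
tone = sum(np.sin(2 * np.pi * k * f0 * t) / k for k in range(1, 11))
spec, ceps = spectrum_and_generalized_cepstrum(tone)

# Skip the lowest bins/lags, where the DC and zero-lag terms dominate.
peak_bin = 20 + int(np.argmax(spec[20:n // 2]))
peak_lag = 20 + int(np.argmax(ceps[20:61]))
print("spectral peak (Hz):", peak_bin * fs / n)
print("lag-domain peak (Hz):", fs / peak_lag)
```

In this toy case both representations peak near the fundamental, but their spurious peaks differ: the spectrum also peaks at the harmonics (multiples of the fundamental frequency), while the lag-domain feature peaks at multiples of the period. That complementarity is the usual motivation for combining the two kinds of features.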

Original language: English
Title of host publication: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 401-405
Number of pages: 5
ISBN (Print): 9781538646588
DOIs: https://doi.org/10.1109/ICASSP.2018.8462079
Publication status: Published - 2018 Sep 10
Event: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Calgary, Canada
Duration: 2018 Apr 15 → 2018 Apr 20

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2018-April
ISSN (Print): 1520-6149

Conference

Conference: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018
Country: Canada
City: Calgary
Period: 18/4/15 → 18/4/20

Keywords

  • Automatic music transcription
  • Cepstrum
  • Convolutional neural networks
  • Deep learning

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering


Cite this

Wu, Y. T., Chen, B., & Su, L. (2018). Automatic Music Transcription Leveraging Generalized Cepstral Features and Deep Learning. In 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings (pp. 401-405). [8462079] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2018-April). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2018.8462079