The multi-instrument transcription task refers to the joint recognition of the instrument and pitch of every event in polyphonic music signals generated by one or more classes of musical instruments. In this paper, we leverage multi-object semantic segmentation techniques to solve this problem. We design a multi-channel time-frequency representation that jointly represents the harmonic structure and pitch saliency of a pitch activation. The transcription task therefore becomes a pixel-wise multi-task classification problem comprising pitch activity detection and instrument recognition. Experiments on both single- and multi-instrument data verify the competitiveness of the proposed method.
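To make the pixel-wise framing concrete, the following is a minimal sketch of how a per-pixel class score map could be decoded into (frame, pitch bin, instrument) events. It is not the paper's implementation: the function name `decode_label_map`, the `scores[class][frame][pitch_bin]` layout, the use of label 0 for "no pitch activity", and the toy scores are all assumptions made for illustration.

```python
from typing import List, Tuple

BACKGROUND = 0  # assumed label meaning "no pitch activity" at this pixel


def decode_label_map(scores: List[List[List[float]]]) -> List[Tuple[int, int, int]]:
    """Turn per-pixel class scores into (frame, pitch_bin, instrument) events.

    scores[c][t][f] is the (hypothetical) score of class c at time frame t
    and pitch bin f. Class 0 is background; classes 1..C-1 are instruments.
    """
    num_classes = len(scores)
    num_frames = len(scores[0])
    num_bins = len(scores[0][0])
    events = []
    for t in range(num_frames):
        for f in range(num_bins):
            # Pixel-wise classification: pick the highest-scoring class.
            # Ties resolve to the lowest class index, so background wins ties.
            best = max(range(num_classes), key=lambda c: scores[c][t][f])
            if best != BACKGROUND:
                # A non-background label means the pitch is active
                # (pitch activity detection) and names the instrument
                # (instrument recognition) in a single decision.
                events.append((t, f, best))
    return events


def make_toy_scores() -> List[List[List[float]]]:
    """Toy input: background plus 2 hypothetical instruments, 4 frames, 5 bins."""
    scores = [[[0.0] * 5 for _ in range(4)] for _ in range(3)]
    for t in range(4):
        for f in range(5):
            scores[0][t][f] = 0.5  # background is the default winner
    for t in (0, 1):
        scores[1][t][2] = 0.9      # instrument 1 active at pitch bin 2
    for t in (1, 2, 3):
        scores[2][t][4] = 0.8      # instrument 2 active at pitch bin 4
    return scores


events = decode_label_map(make_toy_scores())
print(events)
```

In this toy case both instruments sound at the same frame (frame 1) at different pitch bins, which is the situation the joint pitch-and-instrument labeling is meant to resolve. A real system would produce the scores with a segmentation network and would typically group adjacent active pixels into notes.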
|Title of host publication||2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings|
|Publisher||Institute of Electrical and Electronics Engineers Inc.|
|Number of pages||5|
|Publication status||Published - 2019 May|
|Event||44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Brighton, United Kingdom (Duration: 2019 May 12 → 2019 May 17)|
|Name||ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings|
|Conference||44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019|
|Period||19/5/12 → 19/5/17|
Keywords
- Automatic music transcription
- multipitch estimation
- semantic segmentation
ASJC Scopus subject areas
- Signal Processing
- Electrical and Electronic Engineering