FEW-SHOT DRUM TRANSCRIPTION IN POLYPHONIC MUSIC

Yu Wang, Justin Salamon, Mark Cartwright, Nicholas J. Bryan, Juan Pablo Bello

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Data-driven approaches to automatic drum transcription (ADT) are often limited to a predefined, small vocabulary of percussion instrument classes. Such models cannot recognize out-of-vocabulary classes nor are they able to adapt to finer-grained vocabularies. In this work, we address open vocabulary ADT by introducing few-shot learning to the task. We train a Prototypical Network on a synthetic dataset and evaluate the model on multiple real-world ADT datasets with polyphonic accompaniment. We show that, given just a handful of selected examples at inference time, we can match and in some cases outperform a state-of-the-art supervised ADT approach under a fixed vocabulary setting. At the same time, we show that our model can successfully generalize to finer-grained or extended vocabularies unseen during training, a scenario where supervised approaches cannot operate at all. We provide a detailed analysis of our experimental results, including a breakdown of performance by sound class and by polyphony.
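To make the few-shot setup concrete, the following is a minimal sketch of the Prototypical Network classification rule the abstract refers to: embed the handful of labeled support examples, average them per class into prototypes, and assign each query to the nearest prototype. The embedding function, class names, and 2-D vectors here are illustrative placeholders, not the paper's actual model or features.

```python
import numpy as np

def prototypes(support_emb, support_labels):
    """Average the support embeddings per class to form one prototype each."""
    classes = sorted(set(support_labels))
    labels = np.array(support_labels)
    return classes, np.stack([support_emb[labels == c].mean(axis=0) for c in classes])

def classify(query_emb, support_emb, support_labels):
    """Assign each query embedding to its nearest prototype (Euclidean distance)."""
    classes, protos = prototypes(support_emb, support_labels)
    d = np.linalg.norm(query_emb[:, None, :] - protos[None, :, :], axis=-1)
    return [classes[i] for i in d.argmin(axis=1)]

# Toy 2-D "embeddings": two hypothetical drum classes, two support examples each.
support = np.array([[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 1.1]])
labels = ["kick", "kick", "snare", "snare"]
queries = np.array([[0.05, 0.0], [1.0, 0.95]])
print(classify(queries, support, labels))  # → ['kick', 'snare']
```

Because the class set is defined entirely by the support examples supplied at inference time, the same trained embedding can be applied to finer-grained or extended vocabularies without retraining, which is the open-vocabulary property the abstract highlights.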

Original language: English (US)
Title of host publication: Proceedings of the International Society for Music Information Retrieval Conference
Publisher: International Society for Music Information Retrieval
Pages: 117-124
Number of pages: 8
State: Published - 2020
Externally published: Yes

Publication series

Name: Proceedings of the International Society for Music Information Retrieval Conference
Volume: 2020
ISSN (Electronic): 3006-3094

All Science Journal Classification (ASJC) codes

  • Music
  • Artificial Intelligence
  • Human-Computer Interaction
  • Signal Processing
