TY - GEN
T1 - Active Learning for Efficient Audio Annotation and Classification with a Large Amount of Unlabeled Data
AU - Wang, Yu
AU - Mendez Mendez, Ana Elisa
AU - Cartwright, Mark
AU - Bello, Juan Pablo
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/5
Y1 - 2019/5
N2 - There are many sound classification problems that have target classes which are rare or unique to the context of the problem. For these problems, existing datasets are not sufficient and we must create new problem-specific datasets to train classification models. However, annotating a new dataset for every new problem is costly. Active learning could potentially reduce this annotation cost, but it has been understudied in the context of audio annotation. In this work, we investigate active learning to reduce the annotation cost of a sound classification dataset unique to a particular problem. We evaluate three certainty-based active learning query strategies and propose a new strategy: alternating confidence sampling. Using this strategy, we demonstrate reduced annotation costs when actively training models with both experts and non-experts, and we perform a qualitative analysis on 20k unlabeled recordings to show our approach results in a model that generalizes well to unseen data.
AB - There are many sound classification problems that have target classes which are rare or unique to the context of the problem. For these problems, existing datasets are not sufficient and we must create new problem-specific datasets to train classification models. However, annotating a new dataset for every new problem is costly. Active learning could potentially reduce this annotation cost, but it has been understudied in the context of audio annotation. In this work, we investigate active learning to reduce the annotation cost of a sound classification dataset unique to a particular problem. We evaluate three certainty-based active learning query strategies and propose a new strategy: alternating confidence sampling. Using this strategy, we demonstrate reduced annotation costs when actively training models with both experts and non-experts, and we perform a qualitative analysis on 20k unlabeled recordings to show our approach results in a model that generalizes well to unseen data.
KW - active learning
KW - audio annotations
KW - machine listening
KW - sound classification
UR - http://www.scopus.com/inward/record.url?scp=85068984170&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85068984170&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2019.8683063
DO - 10.1109/ICASSP.2019.8683063
M3 - Conference contribution
AN - SCOPUS:85068984170
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 880
EP - 884
BT - 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
Y2 - 12 May 2019 through 17 May 2019
ER -