TY - GEN
T1 - Local summarization and multi-level LSH for retrieving multi-variant audio tracks
AU - Yu, Yi
AU - Crucianu, Michel
AU - Oria, Vincent
AU - Chen, Lei
N1 - Copyright 2010 Elsevier B.V., All rights reserved.
PY - 2009
Y1 - 2009
AB - In this paper we study the problem of detecting and grouping multi-variant audio tracks in large audio datasets. To address this issue, a fast and reliable retrieval method is necessary. But reliability requires elaborate representations of audio content, which challenges fast retrieval by similarity from a large audio database. To find a better tradeoff between retrieval quality and efficiency, we put forward an approach relying on local summarization and multi-level Locality-Sensitive Hashing (LSH). More precisely, each audio track is divided into multiple Continuously Correlated Periods (CCP) of variable length according to spectral similarity. The description for each CCP is calculated based on its Weighted Mean Chroma (WMC). A track is thus represented as a sequence of WMCs. Then, an adapted two-level LSH is employed for efficiently delineating a narrow relevant search region. The "coarse" hashing level restricts search to items having a non-negligible similarity to the query. The subsequent, "refined" level only returns items showing a much higher similarity. Experimental evaluations performed on a real multi-variant audio dataset confirm that our approach supports fast and reliable retrieval of audio track variants.
KW - Local audio summarization
KW - Multi-level LSH
KW - Multi-variant musical audio search
UR - http://www.scopus.com/inward/record.url?scp=72449178943&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=72449178943&partnerID=8YFLogxK
U2 - 10.1145/1631272.1631320
DO - 10.1145/1631272.1631320
M3 - Conference contribution
AN - SCOPUS:72449178943
SN - 9781605586083
T3 - MM'09 - Proceedings of the 2009 ACM Multimedia Conference, with Co-located Workshops and Symposiums
SP - 341
EP - 350
BT - MM'09 - Proceedings of the 2009 ACM Multimedia Conference, with Co-located Workshops and Symposiums
T2 - 17th ACM International Conference on Multimedia, MM'09, with Co-located Workshops and Symposiums
Y2 - 19 October 2009 through 24 October 2009
ER -