TY - GEN
T1 - Novelty measures as cues for temporal salience in audio similarity
AU - Cartwright, Mark
AU - Pardo, Bryan
PY - 2012
Y1 - 2012
N2 - Most algorithms for estimating audio similarity either completely disregard time or treat each moment in time equally. However, many studies over the years have noted several factors that affect how much attention we give to certain sounds or parts of sounds (e.g. loudness, the attack, novelty). These findings suggest that some time segments of audio may be more salient than others when making similarity judgments. We believe that if we could estimate this information, we could improve audio similarity measures. This paper presents the results of a human subject study designed to test the hypothesis that sound segments with high timbral change are more salient than segments with low timbral change. We then investigate whether we can use this information to improve two audio similarity measures: a "bag-of-frames" approach and a dynamic time warping approach.
AB - Most algorithms for estimating audio similarity either completely disregard time or treat each moment in time equally. However, many studies over the years have noted several factors that affect how much attention we give to certain sounds or parts of sounds (e.g. loudness, the attack, novelty). These findings suggest that some time segments of audio may be more salient than others when making similarity judgments. We believe that if we could estimate this information, we could improve audio similarity measures. This paper presents the results of a human subject study designed to test the hypothesis that sound segments with high timbral change are more salient than segments with low timbral change. We then investigate whether we can use this information to improve two audio similarity measures: a "bag-of-frames" approach and a dynamic time warping approach.
KW - Audio novelty measures
KW - Audio similarity
KW - Audio temporal salience
KW - Query-by-example
UR - http://www.scopus.com/inward/record.url?scp=84870572158&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84870572158&partnerID=8YFLogxK
U2 - 10.1145/2390848.2390862
DO - 10.1145/2390848.2390862
M3 - Conference contribution
AN - SCOPUS:84870572158
SN - 9781450315913
T3 - MIRUM 2012 - Proceedings of the 2nd International ACM Workshop on Music Information Retrieval with User-Centered and Multimodal Strategies, Co-located with ACM Multimedia 2012
SP - 51
EP - 56
BT - MIRUM 2012 - Proceedings of the 2nd International ACM Workshop on Music Information Retrieval with User-Centered and Multimodal Strategies, Co-located with ACM Multimedia 2012
T2 - 2nd International ACM Workshop on Music Information Retrieval with User-Centered and Multimodal Strategies, MIRUM 2012 - Co-located with ACM Multimedia 2012
Y2 - 2 November 2012 through 2 November 2012
ER -