Comparative analysis of hidden Markov models for multi-modal dialogue scene indexing

A. Aydin Alatan, Ali N. Akansu, Wayne Wolf

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

5 Scopus citations

Abstract

A class of audio-visual content is segmented into dialogue scenes using the state transitions of a novel hidden Markov model (HMM). Each shot is classified using both its audio track and its visual content to determine the state/scene transitions of the model. Simulations with circular and left-to-right HMM topologies show that both perform very well with multi-modal inputs. Moreover, for the circular topology, comparisons between different training and observation sets show that audio and face information together give the most consistent results across observation sets.
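The general idea of reading scene boundaries off HMM state transitions can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' model: the three state names, the three shot tokens, and every probability below are hypothetical values chosen for the example, and plain Viterbi decoding is assumed as the way to recover the state sequence.

```python
import math

def viterbi(obs, start_p, trans_p, emit_p):
    """Return the most likely state sequence for a list of observation indices."""
    n_states = len(start_p)
    log = lambda p: math.log(p) if p > 0 else float("-inf")
    # score[s] = best log-probability of any path ending in state s
    score = [log(start_p[s]) + log(emit_p[s][obs[0]]) for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev, score, ptr = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log(trans_p[r][s]))
            score.append(prev[best] + log(trans_p[best][s]) + log(emit_p[s][o]))
            ptr.append(best)
        back.append(ptr)
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

def scene_boundaries(path):
    """Shot indices at which the decoded state changes."""
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]

# Hypothetical states: 0 = establishing, 1 = dialogue, 2 = transitional.
# Hypothetical shot tokens: 0 = no speech/no face, 1 = speech + face, 2 = music/no face.
start = [1.0, 0.0, 0.0]
emit = [[0.8, 0.1, 0.1],
        [0.1, 0.8, 0.1],
        [0.1, 0.1, 0.8]]

# Circular topology: the last state may wrap around to the first.
circular = [[0.6, 0.4, 0.0],
            [0.0, 0.7, 0.3],
            [0.3, 0.0, 0.7]]

# Left-to-right topology: no return to an earlier state.
left_to_right = [[0.6, 0.4, 0.0],
                 [0.0, 0.7, 0.3],
                 [0.0, 0.0, 1.0]]

shots = [0, 1, 1, 1, 2, 0, 1, 1]  # one token per shot (toy data)

for name, trans in (("circular", circular), ("left-to-right", left_to_right)):
    path = viterbi(shots, start, trans, emit)
    print(name, path, scene_boundaries(path))
```

On this toy sequence the only structural difference between the two runs is the transition matrix: the circular matrix permits re-entering the first state after the last one, while the left-to-right matrix does not, so the decoded boundaries differ once a second scene begins.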

Original language: English (US)
Title of host publication: Image and Multidimensional Signal Processing; Multimedia Signal Processing
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2401-2404
Number of pages: 4
ISBN (Electronic): 0780362934
State: Published - 2000
Event: 25th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2000 - Istanbul, Turkey
Duration: Jun 5 2000 to Jun 9 2000

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 4
ISSN (Print): 1520-6149

Other

Other: 25th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2000
Country/Territory: Turkey
City: Istanbul
Period: 6/5/00 to 6/9/00

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
