CaptainCook4D: A Dataset for Understanding Errors in Procedural Activities

  • Rohith Peddi
  • Shivvrat Arya
  • Bharath Challa
  • Likhitha Pallapothula
  • Akshay Vyas
  • Bhavya Gouripeddi
  • Qifan Zhang
  • Jikai Wang
  • Vasundhara Komaragiri
  • Eric Ragan
  • Nicholas Ruozzi
  • Yu Xiang
  • Vibhav Gogate

Research output: Contribution to journal › Conference article › peer-review

3 Scopus citations

Abstract

Following step-by-step procedures is an essential component of various activities carried out by individuals in their daily lives. These procedures serve as a guiding framework that helps to achieve goals efficiently, whether it is assembling furniture or preparing a recipe. However, the complexity and duration of procedural activities inherently increase the likelihood of making errors. Understanding such procedural activities from a sequence of frames is a challenging task that demands an accurate interpretation of visual information and the ability to reason about the structure of the activity. To this end, we collect a new egocentric 4D dataset CaptainCook4D comprising 384 recordings (94.5 hours) of people performing recipes in real kitchen environments. This dataset consists of two distinct types of activities: one in which participants adhere to the provided recipe instructions and another in which they deviate and induce errors. We provide 5.3K step annotations and 10K fine-grained action annotations and benchmark the dataset for the following tasks: error recognition, multi-step localization and procedure learning.

Original language: English (US)
Journal: Advances in Neural Information Processing Systems
Volume: 37
State: Published - 2024
Externally published: Yes
Event: 38th Conference on Neural Information Processing Systems, NeurIPS 2024 - Vancouver, Canada
Duration: Dec 9 2024 → Dec 15 2024

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Information Systems
  • Computer Networks and Communications
