Towards real-time embodied AI agent: a bionic visual encoding framework for mobile robotics

Xueyu Hou, Yongjie Guan, Tao Han, Cong Wang

Research output: Contribution to journal › Article › peer-review

Abstract

Embodied artificial intelligence (AI) agents, which navigate and interact with their environment using sensors and actuators, are being applied to mobile robotic platforms with limited computing power, such as autonomous vehicles, drones, and humanoid robots. These systems make decisions based on environmental perception from deep neural network (DNN)-based visual encoders. However, the constrained computational resources and the large amounts of visual data to be processed can create bottlenecks, such as taking almost 300 milliseconds per decision on an embedded GPU board (Jetson Xavier). Existing DNN acceleration methods need model retraining and can still reduce accuracy. To address these challenges, our paper introduces a bionic visual encoder framework, Robye, to support the real-time requirements of embodied AI agents. The proposed framework complements existing DNN acceleration techniques. Specifically, we integrate motion data to identify overlapping areas between consecutive frames, which reduces DNN workload by propagating encoding results. We also split processing into high-resolution encoding for task-critical areas and low-resolution encoding for less significant regions. This dual-resolution approach allows us to maintain task performance while lowering the overall computational demands. We evaluate Robye across three robotic scenarios: autonomous driving, vision-and-language navigation, and drone navigation, using various DNN models and mobile platforms. Robye outperforms baselines in speed (1.2–3.3×), performance (+4% to +29%), and power consumption (-36% to -47%).
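
The two ideas in the abstract (propagating encodings across the overlap between consecutive frames, and encoding less significant regions at lower resolution) can be illustrated with a toy sketch. This is not the authors' implementation: the 1-D row of patches, the integer `shift` standing in for motion data, the `critical` mask, and the helpers `encode_patch`, `downsample`, and `encode_frame` are all hypothetical names chosen for illustration, and a simple mean replaces the DNN encoder.

```python
from typing import List, Optional, Sequence, Tuple


def encode_patch(patch: Sequence[float]) -> float:
    """Stand-in for an expensive DNN encoder (assumption: mean of pixels)."""
    return sum(patch) / len(patch)


def downsample(patch: Sequence[float], factor: int = 2) -> Sequence[float]:
    """Low-resolution view of a patch: keep every `factor`-th pixel."""
    return patch[::factor]


def encode_frame(
    patches: List[Sequence[float]],
    prev_features: Optional[List[float]],
    shift: int,
    critical: List[bool],
) -> Tuple[List[float], int]:
    """Encode one frame, reusing features from the previous frame where possible.

    `shift` is the (hypothetical) motion estimate: patch i of this frame
    shows the same scene content as patch i + shift of the previous frame.
    Returns the features and the number of pixels actually fed to the encoder.
    """
    features: List[float] = []
    pixels_encoded = 0
    for i, patch in enumerate(patches):
        j = i + shift
        if prev_features is not None and 0 <= j < len(prev_features):
            # Overlapping area: propagate the earlier encoding result.
            features.append(prev_features[j])
        else:
            # New area: full resolution if task-critical, otherwise low resolution.
            view = patch if critical[i] else downsample(patch)
            features.append(encode_patch(view))
            pixels_encoded += len(view)
    return features, pixels_encoded


# Two consecutive frames; the scene moves left by one patch between them.
frame0 = [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]]
frame1 = [[2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5]]
critical = [False, True, True, False]  # centre patches are task-critical

f0, cost0 = encode_frame(frame0, None, 0, critical)
f1, cost1 = encode_frame(frame1, f0, 1, critical)
print(f1)            # [2.0, 3.0, 4.0, 5.0]
print(cost0, cost1)  # 12 2
```

In this toy run the second frame needs only the newly visible patch to be encoded, and at low resolution because it is not marked critical. One simplification is that a feature computed at low resolution may be reused for a patch that has since become critical; a fuller version would track the resolution of cached features and re-encode in that case.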

Original language: English (US)
Article number: 104184
Pages (from-to): 1038-1056
Number of pages: 19
Journal: International Journal of Intelligent Robotics and Applications
Volume: 8
Issue number: 4
DOIs
State: Published - Dec 2024

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Artificial Intelligence

Keywords

  • Computer vision
  • Embodied AI
  • Mobile robotics
  • Visual encoding
