Dystri: A Dynamic Inference based Distributed DNN Service Framework on Edge

Xueyu Hou, Yongjie Guan, Tao Han

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Deep neural network (DNN) inference poses unique challenges in serving computational requests due to high request intensity, concurrent multi-user scenarios, and heterogeneous service types. Simultaneously, mobile and edge devices provide users with enhanced computational capabilities, enabling them to utilize local resources for deep inference processing. Moreover, dynamic inference techniques allow content-based computational cost selection per request. This paper presents Dystri, an innovative framework devised to facilitate dynamic inference on distributed edge infrastructure, thereby accommodating multiple heterogeneous users. Dystri offers broad applicability in practical environments, encompassing heterogeneous device types, DNN-based applications, and dynamic inference techniques, surpassing the state-of-the-art (SOTA) approaches. With distributed controllers and a global coordinator, Dystri allows per-request, per-user adjustments of quality-of-service, ensuring instantaneous, flexible, and discrete control. The decoupled workflows in Dystri naturally support user heterogeneity and scalability, addressing crucial aspects overlooked by existing SOTA works. Our evaluation involves three multi-user, heterogeneous DNN inference service platforms deployed on distributed edge infrastructure, encompassing seven DNN applications. Results show that Dystri achieves near-zero deadline misses and excels in adapting to varying user numbers and request intensities. Dystri outperforms baselines with accuracy improvements of up to 95×.

Original language: English (US)
Title of host publication: 52nd International Conference on Parallel Processing, ICPP 2023 - Main Conference Proceedings
Publisher: Association for Computing Machinery
Pages: 625-634
Number of pages: 10
ISBN (Electronic): 9798400708435
State: Published - Aug 7 2023
Event: 52nd International Conference on Parallel Processing, ICPP 2023 - Salt Lake City, United States
Duration: Aug 7 2023 to Aug 10 2023

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 52nd International Conference on Parallel Processing, ICPP 2023
Country/Territory: United States
City: Salt Lake City
Period: 8/7/23 to 8/10/23

All Science Journal Classification (ASJC) codes

  • Human-Computer Interaction
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Software

Keywords

  • MLaaS
  • dynamic inference
  • edge computing
