NeuLens: Spatial-based Dynamic Acceleration of Convolutional Neural Networks on Edge

Xueyu Hou, Yongjie Guan, Tao Han

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Scopus citations

Abstract

Convolutional neural networks (CNNs) play an important role in today's mobile and edge computing systems for vision-based tasks like object classification and detection. However, state-of-the-art methods for CNN acceleration either deliver limited practical latency speed-up on general computing platforms or achieve latency speed-up at the cost of severe accuracy loss. In this paper, we propose a spatial-based dynamic CNN acceleration framework, NeuLens, for mobile and edge platforms. Specifically, we design a novel dynamic inference mechanism, the assemble region-aware convolution (ARAC) supernet, that peels off as many redundant operations inside CNN models as possible based on spatial redundancy and channel slicing. In the ARAC supernet, the CNN inference flow is split into multiple independent micro-flows, and the computational cost of each can be autonomously adjusted based on its tiled-input content and application requirements. These micro-flows can be loaded into hardware such as GPUs as single models. Consequently, the operation reduction translates well into latency speed-up and is compatible with hardware-level accelerations. Moreover, inference accuracy is well preserved by identifying critical regions on images and processing them at the original resolution with a large micro-flow. Based on our evaluation, NeuLens outperforms baseline methods by up to 58% latency reduction with the same accuracy and by up to 67.9% accuracy improvement under the same latency/memory constraints.
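The idea of per-tile micro-flows can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of pixel variance as a stand-in for identifying critical regions, the threshold, and the two (scale, channels) settings are all assumptions made only for illustration. The cost estimate relies on the standard observation that a convolution's operation count scales with spatial area and with the product of input and output channel counts.

```python
from statistics import pvariance


def split_into_tiles(image, tile):
    """Split a 2-D grayscale image (list of rows) into non-overlapping tiles."""
    tiles = {}
    for r in range(0, len(image), tile):
        for c in range(0, len(image[0]), tile):
            block = [row[c:c + tile] for row in image[r:r + tile]]
            tiles[(r // tile, c // tile)] = block
    return tiles


def assign_micro_flows(image, tile=2, threshold=100.0):
    """Choose a hypothetical micro-flow setting for each tile.

    Assumption: pixel variance is used as a crude proxy for a "critical
    region"; the paper's actual criterion is not given in the abstract.
    """
    plan = {}
    for key, block in split_into_tiles(image, tile).items():
        pixels = [p for row in block for p in row]
        if pvariance(pixels) > threshold:
            # Critical tile: original resolution, full-width (large) micro-flow.
            plan[key] = {"scale": 1.0, "channels": 1.0}
        else:
            # Redundant tile: downsampled input, sliced channels (illustrative values).
            plan[key] = {"scale": 0.5, "channels": 0.25}
    return plan


def relative_cost(plan):
    """Estimate convolution cost relative to running every tile at full size.

    Cost per tile ~ scale^2 (spatial area) * channels^2 (input x output channels).
    """
    per_tile = [p["scale"] ** 2 * p["channels"] ** 2 for p in plan.values()]
    return sum(per_tile) / len(per_tile)


# A 4x4 image whose top-left 2x2 tile is textured and whose other tiles are flat.
image = [
    [0, 255, 10, 10],
    [255, 0, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
plan = assign_micro_flows(image)
print(plan[(0, 0)], relative_cost(plan))
```

In this toy input only the textured tile keeps the full setting, so the estimated cost drops to roughly a quarter of the full-size cost; real savings would depend on the model and on how critical regions are actually selected.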

Original language: English (US)
Title of host publication: ACM MobiCom 2022 - Proceedings of the 2022 28th Annual International Conference on Mobile Computing and Networking
Publisher: Association for Computing Machinery
Pages: 186-199
Number of pages: 14
ISBN (Electronic): 9781450391818
DOIs
State: Published - Oct 14 2022
Event: 28th ACM Annual International Conference on Mobile Computing and Networking, MobiCom 2022 - Sydney, Australia
Duration: Oct 17 2022 to Oct 21 2022

Publication series

Name: Proceedings of the Annual International Conference on Mobile Computing and Networking, MOBICOM

Conference

Conference: 28th ACM Annual International Conference on Mobile Computing and Networking, MobiCom 2022
Country/Territory: Australia
City: Sydney
Period: 10/17/22 to 10/21/22

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Hardware and Architecture
  • Software

Keywords

  • convolutional neural networks
  • dynamic inference
  • edge computing
