TY - GEN
T1 - HeteroEdge
T2 - 20th IEEE International Conference on Mobile Ad Hoc and Smart Systems, MASS 2023
AU - Anwar, Mohammad Saeid
AU - Dey, Emon
AU - Devnath, Maloy Kumar
AU - Ghosh, Indrajeet
AU - Khan, Naima
AU - Freeman, Jade
AU - Gregory, Timothy
AU - Suri, Niranjan
AU - Jayarajah, Kasthuri
AU - Ramamurthy, Sreenivasan Ramasamy
AU - Roy, Nirmalya
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Gathering knowledge about surroundings and generating situation awareness for autonomous systems are of utmost importance for systems developed for smart urban and uncontested environments. For example, a large area surveillance system is typically equipped with multi-modal sensors such as cameras and LIDARs and is required to execute deep learning algorithms for action, face, behavior, and object recognition. However, these systems are subject to power and memory limitations due to their ubiquitous nature. As a result, optimizing how the sensed data is processed and fed to the deep learning algorithms, and how the model inferences are communicated, is critical. In this paper, we consider a testbed comprising two Unmanned Ground Vehicles (UGVs) and two NVIDIA Jetson devices and posit a self-adaptive optimization framework that is capable of navigating the workload of multiple simultaneous tasks (storage, processing, computation, transmission, inference) collaboratively across multiple heterogeneous nodes. The self-adaptive optimization framework involves compressing and masking the input image frames, identifying similar frames, and profiling the devices for various tasks to obtain the boundary conditions for the optimization framework. Finally, we propose and optimize a novel parameter, split-ratio, which indicates the proportion of the data required to be offloaded to another device while considering the networking bandwidth, busy factor, memory (CPU, GPU, RAM), and power constraints of the devices in the testbed. Our evaluations, captured while executing multiple tasks (e.g., PoseNet, SegNet, ImageNet, DetectNet, DepthNet) simultaneously, reveal that executing 70% (split-ratio = 70%) of the data on the auxiliary node reduces the offloading latency by ≈ 33% (18.7 ms/image to 12.5 ms/image) and the total operation time by ≈ 47% (69.32 s to 36.43 s) compared to the baseline configuration (executing on the primary node).
KW - Autonomous Systems
KW - Collaborative Systems
KW - Deep Edge Intelligence
UR - http://www.scopus.com/inward/record.url?scp=85178509179&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85178509179&partnerID=8YFLogxK
U2 - 10.1109/MASS58611.2023.00077
DO - 10.1109/MASS58611.2023.00077
M3 - Conference contribution
AN - SCOPUS:85178509179
T3 - Proceedings - 2023 IEEE 20th International Conference on Mobile Ad Hoc and Smart Systems, MASS 2023
SP - 575
EP - 583
BT - Proceedings - 2023 IEEE 20th International Conference on Mobile Ad Hoc and Smart Systems, MASS 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 25 September 2023 through 27 September 2023
ER -