TY - JOUR
T1 - On the impact of deep neural network calibration on adaptive edge offloading for image classification
AU - Pacheco, Roberto G.
AU - Couto, Rodrigo S.
AU - Simeone, Osvaldo
N1 - Funding Information:
Osvaldo Simeone (Fellow, IEEE) received the M.Sc. (with Hons.) degree and the Ph.D. degree in information engineering from Politecnico di Milano, Milan, Italy, in 2001 and 2005, respectively. He is currently a Professor of information engineering with the Centre for Telecommunications Research, Department of Engineering, King’s College London, London, U.K., where he directs the King’s Communications, Learning and Information Processing Lab. From 2006 to 2017, he was a Faculty Member with the Electrical and Computer Engineering Department, New Jersey Institute of Technology, Newark, NJ, USA, where he was affiliated with the Center for Wireless Information Processing. He has coauthored two monographs, two edited books published by Cambridge University Press, and more than one hundred research journal papers. His research interests include information theory, machine learning, wireless communications, and neuromorphic computing. He is currently on the Editorial Board of the IEEE SIGNAL PROCESSING MAGAZINE and the Chair of the Signal Processing for Communications and Networking Technical Committee of the IEEE Signal Processing Society. He was a Distinguished Lecturer of the IEEE Information Theory Society in 2017 and 2018, and he is currently a Distinguished Lecturer of the IEEE Communications Society. He was the co-recipient of the 2019 IEEE Communication Society Best Tutorial Paper Award, the 2018 IEEE Signal Processing Best Paper Award, the 2017 JCN Best Paper Award, the 2015 IEEE Communication Society Best Tutorial Paper Award, and the Best Paper Awards of IEEE SPAWC 2007 and IEEE WRECOM 2007. He was also the recipient of a Consolidator Grant from the European Research Council in 2016. His research has been supported by the U.S. NSF, the ERC, the Vienna Science and Technology Fund, and a number of industrial collaborations. He is a Fellow of the IET.
Funding Information:
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brasil (CAPES) – Finance Code 001. It was also supported by CNPq, PR2/UFRJ, FAPERJ Grants E-26/203.211/2017, E-26/010.002174/2019, and E-26/201.300/2021, and FAPESP Grant 15/24494-8. The work of O. Simeone was supported by the European Research Council (ERC) through the European Union's Horizon 2020 Research and Innovation Programme under Grant 725731, by an Open Fellowship of the EPSRC with reference EP/W024101/1, by the European Union's Horizon Europe project CENTRIC (101096379), and by Project REASON, a UK Government funded project under the Future Open Networks Research Challenge (FONRC) sponsored by the Department for Science, Innovation and Technology (DSIT).
Publisher Copyright:
© 2023 Elsevier Ltd
PY - 2023/8
Y1 - 2023/8
N2 - Edge devices can offload deep neural network (DNN) inference to the cloud to overcome energy or processing constraints. Nevertheless, offloading adds communication delay, which increases the overall inference time. An alternative is adaptive offloading based on early-exit DNNs. Early-exit DNNs have side branches inserted at the outputs of selected intermediate layers, and these branches provide confidence estimates. If the confidence level of the decision produced at a branch is sufficient, the inference terminates at that side branch. Otherwise, the edge device offloads the inference to the cloud, which executes the remaining DNN layers. The offloading decision thus depends on reliable confidence levels provided by the side branches at the device. This article provides an extensive calibration study across different datasets and early-exit DNNs for the image classification task. Our study shows that early-exit DNNs are often miscalibrated, overestimating their prediction confidence and thus making unreliable offloading decisions. To evaluate the impact of calibration on accuracy and latency, we introduce two novel application-level metrics and evaluate well-known DNN models in a realistic edge computing scenario. The results demonstrate that calibrating early-exit DNNs improves the probability of meeting accuracy and latency requirements.
AB - Edge devices can offload deep neural network (DNN) inference to the cloud to overcome energy or processing constraints. Nevertheless, offloading adds communication delay, which increases the overall inference time. An alternative is adaptive offloading based on early-exit DNNs. Early-exit DNNs have side branches inserted at the outputs of selected intermediate layers, and these branches provide confidence estimates. If the confidence level of the decision produced at a branch is sufficient, the inference terminates at that side branch. Otherwise, the edge device offloads the inference to the cloud, which executes the remaining DNN layers. The offloading decision thus depends on reliable confidence levels provided by the side branches at the device. This article provides an extensive calibration study across different datasets and early-exit DNNs for the image classification task. Our study shows that early-exit DNNs are often miscalibrated, overestimating their prediction confidence and thus making unreliable offloading decisions. To evaluate the impact of calibration on accuracy and latency, we introduce two novel application-level metrics and evaluate well-known DNN models in a realistic edge computing scenario. The results demonstrate that calibrating early-exit DNNs improves the probability of meeting accuracy and latency requirements.
KW - Deep neural network calibration
KW - Early-exit deep neural networks
KW - Edge computing
KW - Edge offloading
UR - http://www.scopus.com/inward/record.url?scp=85162180786&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85162180786&partnerID=8YFLogxK
U2 - 10.1016/j.jnca.2023.103679
DO - 10.1016/j.jnca.2023.103679
M3 - Article
AN - SCOPUS:85162180786
SN - 1084-8045
VL - 217
JO - Journal of Network and Computer Applications
JF - Journal of Network and Computer Applications
M1 - 103679
ER -