TY - JOUR
T1 - More Is Less
T2 - Model Augmentation for Intermittent Deep Inference
AU - Kang, Chih Kai
AU - Mendis, Hashan Roshantha
AU - Lin, Chun Han
AU - Chen, Ming Syan
AU - Hsiu, Pi Cheng
N1 - Publisher Copyright:
© 2022 Association for Computing Machinery.
PY - 2022/10/8
Y1 - 2022/10/8
N2 - Energy harvesting creates an emerging intermittent computing paradigm but poses new challenges for sophisticated applications such as intermittent deep neural network (DNN) inference. Although model compression has adapted DNNs to resource-constrained devices, under intermittent power, compressed models will still experience multiple power failures during a single inference. Footprint-based approaches enable hardware-accelerated intermittent DNN inference by tracking footprints, independent of model computations, to indicate accelerator progress across power cycles. However, we observe that the extra overhead required to preserve progress indicators can severely offset the computation progress accumulated by intermittent DNN inference. This work proposes the concept of model augmentation to adapt DNNs to intermittent devices. Our middleware stack, JAPARI, appends extra neural network components into a given DNN, to enable the accelerator to intrinsically integrate progress indicators into the inference process, without affecting model accuracy. Their specific positions allow progress indicator preservation to be piggybacked onto output feature preservation to amortize the extra overhead, and their assigned values ensure uniquely distinguishable progress indicators for correct inference recovery upon power resumption. Evaluations on a Texas Instruments device under various DNN models, capacitor sizes, and progress preservation granularities show that JAPARI can speed up intermittent DNN inference by 3× over the state of the art, for common convolutional neural architectures that require heavy acceleration.
AB - Energy harvesting creates an emerging intermittent computing paradigm but poses new challenges for sophisticated applications such as intermittent deep neural network (DNN) inference. Although model compression has adapted DNNs to resource-constrained devices, under intermittent power, compressed models will still experience multiple power failures during a single inference. Footprint-based approaches enable hardware-accelerated intermittent DNN inference by tracking footprints, independent of model computations, to indicate accelerator progress across power cycles. However, we observe that the extra overhead required to preserve progress indicators can severely offset the computation progress accumulated by intermittent DNN inference. This work proposes the concept of model augmentation to adapt DNNs to intermittent devices. Our middleware stack, JAPARI, appends extra neural network components into a given DNN, to enable the accelerator to intrinsically integrate progress indicators into the inference process, without affecting model accuracy. Their specific positions allow progress indicator preservation to be piggybacked onto output feature preservation to amortize the extra overhead, and their assigned values ensure uniquely distinguishable progress indicators for correct inference recovery upon power resumption. Evaluations on a Texas Instruments device under various DNN models, capacitor sizes, and progress preservation granularities show that JAPARI can speed up intermittent DNN inference by 3× over the state of the art, for common convolutional neural architectures that require heavy acceleration.
KW - Deep neural networks
KW - edge computing
KW - energy harvesting
KW - intermittent systems
KW - model adaptation
UR - http://www.scopus.com/inward/record.url?scp=85144136468&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85144136468&partnerID=8YFLogxK
U2 - 10.1145/3506732
DO - 10.1145/3506732
M3 - Article
AN - SCOPUS:85144136468
SN - 1539-9087
VL - 21
JO - ACM Transactions on Embedded Computing Systems
JF - ACM Transactions on Embedded Computing Systems
IS - 5
M1 - 3506732
ER -