Abstract
Mobile edge computing (MEC) is an evolving paradigm for delivering services through network-accessible resources deployed on Internet of Things (IoT) nodes at the network edge. An MEC environment, however, typically comprises thousands of physical machines connected via hundreds of switches and routers that must communicate and coordinate to deliver computing services. In systems of this complexity, faults caused by software, hardware, and human error are often unavoidable. The network edge is also a dynamic environment, with large numbers of terminals, highly mobile devices, heterogeneous applications, and intermittent traffic. Under these conditions, MEC can suffer from unbalanced resource provisioning and from interruptions caused by faults at different levels, which in turn lead to task failures and degraded service quality. To address this challenge, this work proposes a fault-tolerant offloading method built on a reinforcement-learning-based service offloading decision model. The model combines a Dueling Deep Q Network (DQN)-based algorithm for deciding user offloading behavior with an adaptive checkpointing method for improving task execution reliability. Extensive simulations are conducted for model validation and comparison, and the numerical results demonstrate that the proposed method is effective and outperforms existing methods.
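The abstract's two ingredients, a dueling Q-network for offloading decisions and checkpointing for reliability, can be illustrated with a minimal sketch. This is not the paper's implementation: the state features (e.g., task size, channel quality), the action set of offloading targets, the network sizes, and the use of Young's classical rule as a checkpoint-interval baseline are all assumptions made for illustration; the paper's adaptive checkpointing scheme and its reward and training design are not reproduced here.

```python
import math
import torch
import torch.nn as nn


class DuelingDQN(nn.Module):
    """Dueling architecture: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""

    def __init__(self, state_dim: int, num_actions: int, hidden: int = 128):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)                # state-value stream V(s)
        self.advantage = nn.Linear(hidden, num_actions)  # advantage stream A(s, a)

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        h = self.feature(state)
        v = self.value(h)       # shape: (batch, 1)
        a = self.advantage(h)   # shape: (batch, num_actions)
        # Subtracting the mean advantage makes the V/A decomposition identifiable.
        return v + a - a.mean(dim=1, keepdim=True)


def young_checkpoint_interval(checkpoint_cost: float, mtbf: float) -> float:
    """Young's approximation T ~ sqrt(2 * C * MTBF) for a near-optimal
    checkpoint interval; a static baseline, not the paper's adaptive scheme."""
    return math.sqrt(2.0 * checkpoint_cost * mtbf)


if __name__ == "__main__":
    # Hypothetical setup: 8 state features (task/channel attributes),
    # 4 actions (e.g., local execution or one of three edge servers).
    net = DuelingDQN(state_dim=8, num_actions=4)
    q = net(torch.randn(2, 8))    # Q-values for a batch of 2 states
    action = q.argmax(dim=1)      # greedy offloading decision per state
    print(q.shape, action, young_checkpoint_interval(2.0, 600.0))
```

The mean-subtraction in `forward` is the defining trait of the dueling architecture, letting the network learn state values independently of per-action advantages; an adaptive checkpointing method would tune the interval at runtime rather than fixing it with a static rule like the one above.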
| Original language | English (US) |
|---|---|
| Pages (from-to) | 17022-17033 |
| Number of pages | 12 |
| Journal | IEEE Transactions on Automation Science and Engineering |
| Volume | 22 |
| DOIs | |
| State | Published - 2025 |
All Science Journal Classification (ASJC) codes
- Control and Systems Engineering
- Electrical and Electronic Engineering
Keywords
- Mobile edge computing
- dueling Deep Q Network (DQN)
- fault tolerance
- reinforcement learning
- service offloading