TY - GEN
T1 - Cooperative Deep Reinforcement Learning for Multiple-group NB-IoT Networks Optimization
AU - Jiang, Nan
AU - Deng, Yansha
AU - Simeone, Osvaldo
AU - Nallanathan, Arumugam
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/5
Y1 - 2019/5
N2 - NarrowBand-Internet of Things (NB-IoT) is an emerging cellular-based technology that offers a range of flexible configurations for massive IoT radio access from groups of devices with heterogeneous requirements. A configuration specifies the amount of radio resources allocated to each group of devices for random access and for data transmission. Assuming no knowledge of the traffic statistics, the problem is to determine, in an online fashion at each Transmission Time Interval (TTI), the configurations that maximizes the long-term average number of IoT devices that are able to both access and deliver data. Given the complexity of optimal algorithms, a Cooperative Multi-Agent Deep Neural Network based Q-learning (CMA-DQN) approach is developed, whereby each DQN agent independently control a configuration variable for each group. The DQN agents are cooperatively trained in the same environment based on feedback regarding transmission outcomes. CMA-DQN is seen to considerably outperform conventional heuristic approaches based on load estimation.
AB - NarrowBand-Internet of Things (NB-IoT) is an emerging cellular-based technology that offers a range of flexible configurations for massive IoT radio access from groups of devices with heterogeneous requirements. A configuration specifies the amount of radio resources allocated to each group of devices for random access and for data transmission. Assuming no knowledge of the traffic statistics, the problem is to determine, in an online fashion at each Transmission Time Interval (TTI), the configurations that maximizes the long-term average number of IoT devices that are able to both access and deliver data. Given the complexity of optimal algorithms, a Cooperative Multi-Agent Deep Neural Network based Q-learning (CMA-DQN) approach is developed, whereby each DQN agent independently control a configuration variable for each group. The DQN agents are cooperatively trained in the same environment based on feedback regarding transmission outcomes. CMA-DQN is seen to considerably outperform conventional heuristic approaches based on load estimation.
KW - Deep Reinforcement Learning
KW - Multi-Agent
KW - NB-IoT
KW - Random Access
KW - Resource Configuration
UR - http://www.scopus.com/inward/record.url?scp=85068956808&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85068956808&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2019.8682697
DO - 10.1109/ICASSP.2019.8682697
M3 - Conference contribution
AN - SCOPUS:85068956808
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 8424
EP - 8428
BT - 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
Y2 - 12 May 2019 through 17 May 2019
ER -