TY - GEN
T1 - Constraint-Aware Deep Reinforcement Learning for End-to-End Resource Orchestration in Mobile Networks
AU - Liu, Qiang
AU - Choi, Nakjung
AU - Han, Tao
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Network slicing is a promising technology that allows mobile network operators to efficiently serve various emerging use cases in 5G. It is challenging to optimize the utilization of network infrastructures while guaranteeing the performance of network slices according to service level agreements (SLAs). To solve this problem, we propose SafeSlicing, which introduces a new constraint-aware deep reinforcement learning (CaDRL) algorithm to learn the optimal resource orchestration policy in two steps: offline training in a simulated environment and online learning in the real network system. When optimizing resource orchestration, we incorporate the constraints on the statistical performance of slices into the reward function using Lagrangian multipliers, and solve the Lagrangian-relaxed problem via a policy network. To satisfy the constraints on the system capacity, we design a constraint network that maps the latent actions generated by the policy network to orchestration actions such that the total resources allocated to network slices do not exceed the system capacity. We prototype SafeSlicing on an end-to-end testbed built using OpenAirInterface LTE, OpenDaylight-based SDN, and the CUDA GPU computing platform. The experimental results show that SafeSlicing reduces resource usage by more than 20% while meeting the SLAs of network slices, as compared with other solutions.
KW - Constraint-Awareness
KW - Deep Reinforcement Learning
KW - End-to-End Slicing
KW - Resource Orchestration
UR - http://www.scopus.com/inward/record.url?scp=85124219565&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85124219565&partnerID=8YFLogxK
U2 - 10.1109/ICNP52444.2021.9651934
DO - 10.1109/ICNP52444.2021.9651934
M3 - Conference contribution
AN - SCOPUS:85124219565
T3 - Proceedings - International Conference on Network Protocols, ICNP
BT - 2021 IEEE 29th International Conference on Network Protocols, ICNP 2021
PB - IEEE Computer Society
T2 - 29th IEEE International Conference on Network Protocols, ICNP 2021
Y2 - 1 November 2021 through 5 November 2021
ER -