TY - JOUR
T1 - Enhancing Proportional IO Sharing on Containerized Big Data File Systems
AU - Huang, Dan
AU - Wang, Jun
AU - Liu, Qing
AU - Xiao, Nong
AU - Wu, Huafeng
AU - Yin, Jiangling
N1 - Funding Information:
This project was supported in part by the US National Science Foundation under Grant CCF-1717388, 1907765 and 2028481, and Contract New Jersey Institute of Technology Research Startup.
Publisher Copyright:
© 1968-2012 IEEE.
PY - 2021/12/1
Y1 - 2021/12/1
AB - Big Data platforms have recently employed resource management systems, such as YARN, Mesos, and Google Borg, to provision computational resources. These systems adopt containerization to share computing resources in a multi-tenant setting with low performance overhead and interference. However, it may be observed that tenants often interfere with each other on the underlying Big Data File Systems (BDFS), e.g., Hadoop File System, which have been widely deployed as a persistent layer in current data centers. A solution with systematic generality is to containerize BDFS itself to isolate and allocate its IO resources to multiple tenants. To this end, we analyze the ineffectiveness of proportionally sharing BDFS IO resources via containerization. This ineffectiveness arises because the containerization scheduler is in a 'pseudo-starvation' status, in which most IO requests are backlogged in BDFS rather than in the containerization scheduler. Without enough backlogged IO requests, existing schedulers might have to maximize device utilization rather than enforce the proportional sharing policy. To resolve this issue, we develop a cross-layer system called BDFS-Container, which containerizes BDFS at the Linux block IO level. Central to BDFS-Container, we propose and design a proactive IOPS throttling-based mechanism named IOPS Regulator, which achieves a trade-off between maximizing IO utilization and accurate proportional IO sharing. The evaluation results show that our method can improve the proportional sharing of BDFS IO resources by 74.4 percent on average.
KW - Containerization
KW - big data storage
KW - hadoop file system
KW - resource sharing
UR - http://www.scopus.com/inward/record.url?scp=85098783193&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85098783193&partnerID=8YFLogxK
U2 - 10.1109/TC.2020.3037078
DO - 10.1109/TC.2020.3037078
M3 - Article
AN - SCOPUS:85098783193
SN - 0018-9340
VL - 70
SP - 2083
EP - 2097
JO - IEEE Transactions on Computers
JF - IEEE Transactions on Computers
IS - 12
ER -