TY - GEN
T1 - Resources-Conscious Asynchronous High-Speed Data Transfer in Multicore Systems
T2 - 29th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2015
AU - Li, Tan
AU - Ren, Yufei
AU - Yu, Dantong
AU - Jin, Shudong
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2015/7/17
Y1 - 2015/7/17
N2 - One constant challenge in multicore systems is to fully utilize the abundant resources while assuring superior performance for individual tasks, particularly in Non-Uniform Memory Access (NUMA) systems where locality of access is an important factor. Achieving this goal requires rethinking how to exploit parallel data access and I/O-related optimizations. In the context of developing software for high-speed data transfer, we offer a novel design based on asynchronous processing and detail the advantages of resources-conscious task scheduling. In our design, multiple sets of threads are allocated to the different stages of the processing pipeline, including storage I/O and network communication operations, according to the capacity of the underlying resources. The threads in these stages execute asynchronously and communicate efficiently via localized mechanisms in NUMA systems, e.g., task grouping, buffer memory, and locks. With this design, multiple effective optimizations are seamlessly integrated, particularly for improving the performance and scalability of end-to-end data transfer. To validate the benefits of the design and its optimizations, we conducted extensive experiments on state-of-the-art multicore systems. Our results highlight the performance advantages of our software across typical workloads, compared to the widely adopted data transfer tools GridFTP and BBCP.
AB - One constant challenge in multicore systems is to fully utilize the abundant resources while assuring superior performance for individual tasks, particularly in Non-Uniform Memory Access (NUMA) systems where locality of access is an important factor. Achieving this goal requires rethinking how to exploit parallel data access and I/O-related optimizations. In the context of developing software for high-speed data transfer, we offer a novel design based on asynchronous processing and detail the advantages of resources-conscious task scheduling. In our design, multiple sets of threads are allocated to the different stages of the processing pipeline, including storage I/O and network communication operations, according to the capacity of the underlying resources. The threads in these stages execute asynchronously and communicate efficiently via localized mechanisms in NUMA systems, e.g., task grouping, buffer memory, and locks. With this design, multiple effective optimizations are seamlessly integrated, particularly for improving the performance and scalability of end-to-end data transfer. To validate the benefits of the design and its optimizations, we conducted extensive experiments on state-of-the-art multicore systems. Our results highlight the performance advantages of our software across typical workloads, compared to the widely adopted data transfer tools GridFTP and BBCP.
KW - Asynchronous processing
KW - High-speed data transfer
KW - Input/Output
KW - Parallelism
UR - http://www.scopus.com/inward/record.url?scp=84971421964&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84971421964&partnerID=8YFLogxK
U2 - 10.1109/IPDPS.2015.65
DO - 10.1109/IPDPS.2015.65
M3 - Conference contribution
AN - SCOPUS:84971421964
T3 - Proceedings - 2015 IEEE 29th International Parallel and Distributed Processing Symposium, IPDPS 2015
SP - 1097
EP - 1106
BT - Proceedings - 2015 IEEE 29th International Parallel and Distributed Processing Symposium, IPDPS 2015
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 25 May 2015 through 29 May 2015
ER -