TY - GEN
T1 - Investigating graph algorithms in the BSP model on the cray XMT
AU - Ediger, David
AU - Bader, David A.
PY - 2013
Y1 - 2013
N2 - Implementing parallel graph algorithms in large, shared memory machines, such as the Cray XMT, can be challenging for programmers. Synchronization, deadlock, hot spotting, and others can be barriers to obtaining linear scalability. Alternative programming models, such as the bulk synchronous parallel programming model used in Google's Pregel, have been proposed for large graph analytics. This model eases programmer complexity by eliminating deadlock and simplifying data sharing. We investigate the algorithmic effects of the BSP model for graph algorithms and compare performance and scalability with hand-tuned, open source software using GraphCT. We analyze the innermost iterations of these algorithms to understand the differences in scalability between BSP and shared memory algorithms. We show that scalable performance can be obtained with graph algorithms in the BSP model on the Cray XMT. These algorithms perform within a factor of 10 of hand-tuned C code.
AB - Implementing parallel graph algorithms in large, shared memory machines, such as the Cray XMT, can be challenging for programmers. Synchronization, deadlock, hot spotting, and others can be barriers to obtaining linear scalability. Alternative programming models, such as the bulk synchronous parallel programming model used in Google's Pregel, have been proposed for large graph analytics. This model eases programmer complexity by eliminating deadlock and simplifying data sharing. We investigate the algorithmic effects of the BSP model for graph algorithms and compare performance and scalability with hand-tuned, open source software using GraphCT. We analyze the innermost iterations of these algorithms to understand the differences in scalability between BSP and shared memory algorithms. We show that scalable performance can be obtained with graph algorithms in the BSP model on the Cray XMT. These algorithms perform within a factor of 10 of hand-tuned C code.
UR - http://www.scopus.com/inward/record.url?scp=84899706389&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84899706389&partnerID=8YFLogxK
U2 - 10.1109/IPDPSW.2013.107
DO - 10.1109/IPDPSW.2013.107
M3 - Conference contribution
AN - SCOPUS:84899706389
SN - 9780769549798
T3 - Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013
SP - 1638
EP - 1645
BT - Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013
PB - IEEE Computer Society
T2 - 2013 IEEE 37th Annual Computer Software and Applications Conference, COMPSAC 2013
Y2 - 22 July 2013 through 26 July 2013
ER -