TY - GEN
T1 - Enabling Exploratory Large Scale Graph Analytics through Arkouda
AU - Du, Zhihui
AU - Rodriguez, Oliver Alvarado
AU - Bader, David A.
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Exploratory graph analytics helps maximize the informational value from a graph. However, increasing graph sizes make it impossible for existing popular exploratory data analysis tools to handle dozens of terabytes or even larger data sets in the memory of a common laptop or personal computer. Arkouda is a framework under early development that brings together the productivity of Python on the user side with the high performance of Chapel on the server side. In this paper, we present our initial work on overcoming the memory limit and high-performance computing coding roadblocks for high-level Python users to perform large graph analyses. Based on a simple and succinct graph data structure, a high-level Chapel-based graph algorithm, Breadth-First Search (BFS), is presented to show a productive method for developing scalable, parallel graph algorithms through Arkouda. The reverse Cuthill-McKee (RCM) algorithm is implemented in Chapel to relabel the vertices of a graph as a preprocessing step to improve the performance of BFS, and a low-level BFS algorithm is also developed for comparison with the performance of the high-level method. Both synthetic graphs and typical graph benchmarks are used to evaluate the performance of the provided graph algorithms. The experimental results show that, based on the proposed high-level algorithm framework, the performance of BFS can be improved significantly and easily by selecting suitable Chapel high-level data structures and parallel constructs. Our code is open source and available from GitHub (https://github.com/Bader-Research/arkouda).
KW - Breadth-First Search
KW - Exploratory graph analysis
KW - High-Performance Computing
KW - Parallel graph algorithms
UR - http://www.scopus.com/inward/record.url?scp=85123488201&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85123488201&partnerID=8YFLogxK
U2 - 10.1109/HPEC49654.2021.9622860
DO - 10.1109/HPEC49654.2021.9622860
M3 - Conference contribution
AN - SCOPUS:85123488201
T3 - 2021 IEEE High Performance Extreme Computing Conference, HPEC 2021
BT - 2021 IEEE High Performance Extreme Computing Conference, HPEC 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE High Performance Extreme Computing Conference, HPEC 2021
Y2 - 20 September 2021 through 24 September 2021
ER -