TY - GEN
T1 - Contour Algorithm for Connectivity
AU - Du, Zhihui
AU - Rodriguez, Oliver Alvarado
AU - Li, Fuhuan
AU - Dindoost, Mohammad
AU - Bader, David A.
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Finding connected components in a graph is a fundamental problem in graph analysis. In this work, we present a novel minimum-mapping based Contour algorithm to efficiently solve the connectivity problem. We prove that the Contour algorithm with operators of order two or higher can identify all connected components of an undirected graph within O(log dmax) iterations, with each iteration involving O(m) work, where dmax represents the largest diameter among all components in the given graph, and m is the total number of edges in the graph. Importantly, each iteration is highly parallelizable, making use of the efficient minimum-mapping operator applied to all edges. To further enhance its practical performance, we optimize the Contour algorithm through asynchronous updates, early convergence checking, eliminating atomic operations, and choosing more efficient mapping operators. Our implementation of the Contour algorithm has been integrated into the open-source framework Arachne. Arachne extends Arkouda for large-scale interactive graph analytics, providing a Python API powered by the high-productivity parallel language Chapel. Experimental results on both real-world and synthetic graphs demonstrate the superior performance of our proposed Contour algorithm compared to the state-of-the-art large-scale parallel algorithm FastSV and the fastest shared-memory algorithm ConnectIt. On average, Contour achieves speedups of 7.3x and 1.4x over FastSV and ConnectIt, respectively. All code for the Contour algorithm and the Arachne framework is publicly available on GitHub (https://github.com/Bears-R-Us/arkouda-njit), ensuring transparency and reproducibility of our work.
AB - Finding connected components in a graph is a fundamental problem in graph analysis. In this work, we present a novel minimum-mapping based Contour algorithm to efficiently solve the connectivity problem. We prove that the Contour algorithm with operators of order two or higher can identify all connected components of an undirected graph within O(log dmax) iterations, with each iteration involving O(m) work, where dmax represents the largest diameter among all components in the given graph, and m is the total number of edges in the graph. Importantly, each iteration is highly parallelizable, making use of the efficient minimum-mapping operator applied to all edges. To further enhance its practical performance, we optimize the Contour algorithm through asynchronous updates, early convergence checking, eliminating atomic operations, and choosing more efficient mapping operators. Our implementation of the Contour algorithm has been integrated into the open-source framework Arachne. Arachne extends Arkouda for large-scale interactive graph analytics, providing a Python API powered by the high-productivity parallel language Chapel. Experimental results on both real-world and synthetic graphs demonstrate the superior performance of our proposed Contour algorithm compared to the state-of-the-art large-scale parallel algorithm FastSV and the fastest shared-memory algorithm ConnectIt. On average, Contour achieves speedups of 7.3x and 1.4x over FastSV and ConnectIt, respectively. All code for the Contour algorithm and the Arachne framework is publicly available on GitHub (https://github.com/Bears-R-Us/arkouda-njit), ensuring transparency and reproducibility of our work.
KW - big data
KW - connected components
KW - graph analytics
KW - parallel algorithm
UR - http://www.scopus.com/inward/record.url?scp=85190583419&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85190583419&partnerID=8YFLogxK
U2 - 10.1109/HiPC58850.2023.00022
DO - 10.1109/HiPC58850.2023.00022
M3 - Conference contribution
AN - SCOPUS:85190583419
T3 - Proceedings - 2023 IEEE 30th International Conference on High Performance Computing, Data, and Analytics, HiPC 2023
SP - 66
EP - 75
BT - Proceedings - 2023 IEEE 30th International Conference on High Performance Computing, Data, and Analytics, HiPC 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 30th Annual IEEE International Conference on High Performance Computing, Data, and Analytics, HiPC 2023
Y2 - 18 December 2023 through 21 December 2023
ER -