TY - GEN
T1 - A Deployment Tool for Large Scale Graph Analytics Framework Arachne
AU - Gonzalez-Rivas, Garrett
AU - Du, Zhihui
AU - Bader, David A.
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Data sets have grown exponentially in size, rapidly exceeding the scale at which traditional exploratory data analysis (EDA) tools can be used effectively to analyze real-world graphs. This led to the development of Arachne, a user-friendly tool enabling interactive graph analysis at terabyte scales while using familiar Python code and utilizing a high-performance back-end powered by Chapel that can be run on nearly any *nix-like system. Various disciplines, including biological, information, and social sciences, use large-scale graphs to represent the flow of information through a cell, connections between neurons, interactions between computers, relationships between individuals, etc. However, to take advantage of Arachne, a new user has to go through a long and convoluted installation process, which often takes a week or more to complete, even with assistance from the developers. To support Arachne's mission of being an easy-to-use exploratory graph analytics tool that increases accessibility to high performance computing (HPC) resources, a better deployment experience was needed for users and developers. In this paper, we propose a tool specially designed to greatly simplify the deployment of Arachne for users and offer the ability to rapidly and automatically test the software for compatibility with new releases of its dependencies. The highly portable nature of Arachne necessitates that this deployment tool be able to install and configure the software in diverse combinations of hardware, operating system, initial system environment, and handle evolving packages and libraries in Arachne. The tool was tested in virtual and real-world environments, where its success was evaluated by an improvement to efficiency and productivity for both users and developers. Current results show that the installation and configuration process has been greatly improved, with a significant reduction in the time and effort spent by both users and developers.
AB - Data sets have grown exponentially in size, rapidly exceeding the scale at which traditional exploratory data analysis (EDA) tools can be used effectively to analyze real-world graphs. This led to the development of Arachne, a user-friendly tool enabling interactive graph analysis at terabyte scales while using familiar Python code and utilizing a high-performance back-end powered by Chapel that can be run on nearly any *nix-like system. Various disciplines, including biological, information, and social sciences, use large-scale graphs to represent the flow of information through a cell, connections between neurons, interactions between computers, relationships between individuals, etc. However, to take advantage of Arachne, a new user has to go through a long and convoluted installation process, which often takes a week or more to complete, even with assistance from the developers. To support Arachne's mission of being an easy-to-use exploratory graph analytics tool that increases accessibility to high performance computing (HPC) resources, a better deployment experience was needed for users and developers. In this paper, we propose a tool specially designed to greatly simplify the deployment of Arachne for users and offer the ability to rapidly and automatically test the software for compatibility with new releases of its dependencies. The highly portable nature of Arachne necessitates that this deployment tool be able to install and configure the software in diverse combinations of hardware, operating system, initial system environment, and handle evolving packages and libraries in Arachne. The tool was tested in virtual and real-world environments, where its success was evaluated by an improvement to efficiency and productivity for both users and developers. Current results show that the installation and configuration process has been greatly improved, with a significant reduction in the time and effort spent by both users and developers.
KW - graph analytics
KW - large-scale graph data
KW - open-source framework
KW - software deployment
UR - http://www.scopus.com/inward/record.url?scp=105002714397&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=105002714397&partnerID=8YFLogxK
U2 - 10.1109/HPEC62836.2024.10938515
DO - 10.1109/HPEC62836.2024.10938515
M3 - Conference contribution
AN - SCOPUS:105002714397
T3 - 2024 IEEE High Performance Extreme Computing Conference, HPEC 2024
BT - 2024 IEEE High Performance Extreme Computing Conference, HPEC 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 IEEE High Performance Extreme Computing Conference, HPEC 2024
Y2 - 23 September 2024 through 27 September 2024
ER -