Data: Artifacts for "CPU- and GPU-initiated Communication Strategies for Conjugate Gradient Methods on Large GPU Clusters"
Publisher
Zenodo
Abstract
This dataset contains computational artifacts related to the paper “CPU- and GPU-initiated Communication Strategies for Conjugate Gradient Methods on Large GPU Clusters”. The paper describes computational experiments conducted to evaluate the performance of multi-GPU iterative linear solvers based on the conjugate gradient (CG) method.

The computational artifacts are located in several subdirectories:

'aCG-1.0.0/' contains the source code for aCG (version 1.0.0), which implements the various multi-GPU CG solvers used for the performance benchmarks presented in the paper.

'partitions/' contains input files related to partitioning and distributing the matrices used in the experiments. Partitions were computed using METIS (Karypis and Kumar, 1998), a multilevel graph partitioner. Six matrices were selected from the SuiteSparse Matrix Collection (Davis and Hu, 2011): Bump_2911, Cube_Coup_dt6, Flan_1565, Queen_4147, Serena and audikw_1. For each matrix, partitions are provided for 2, 4, 8, 16 and 32 parts.

'scripts/' contains job scripts for submitting jobs on three clusters: LUMI, MareNostrum 5 and Wisteria/BDEC-01 (Aquarius). These scripts carry out performance measurements for the multi-GPU CG solvers in aCG and PETSc, and were used to collect the results presented in the paper.

'results/' contains the results of the performance benchmarks presented in the paper, as tables in a plain-text format.

References

Davis, T. A., and Y. Hu. 2011. “The University of Florida Sparse Matrix Collection”. ACM Transactions on Mathematical Software 38, 1, Article 1 (December 2011), 25 pages. DOI: https://doi.org/10.1145/2049662.2049663

Karypis, G., and V. Kumar. 1998. “A fast and high quality multilevel scheme for partitioning irregular graphs”. SIAM Journal on Scientific Computing 20, 1, pp. 359–392. DOI: https://doi.org/10.1137/S1064827595287997
