Publication: A prediction framework for fast sparse triangular solves
dc.contributor.department | N/A | |
dc.contributor.department | N/A | |
dc.contributor.department | Department of Computer Engineering | |
dc.contributor.kuauthor | Ahmad, Najeeb | |
dc.contributor.kuauthor | Yılmaz, Buse | |
dc.contributor.kuauthor | Erten, Didem Unat | |
dc.contributor.kuprofile | PhD Student | |
dc.contributor.kuprofile | N/A | |
dc.contributor.kuprofile | Faculty Member | |
dc.contributor.other | Department of Computer Engineering | |
dc.contributor.schoolcollegeinstitute | Graduate School of Sciences and Engineering | |
dc.contributor.schoolcollegeinstitute | N/A | |
dc.contributor.schoolcollegeinstitute | College of Engineering | |
dc.contributor.yokid | N/A | |
dc.contributor.yokid | N/A | |
dc.contributor.yokid | 219274 | |
dc.date.accessioned | 2024-11-09T23:14:05Z | |
dc.date.issued | 2020 | |
dc.description.abstract | Sparse triangular solve (SpTRSV) is an important linear algebra kernel, used extensively in numerical and scientific computing. Parallelizing SpTRSV is challenging because of the sequential nature of the steps involved. This makes it, in many cases, one of the most time-consuming operations in an application. Many approaches for efficient SpTRSV on CPU and GPU systems have been proposed in the literature. However, no single implementation or platform (CPU or GPU) gives the fastest solution for all input sparse matrices. In this work, we propose a machine learning-based framework that predicts, from the structural features of a given sparse matrix, the SpTRSV implementation with the fastest execution time. The framework is tested with six SpTRSV implementations on a state-of-the-art CPU-GPU machine (Intel Xeon Gold CPU, NVIDIA V100 GPU). Experimental results, with 998 matrices taken from the SuiteSparse Matrix Collection, show a classifier prediction accuracy of 87% for the fastest SpTRSV algorithm for a given input matrix. Predicted SpTRSV implementations achieve average speedups (harmonic mean) in the range of 1.4-2.7x against the six SpTRSV implementations used in the evaluation. | |
dc.description.indexedby | WoS | |
dc.description.indexedby | Scopus | |
dc.description.openaccess | NO | |
dc.description.sponsorship | Aramco Overseas Company | |
dc.description.sponsorship | SaudiAramco. The authors would like to thank Aramco Overseas Company and SaudiAramco for funding this research. | |
dc.description.volume | 12247 | |
dc.identifier.doi | 10.1007/978-3-030-57675-2_33 | |
dc.identifier.eissn | 1611-3349 | |
dc.identifier.isbn | 978-3-030-57675-2 | |
dc.identifier.isbn | 978-3-030-57674-5 | |
dc.identifier.issn | 0302-9743 | |
dc.identifier.scopus | 2-s2.0-85090094281 | |
dc.identifier.uri | http://dx.doi.org/10.1007/978-3-030-57675-2_33 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14288/10091 | |
dc.identifier.wos | 851325900033 | |
dc.keywords | Performance prediction | |
dc.keywords | Sparse triangular solve | |
dc.keywords | Heterogeneous systems | |
dc.keywords | Performance autotuning parallel solution | |
dc.language | English | |
dc.publisher | Springer International Publishing AG | |
dc.source | Euro-Par 2020: Parallel Processing | |
dc.subject | Computer science | |
dc.subject | Hardware architecture | |
dc.subject | Engineering | |
dc.subject | Software engineering | |
dc.title | A prediction framework for fast sparse triangular solves | |
dc.type | Conference proceeding | |
dspace.entity.type | Publication | |
local.contributor.authorid | 0000-0002-3460-1256 | |
local.contributor.authorid | N/A | |
local.contributor.authorid | 0000-0002-2351-0770 | |
local.contributor.kuauthor | Ahmad, Najeeb | |
local.contributor.kuauthor | Yılmaz, Buse | |
local.contributor.kuauthor | Erten, Didem Unat | |
relation.isOrgUnitOfPublication | 89352e43-bf09-4ef4-82f6-6f9d0174ebae | |
relation.isOrgUnitOfPublication.latestForDiscovery | 89352e43-bf09-4ef4-82f6-6f9d0174ebae |
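
Illustrative note on the abstract: the "sequential nature of the steps involved" can be seen in a plain forward-substitution solve, where each unknown depends on unknowns computed in earlier rows. The sketch below is a minimal serial example under stated assumptions. It is not one of the six implementations evaluated in the paper, and the function names (sptrsv_lower_csr, harmonic_mean) and the CSR argument names are hypothetical. The second helper shows how a harmonic-mean average of speedups, the averaging method named in the abstract, is computed.

```python
def sptrsv_lower_csr(row_ptr, col_idx, values, b):
    """Solve L x = b by forward substitution.

    Assumes L is lower triangular, stored in CSR form, with a nonzero
    diagonal entry in every row. Names are illustrative only.
    """
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        acc = b[i]
        diag = None
        for k in range(row_ptr[i], row_ptr[i + 1]):
            j = col_idx[k]
            if j < i:
                # Row i needs x[j] from an earlier row: this dependency
                # is what limits parallelism in SpTRSV.
                acc -= values[k] * x[j]
            elif j == i:
                diag = values[k]
        if diag is None or diag == 0:
            raise ValueError(f"row {i} has no usable diagonal entry")
        x[i] = acc / diag
    return x


def harmonic_mean(speedups):
    """Harmonic mean of a non-empty list of positive speedup ratios."""
    return len(speedups) / sum(1.0 / s for s in speedups)
```

For example, the 3x3 matrix with rows [2, 0, 0], [1, 1, 0], [0, 3, 4] is passed as row_ptr = [0, 1, 3, 5], col_idx = [0, 0, 1, 1, 2], values = [2, 1, 1, 3, 4]. Rows that share no dependency can be solved concurrently, which is why the best implementation depends on the structure of the matrix.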