A split execution model for SpTRSV

dc.contributor.coauthor	Yilmaz, Buse
dc.contributor.department	Department of Computer Engineering
dc.contributor.department	Graduate School of Sciences and Engineering
dc.contributor.facultymember	Yes
dc.contributor.kuauthor	Ahmad, Najeeb
dc.contributor.kuauthor	Erten, Didem Unat
dc.contributor.schoolcollegeinstitute	College of Engineering
dc.contributor.schoolcollegeinstitute	Graduate School of Sciences and Engineering
dc.date.accessioned	2024-11-09T23:47:52Z
dc.date.issued	2021
dc.description.abstract	Sparse Triangular Solve (SpTRSV) is an important and extensively used kernel in scientific computing. Parallelism within SpTRSV depends on the matrix sparsity pattern and, in many cases, is non-uniform from one computational step to the next. When the SpTRSV computational steps have contrasting parallelism characteristics (some steps are more parallel, others more sequential in nature), this contrast may limit the performance of an SpTRSV algorithm. In this work, we propose a split-execution model for SpTRSV that automatically divides the SpTRSV computation into two sub-SpTRSV systems and an SpMV, such that one of the sub-SpTRSVs has more parallelism than the other. The two sub-SpTRSVs are then computed using different SpTRSV algorithms, which may be executed on different platforms (CPU or GPU). By analyzing the SpTRSV Directed Acyclic Graph (DAG) and matrix sparsity features, we use a heuristics-based approach to (i) automatically determine the suitability of an SpTRSV for split-execution, (ii) find the appropriate split-point, and (iii) execute SpTRSV in a split fashion using two SpTRSV algorithms while managing any required inter-platform communication. Experimental evaluation of the execution model on two CPU-GPU machines with a dataset of 327 matrices from the SuiteSparse Matrix Collection shows that our approach correctly selects the fastest SpTRSV method (split or unsplit) for 88 percent of matrices on the Intel Xeon Gold (6148) + NVIDIA Tesla V100 platform and 83 percent on the Intel Core i7 + NVIDIA GTX 1080 Ti platform, achieving speedups of up to 10x and 6.36x, respectively.
dc.description.fulltext	No
dc.description.harvestedfrom	Manual
dc.description.indexedby	WOS
dc.description.indexedby	Scopus
dc.description.openaccess	NO
dc.description.peerreviewstatus	N/A
dc.description.publisherscope	International
dc.description.readpublish	N/A
dc.description.sponsoredbyTubitakEu	N/A
dc.description.sponsorship	Aramco Overseas Company
dc.description.sponsorship	Saudi Aramco
dc.description.sponsorship	The authors would like to thank Aramco Overseas Company and Saudi Aramco for funding this work.
dc.description.version	N/A
dc.identifier.doi	10.1109/TPDS.2021.3074501
dc.identifier.eissn	1558-2183
dc.identifier.embargo	N/A
dc.identifier.issn	1045-9219
dc.identifier.quartile	Q1
dc.identifier.scopus	2-s2.0-85104653673
dc.identifier.uri	https://doi.org/10.1109/TPDS.2021.3074501
dc.identifier.uri	https://hdl.handle.net/20.500.14288/14188
dc.identifier.wos	655244100005
dc.keywords	Sparse matrices
dc.keywords	Parallel algorithms
dc.keywords	Computational modeling
dc.keywords	Kernel
dc.keywords	Graphics processing units
dc.keywords	Fats
dc.keywords	Phased arrays
dc.keywords	Sparse triangular solve
dc.keywords	CPU-GPU computing
dc.keywords	Heterogeneous computing
dc.keywords	Sparse linear systems
dc.keywords	SpTRSV
dc.keywords	SpTS
dc.language.iso	eng
dc.publisher	IEEE Computer Society
dc.relation.affiliation	Koç University
dc.relation.collection	Koç University Institutional Repository
dc.relation.ispartof	IEEE Transactions on Parallel and Distributed Systems
dc.relation.openaccess	N/A
dc.rights	N/A
dc.subject	Computer science, theory and methods
dc.subject	Engineering, electrical and electronic
dc.title	A split execution model for SpTRSV
dc.type	Journal Article
dspace.entity.type	Publication
local.contributor.kuauthor	Ahmad, Najeeb
local.contributor.kuauthor	Erten, Didem Unat
relation.isOrgUnitOfPublication	89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication	3fc31c89-e803-4eb1-af6b-6258bc42c3d8
relation.isOrgUnitOfPublication.latestForDiscovery	89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isParentOrgUnitOfPublication	8e756b23-2d4a-4ce8-b1b3-62c794a8c164
relation.isParentOrgUnitOfPublication	434c9663-2b11-4e66-9399-c863e2ebae43
relation.isParentOrgUnitOfPublication.latestForDiscovery	8e756b23-2d4a-4ce8-b1b3-62c794a8c164
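
The decomposition named in the abstract (two sub-SpTRSVs plus an SpMV) follows from the standard block form of a lower-triangular system: with L = [[L11, 0], [L21, L22]], solve L11 x1 = b1, update b2' = b2 - L21 x1, then solve L22 x2 = b2'. The sketch below only illustrates that algebra. The function names, the row-wise (column, value) storage, and the hand-picked split point are assumptions made for illustration; the paper's split-point heuristic, its choice of SpTRSV algorithms, and its CPU/GPU placement are not reproduced here, and both sub-solves simply run the same sequential routine.

```python
def forward_substitution(rows, b):
    """Solve L x = b for a lower-triangular L.

    rows[i] is a list of (column, value) pairs for row i, with
    column <= i and a nonzero diagonal entry present (assumed layout).
    """
    x = [0.0] * len(b)
    for i, row in enumerate(rows):
        acc = b[i]
        diag = None
        for j, v in row:
            if j == i:
                diag = v
            else:
                acc -= v * x[j]  # x[j] is already known because j < i
        x[i] = acc / diag
    return x


def split_solve(rows, b, split):
    """Solve L x = b as two sub-solves joined by a sparse mat-vec.

    `split` is a hypothetical, caller-chosen row index; rows before it
    form L11, rows from it onward hold L21 (columns < split) and L22.
    """
    # Sub-SpTRSV 1: L11 x1 = b1
    x1 = forward_substitution(rows[:split], b[:split])

    # SpMV: b2' = b2 - L21 x1, while extracting L22 with shifted columns
    b2 = list(b[split:])
    l22 = []
    for k, row in enumerate(rows[split:]):
        local = []
        for j, v in row:
            if j < split:
                b2[k] -= v * x1[j]
            else:
                local.append((j - split, v))
        l22.append(local)

    # Sub-SpTRSV 2: L22 x2 = b2'
    x2 = forward_substitution(l22, b2)
    return x1 + x2


# Small 4x4 example; rows 0 and 1 are independent of each other,
# rows 2 and 3 depend on earlier rows.
rows = [
    [(0, 2.0)],
    [(1, 1.0)],
    [(0, 1.0), (2, 4.0)],
    [(1, 3.0), (2, 1.0), (3, 2.0)],
]
b = [2.0, 3.0, 9.0, 15.0]
print(split_solve(rows, b, 2))
```

Because the two sub-solves communicate only through x1 and the updated right-hand side, each could in principle be handed to a different solver or device, which is the property the split-execution model relies on.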
