Publication:
Mixed and multi-precision SpMV for GPUs with row-wise precision selection

dc.contributor.coauthor: Kaya, Kamer
dc.contributor.department: Department of Computer Engineering
dc.contributor.department: Department of Computer Engineering
dc.contributor.kuauthor: Tezcan, Erhan
dc.contributor.kuauthor: Torun, Tuğba
dc.contributor.kuauthor: Koşar, Fahrican
dc.contributor.kuauthor: Erten, Didem Unat
dc.contributor.kuprofile: Master Student
dc.contributor.kuprofile: Researcher
dc.contributor.kuprofile: Master Student
dc.contributor.kuprofile: Faculty Member
dc.contributor.schoolcollegeinstitute: Graduate School of Sciences and Engineering
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.yokid: N/A
dc.contributor.yokid: N/A
dc.contributor.yokid: N/A
dc.contributor.yokid: 219274
dc.date.accessioned: 2024-11-09T23:27:31Z
dc.date.issued: 2022
dc.description.abstract: Sparse Matrix-Vector Multiplication (SpMV) is one of the key memory-bound kernels commonly used in industrial and scientific applications. To improve its data movement and benefit from higher compute rates, there have been several efforts to use mixed precision in SpMV. Most prior work focuses on performing the entire SpMV in single precision within the larger context of an iterative solver (e.g., CG, GMRES). In this work, we are interested in a more fine-grained mixed-precision SpMV, where the level of precision is decided for each element in the matrix to be used in a single operation. We extend an existing entry-wise precision-based approach by deciding precisions per row, motivated by the granularity of parallelism on a GPU, where groups of threads process rows in CSR-based matrices. We propose mixed-precision CSR storage methods with row permutations and describe their greater efficiency and load balancing compared to the existing method. We also consider a multi-precision case where single- and double-precision copies of the matrix are stored in advance, and further extend our mixed-precision SpMV approach to comply with it. As such, we leverage a mixed-precision SpMV to obtain a multi-precision Jacobi method that is faster than, yet almost as accurate as, a double-precision Jacobi implementation, and further evaluate a multi-precision cardiac modeling algorithm. We demonstrate the effectiveness of the proposed SpMV methods on an extensive dataset of real-valued large sparse matrices from the SuiteSparse Matrix Collection using an NVIDIA V100 GPU.
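The row-wise idea in the abstract can be illustrated with a small CPU-side sketch. This is not the authors' implementation: the selection rule (a row is kept in single precision only if every entry survives rounding to float32 within a relative tolerance `tol`), the helper names `to_f32`, `classify_rows`, and `spmv_rowwise`, and the example matrix are all assumptions made for illustration. Float32 arithmetic is emulated in pure Python by rounding through `struct`.

```python
import struct


def to_f32(x):
    """Round a Python float (IEEE double) to the nearest float32 value."""
    try:
        return struct.unpack("f", struct.pack("f", x))[0]
    except OverflowError:
        # Magnitude exceeds the float32 range.
        return float("inf") if x > 0 else float("-inf")


def classify_rows(row_ptr, vals, tol):
    """Hypothetical selection rule (an assumption, not the paper's):
    a row is flagged single precision when every entry changes by at
    most `tol` (relative) after rounding to float32."""
    flags = []
    for r in range(len(row_ptr) - 1):
        row = vals[row_ptr[r]:row_ptr[r + 1]]
        flags.append(all(abs(to_f32(v) - v) <= tol * abs(v) for v in row))
    return flags


def spmv_rowwise(row_ptr, col_idx, vals, x, single_row):
    """CSR SpMV where each row is accumulated entirely in emulated
    float32 or entirely in double, according to its flag."""
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            a, xv = vals[k], x[col_idx[k]]
            if single_row[r]:
                # Round operands, product, and running sum to float32.
                prod = to_f32(to_f32(a) * to_f32(xv))
                acc = to_f32(acc + prod)
            else:
                acc += a * xv
        y.append(acc)
    return y


# 3x3 example in CSR form:
# [[1.0, 0.0, 2.0],
#  [0.0, 0.1, 0.0],
#  [0.5, 0.0, 4.0]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
vals = [1.0, 2.0, 0.1, 0.5, 4.0]
x = [1.0, 10.0, 3.0]

flags = classify_rows(row_ptr, vals, tol=1e-9)
y = spmv_rowwise(row_ptr, col_idx, vals, x, flags)
```

In this example only the middle row is kept in double precision, because 0.1 is not exactly representable in float32. On a GPU, rows with the same flag would presumably be grouped (for instance by a row permutation, as the abstract mentions) so that each group of threads runs a single-precision or double-precision code path without divergence.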
dc.description.indexedby: WoS
dc.description.indexedby: Scopus
dc.description.openaccess: NO
dc.description.publisherscope: International
dc.description.sponsorship: European High-Performance Joint Undertaking [956213]
dc.description.sponsorship: TUBITAK [120N003]. This work has received funding from the European High-Performance Joint Undertaking under grant agreement no. 956213 and TUBITAK grant no. 120N003.
dc.identifier.doi: 10.1109/SBAC-PAD55451.2022.00014
dc.identifier.isbn: 978-1-6654-5155-0
dc.identifier.issn: 1550-6533
dc.identifier.quartile: N/A
dc.identifier.scopus: 2-s2.0-85145877084
dc.identifier.uri: http://dx.doi.org/10.1109/SBAC-PAD55451.2022.00014
dc.identifier.uri: https://hdl.handle.net/20.500.14288/11731
dc.identifier.wos: 905612800004
dc.keywords: SPMV
dc.keywords: Mixed-precision
dc.keywords: GPU
dc.keywords: Cuda
dc.keywords: Matrix-vector multiplication
dc.keywords: Conjugate gradients
dc.keywords: Jacobi
dc.keywords: Solvers
dc.keywords: Acceleration
dc.keywords: Computations
dc.keywords: Performance
dc.keywords: Systems
dc.language: English
dc.publisher: IEEE Computer Society
dc.source: 2022 IEEE 34th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2022)
dc.subject: Computer science
dc.subject: Computer architecture
dc.title: Mixed and multi-precision SpMV for GPUs with row-wise precision selection
dc.type: Conference proceeding
dspace.entity.type: Publication
local.contributor.authorid: 0000-0001-5129-4166
local.contributor.authorid: 0000-0001-6790-7094
local.contributor.authorid: 0000-0002-9485-5610
local.contributor.authorid: 0000-0002-2351-0770
local.contributor.kuauthor: Tezcan, Erhan
local.contributor.kuauthor: Torun, Tuğba
local.contributor.kuauthor: Koşar, Fahrican
local.contributor.kuauthor: Erten, Didem Unat
relation.isOrgUnitOfPublication: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication.latestForDiscovery: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
