Publication:
Mixed and multi-precision SpMV for GPUs with row-wise precision selection

dc.contributor.coauthorKaya, Kamer
dc.contributor.departmentDepartment of Computer Engineering
dc.contributor.departmentGraduate School of Sciences and Engineering
dc.contributor.kuauthorErten, Didem Unat
dc.contributor.kuauthorKoşar, Fahrican
dc.contributor.kuauthorTezcan, Erhan
dc.contributor.kuauthorTorun, Tuğba
dc.contributor.schoolcollegeinstituteCollege of Engineering
dc.contributor.schoolcollegeinstituteGRADUATE SCHOOL OF SCIENCES AND ENGINEERING
dc.date.accessioned2024-11-09T23:27:31Z
dc.date.issued2022
dc.description.abstractSparse Matrix-Vector Multiplication (SpMV) is one of the key memory-bound kernels commonly used in industrial and scientific applications. To improve its data movement and benefit from higher compute rates, there have been several efforts to use mixed precision in SpMV. Most prior work focuses on performing the entire SpMV in single precision within the larger context of an iterative solver (e.g., CG, GMRES). In this work, we are interested in a more fine-grained mixed-precision SpMV, where the level of precision is decided for each element in the matrix to be used in a single operation. We extend an existing entry-wise precision-based approach by deciding precisions per row, motivated by the granularity of parallelism on a GPU, where groups of threads process rows in CSR-based matrices. We propose mixed-precision CSR storage methods with row permutations and describe their greater efficiency and load balancing compared to the existing method. We also consider a multi-precision case where single- and double-precision copies of the matrix are stored in advance, and further extend our mixed-precision SpMV approach to comply with it. As such, we leverage a mixed-precision SpMV to obtain a multi-precision Jacobi method that is faster than, yet almost as accurate as, a double-precision Jacobi implementation, and further evaluate a multi-precision cardiac modeling algorithm. We demonstrate the effectiveness of the proposed SpMV methods on an extensive dataset of real-valued large sparse matrices from the SuiteSparse Matrix Collection using an NVIDIA V100 GPU.
dc.description.indexedbyWOS
dc.description.indexedbyScopus
dc.description.openaccessNO
dc.description.publisherscopeInternational
dc.description.sponsoredbyTubitakEuN/A
dc.description.sponsorshipEuropean High-Performance Joint Undertaking [956213]
dc.description.sponsorshipTUBITAK [120N003]. This work has received funding from the European High-Performance Joint Undertaking under grant agreement no. 956213 and TUBITAK grant no. 120N003.
dc.identifier.doi10.1109/SBAC-PAD55451.2022.00014
dc.identifier.isbn978-1-6654-5155-0
dc.identifier.issn1550-6533
dc.identifier.quartileN/A
dc.identifier.scopus2-s2.0-85145877084
dc.identifier.urihttps://doi.org/10.1109/SBAC-PAD55451.2022.00014
dc.identifier.urihttps://hdl.handle.net/20.500.14288/11731
dc.identifier.wos905612800004
dc.keywordsSpMV
dc.keywordsMixed-precision
dc.keywordsGPU
dc.keywordsCUDA
dc.keywordsMatrix-vector multiplication
dc.keywordsConjugate gradients
dc.keywordsJacobi
dc.keywordsSolvers
dc.keywordsAcceleration
dc.keywordsComputations
dc.keywordsPerformance
dc.keywordsSystems
dc.language.isoeng
dc.publisherIEEE Computer Society
dc.relation.ispartof2022 IEEE 34th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2022)
dc.subjectComputer science
dc.subjectComputer architecture
dc.titleMixed and multi-precision SpMV for GPUs with row-wise precision selection
dc.typeConference Proceeding
dspace.entity.typePublication
local.contributor.kuauthorTezcan, Erhan
local.contributor.kuauthorTorun, Tuğba
local.contributor.kuauthorKoşar, Fahrican
local.contributor.kuauthorErten, Didem Unat
local.publication.orgunit1GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
local.publication.orgunit1College of Engineering
local.publication.orgunit2Department of Computer Engineering
local.publication.orgunit2Graduate School of Sciences and Engineering
relation.isOrgUnitOfPublication89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication3fc31c89-e803-4eb1-af6b-6258bc42c3d8
relation.isOrgUnitOfPublication.latestForDiscovery89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isParentOrgUnitOfPublication8e756b23-2d4a-4ce8-b1b3-62c794a8c164
relation.isParentOrgUnitOfPublication434c9663-2b11-4e66-9399-c863e2ebae43
relation.isParentOrgUnitOfPublication.latestForDiscovery8e756b23-2d4a-4ce8-b1b3-62c794a8c164
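
Illustrative sketch (not part of the record): the abstract describes choosing a precision per row of a CSR matrix and permuting rows so that rows of the same precision are grouped. The plain-Python example below is a minimal CPU sketch of that idea under stated assumptions, not the authors' GPU implementation. The selection rule (a row is computed in single precision only if every entry has magnitude below `threshold`) and all function names are hypothetical; the record does not state the criterion used in the paper. Single precision is emulated by rounding through the standard `struct` module, which raises `OverflowError` for values outside the float32 range.

```python
import struct


def to_f32(x):
    """Round a Python float (double precision) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def select_row_precisions(row_ptr, vals, threshold=1.0):
    """Split row indices into single- and double-precision groups.

    Hypothetical rule (an assumption, not taken from the paper): a row is
    'single' only if all of its stored entries satisfy |v| < threshold.
    """
    single_rows, double_rows = [], []
    for i in range(len(row_ptr) - 1):
        row = vals[row_ptr[i]:row_ptr[i + 1]]
        if all(abs(v) < threshold for v in row):
            single_rows.append(i)
        else:
            double_rows.append(i)
    return single_rows, double_rows


def mixed_spmv(row_ptr, col_idx, vals, x, threshold=1.0):
    """Compute y = A @ x for a CSR matrix with row-wise precision selection."""
    single_rows, double_rows = select_row_precisions(row_ptr, vals, threshold)
    # Row permutation: single-precision rows first, then double-precision rows.
    perm = single_rows + double_rows
    y = [0.0] * (len(row_ptr) - 1)
    x32 = [to_f32(v) for v in x]  # single-precision copy of the input vector

    for i in single_rows:
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            prod = to_f32(to_f32(vals[k]) * x32[col_idx[k]])
            acc = to_f32(acc + prod)  # accumulate in emulated float32
        y[i] = acc

    for i in double_rows:
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[k] * x[col_idx[k]]  # native double precision
        y[i] = acc

    return y, perm


# 3x3 example matrix in CSR form:
# [[0.5, 0.0,    0.25],
#  [0.1, 1000.0, 0.0 ],
#  [0.0, 0.0,    0.75]]
row_ptr = [0, 2, 4, 5]
col_idx = [0, 2, 0, 1, 2]
vals = [0.5, 0.25, 0.1, 1000.0, 0.75]
x = [1.0, 2.0, 4.0]

y, perm = mixed_spmv(row_ptr, col_idx, vals, x)
```

In this example rows 0 and 2 fall into the single-precision group and row 1, which contains the entry 1000.0, stays in double precision. On a GPU, grouping rows this way would let neighbouring thread groups work on rows of the same precision, which is the motivation the abstract gives for the row-wise granularity.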