Publication:
Monitoring collective communication among GPUs

dc.contributor.coauthor: N/A
dc.contributor.department: N/A
dc.contributor.department: N/A
dc.contributor.department: N/A
dc.contributor.department: Department of Computer Engineering
dc.contributor.kuauthor: Soytürk, Muhammet Abdullah
dc.contributor.kuauthor: Akhtar, Palwisha
dc.contributor.kuauthor: Tezcan, Erhan
dc.contributor.kuauthor: Erten, Didem Unat
dc.contributor.kuprofile: PhD Student
dc.contributor.kuprofile: Master Student
dc.contributor.kuprofile: Master Student
dc.contributor.kuprofile: Faculty Member
dc.contributor.other: Department of Computer Engineering
dc.contributor.schoolcollegeinstitute: Graduate School of Sciences and Engineering
dc.contributor.schoolcollegeinstitute: Graduate School of Sciences and Engineering
dc.contributor.schoolcollegeinstitute: Graduate School of Sciences and Engineering
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.yokid: N/A
dc.contributor.yokid: N/A
dc.contributor.yokid: N/A
dc.contributor.yokid: 219274
dc.date.accessioned: 2024-11-09T23:28:07Z
dc.date.issued: 2022
dc.description.abstract: Communication among devices in multi-GPU systems plays an important role in performance and scalability. To optimize an application, programmers need to know the type and amount of communication happening among GPUs. Although prior works gather this information for MPI applications on distributed systems and for multi-threaded applications on shared-memory systems, there is no tool that identifies communication among GPUs. Our prior work, ComScribe, presents a point-to-point (P2P) communication detection tool for GPUs sharing a common host. In this work, we extend ComScribe to identify communication among GPUs for both collective and P2P communication primitives in NVIDIA's NCCL library. In addition to P2P communication, collective communication is commonly used in HPC and AI workloads, so it is important to monitor the data movement induced by collectives. Our tool extracts the size and frequency of data transfers in an application and visualizes them as a communication matrix. To demonstrate the tool in action, we present communication matrices and statistics for two applications from the machine translation and image classification domains.
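The abstract describes intercepting NCCL collective and P2P calls and recording the size and frequency of the resulting transfers. As a rough illustration only, and not the tool presented in the paper, the following C sketch shows how such counts could be gathered for one primitive (ncclAllReduce) with an LD_PRELOAD shim; the file name, build line, and counters are assumptions for the sake of the example.

/*
 * Hypothetical sketch (not the authors' implementation): an LD_PRELOAD shim
 * that wraps ncclAllReduce, counts how often it is called and how many bytes
 * each call moves, and forwards the call to the real NCCL library. A full
 * monitor would wrap the other collective and P2P primitives as well and
 * aggregate the counts into a per-GPU-pair communication matrix.
 *
 * Build (assumed): gcc -shared -fPIC nccl_spy.c -o libnccl_spy.so -ldl
 * Run   (assumed): LD_PRELOAD=./libnccl_spy.so ./multi_gpu_app
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <stddef.h>
#include <cuda_runtime.h>
#include <nccl.h>

typedef ncclResult_t (*allreduce_fn)(const void*, void*, size_t,
                                     ncclDataType_t, ncclRedOp_t,
                                     ncclComm_t, cudaStream_t);

/* Payload size of one element for the common NCCL data types. */
static size_t element_bytes(ncclDataType_t t) {
    switch (t) {
    case ncclInt8:  case ncclUint8:                     return 1;
    case ncclFloat16:                                   return 2;
    case ncclInt32: case ncclUint32: case ncclFloat32:  return 4;
    case ncclInt64: case ncclUint64: case ncclFloat64:  return 8;
    default:                                            return 0; /* unknown */
    }
}

static unsigned long long n_calls = 0;   /* intercepted AllReduce calls  */
static unsigned long long n_bytes = 0;   /* total payload bytes observed */

ncclResult_t ncclAllReduce(const void* sendbuff, void* recvbuff, size_t count,
                           ncclDataType_t datatype, ncclRedOp_t op,
                           ncclComm_t comm, cudaStream_t stream) {
    /* Resolve the real NCCL symbol once, then forward every call to it. */
    static allreduce_fn real_allreduce = NULL;
    if (!real_allreduce)
        real_allreduce = (allreduce_fn)dlsym(RTLD_NEXT, "ncclAllReduce");

    n_calls += 1;
    n_bytes += (unsigned long long)(count * element_bytes(datatype));
    fprintf(stderr, "[nccl-spy] AllReduce #%llu: %zu elements, %llu bytes total\n",
            n_calls, count, n_bytes);

    return real_allreduce(sendbuff, recvbuff, count, datatype, op, comm, stream);
}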
dc.description.indexedby: WoS
dc.description.indexedby: Scopus
dc.description.openaccess: YES
dc.description.sponsorship: Scientific and Technological Research Council of Turkey (TUBITAK) [120E492]
dc.description.sponsorship: Royal Society-Newton Advanced Fellowship. The work is supported by the Scientific and Technological Research Council of Turkey (TUBITAK), Grant no. 120E492. Dr. Didem Unat is supported by the Royal Society-Newton Advanced Fellowship.
dc.description.volume: 13098
dc.identifier.doi: 10.1007/978-3-031-06156-1_4
dc.identifier.eissn: 1611-3349
dc.identifier.isbn: 978-3-031-06156-1
dc.identifier.isbn: 978-3-031-06155-4
dc.identifier.issn: 0302-9743
dc.identifier.scopus: 2-s2.0-85133002081
dc.identifier.uri: http://dx.doi.org/10.1007/978-3-031-06156-1_4
dc.identifier.uri: https://hdl.handle.net/20.500.14288/11834
dc.identifier.wos: 851509300004
dc.keywords: Inter-GPU communication
dc.keywords: Multi-GPUs
dc.keywords: Profiling
dc.language: English
dc.publisher: Springer International Publishing AG
dc.source: Euro-Par 2021: Parallel Processing Workshops
dc.subject: Computer science
dc.title: Monitoring collective communication among GPUs
dc.type: Conference proceeding
dspace.entity.type: Publication
local.contributor.authorid: 0000-0002-2880-0857
local.contributor.authorid: 0000-0003-0279-031X
local.contributor.authorid: 0000-0001-5129-4166
local.contributor.authorid: 0000-0002-2351-0770
local.contributor.kuauthor: Soytürk, Muhammet Abdullah
local.contributor.kuauthor: Akhtar, Palwisha
local.contributor.kuauthor: Tezcan, Erhan
local.contributor.kuauthor: Erten, Didem Unat
relation.isOrgUnitOfPublication: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication.latestForDiscovery: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
