Publication:
A computational-graph partitioning method for training memory-constrained DNNs

dc.contributor.coauthor: Wahib, Mohamed
dc.contributor.coauthor: Dikbayir, Doga
dc.contributor.coauthor: Belviranli, Mehmet Esat
dc.contributor.department: N/A
dc.contributor.department: Department of Computer Engineering
dc.contributor.kuauthor: Qararyah, Fareed Mohammad
dc.contributor.kuauthor: Erten, Didem Unat
dc.contributor.kuprofile: PhD Student
dc.contributor.kuprofile: Faculty Member
dc.contributor.other: Department of Computer Engineering
dc.contributor.schoolcollegeinstitute: Graduate School of Sciences and Engineering
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.yokid: N/A
dc.contributor.yokid: 219274
dc.date.accessioned: 2024-11-09T22:56:38Z
dc.date.issued: 2021
dc.description.abstract: Many state-of-the-art Deep Neural Networks (DNNs) have substantial memory requirements. Limited device memory becomes a bottleneck when training such models. We propose ParDNN, an automatic, generic, and non-intrusive partitioning strategy for DNNs represented as computational graphs. ParDNN decides a placement of a DNN's underlying computational-graph operations across multiple devices so that the devices' memory constraints are met and the training time is minimized. ParDNN is completely independent of the deep learning aspects of a DNN. It requires no modification to the model or to the systems-level implementation of its operation kernels. ParDNN partitions DNNs with billions of parameters and hundreds of thousands of operations in seconds to a few minutes. Our experiments with TensorFlow on 16 GPUs demonstrate efficient training of five very large models while achieving superlinear scaling for both batch size and training throughput. ParDNN either outperforms or qualitatively improves upon the related work.
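For intuition only, the following minimal Python sketch illustrates the kind of memory-constrained device placement the abstract describes: operations of a computational graph are greedily assigned to devices so that each device's memory budget is respected while compute load stays balanced. This is an illustrative toy, not ParDNN's actual algorithm; all names (Op, place_ops, mem_limits) are hypothetical.

    # Illustrative sketch only (not the ParDNN algorithm): greedy placement of
    # computational-graph operations onto devices under per-device memory budgets.
    from dataclasses import dataclass

    @dataclass
    class Op:
        name: str
        mem_bytes: int   # estimated memory footprint of the operation
        cost: float      # estimated compute time

    def place_ops(ops, mem_limits):
        """Assign each op to the least-loaded device that still has memory room."""
        used_mem = [0] * len(mem_limits)
        load = [0.0] * len(mem_limits)
        placement = {}
        # Place the most memory-hungry operations first.
        for op in sorted(ops, key=lambda o: o.mem_bytes, reverse=True):
            candidates = [d for d in range(len(mem_limits))
                          if used_mem[d] + op.mem_bytes <= mem_limits[d]]
            if not candidates:
                raise RuntimeError(f"no device can hold {op.name}")
            d = min(candidates, key=lambda i: load[i])  # balance compute load
            placement[op.name] = d
            used_mem[d] += op.mem_bytes
            load[d] += op.cost
        return placement

    # Example: three ops placed across two devices with 8 GB budgets each.
    ops = [Op("matmul_1", 4 * 2**30, 2.0),
           Op("conv_1", 3 * 2**30, 1.5),
           Op("relu_1", 2**20, 0.1)]
    print(place_ops(ops, mem_limits=[8 * 2**30, 8 * 2**30]))

Unlike this greedy toy, the paper's approach also accounts for the graph's dependency structure when minimizing training time; the sketch only conveys the memory-budgeted placement idea.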
dc.description.indexedby: WoS
dc.description.indexedby: Scopus
dc.description.openaccess: YES
dc.description.publisherscope: International
dc.description.sponsorship: Turkish Science and Technology Research Centre [118E801]
dc.description.sponsorship: JST-CREST [JPMJCR19F5]
dc.description.sponsorship: Research Council of Norway [270053]. Authors from Koc University are supported by the Turkish Science and Technology Research Centre, Grant No. 118E801. This work was partially supported by JST-CREST under Grant Number JPMJCR19F5. The research presented in this paper has benefited from the Experimental Infrastructure for Exploration of Exascale Computing (eX3), which is financially supported by the Research Council of Norway under contract 270053.
dc.description.volume: 104
dc.identifier.doi: 10.1016/j.parco.2021.102792
dc.identifier.eissn: 1872-7336
dc.identifier.issn: 0167-8191
dc.identifier.quartile: Q2
dc.identifier.scopus: 2-s2.0-85105319626
dc.identifier.uri: http://dx.doi.org/10.1016/j.parco.2021.102792
dc.identifier.uri: https://hdl.handle.net/20.500.14288/7414
dc.identifier.wos: 654719400005
dc.keywords: DNN
dc.keywords: Graph partitioning
dc.keywords: Model parallelism
dc.language: English
dc.publisher: Elsevier
dc.source: Parallel Computing
dc.subject: Computer science
dc.title: A computational-graph partitioning method for training memory-constrained DNNs
dc.type: Journal Article
dspace.entity.type: Publication
local.contributor.authorid: 0000-0002-3955-2836
local.contributor.authorid: 0000-0002-2351-0770
local.contributor.kuauthor: Qararyah, Fareed Mohammad
local.contributor.kuauthor: Erten, Didem Unat
relation.isOrgUnitOfPublication: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication.latestForDiscovery: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
