Publication:
Fast multidimensional reduction and broadcast operations on GPU for machine learning

Abstract

Reduction and broadcast operations are commonly used in machine learning algorithms for different purposes. They appear widely in the computation of the gradient values of a loss function, one of the core computations in neural networks. Many libraries implement both operations naively, usually only for scalar reduction or broadcast; to our knowledge, however, no optimized multidimensional implementations are available. This limits the performance of machine learning models that require these operations to be performed on tensors. In this work, we address the problem and propose two new strategies that extend the existing implementations to operate on tensors. We introduce formal definitions of both operations using tensor notation, investigate their mathematical properties, and exploit these properties to provide an efficient solution for each. We implement our parallel strategies and test them on a CUDA-enabled Tesla K40m GPU accelerator. Our implementations achieve up to 75% of the peak device memory bandwidth across different tensor sizes and dimensions, and they also achieve significant speedups over the implementations available in the Knet deep learning framework for both operations.
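
The abstract summarizes the technique without showing code. As a rough illustration of what an axis-wise (multidimensional) reduction and its dual broadcast look like in CUDA, the sketch below sums a row-major M x N matrix over its columns and then broadcasts the resulting vector back across the columns. This is a minimal sketch under assumed names (reduce_rows, broadcast_rows are hypothetical), not the paper's optimized kernels; in particular, the serial inner loop is deliberately naive and does not reflect the bandwidth-oriented strategies the paper proposes.

// Illustrative sketch only: hypothetical kernel names, naive access pattern.
#include <cstdio>
#include <cuda_runtime.h>

// Reduce a row-major M x N matrix over its columns: out[i] = sum_j in[i][j].
__global__ void reduce_rows(const float* in, float* out, int M, int N) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= M) return;
    float acc = 0.0f;
    for (int col = 0; col < N; ++col)  // serial inner loop: simple,
        acc += in[row * N + col];      // but not bandwidth-optimal
    out[row] = acc;
}

// Broadcast is the dual operation: expand an M-element vector back
// over the columns of an M x N matrix, out[i][j] = vec[i].
__global__ void broadcast_rows(const float* vec, float* out, int M, int N) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= M * N) return;
    out[idx] = vec[idx / N];           // every column reuses the row value
}

int main() {
    const int M = 4, N = 8;
    float h_in[M * N], h_sum[M];
    for (int i = 0; i < M * N; ++i) h_in[i] = 0.5f * i;

    float *d_in, *d_sum, *d_bcast;
    cudaMalloc(&d_in, M * N * sizeof(float));
    cudaMalloc(&d_sum, M * sizeof(float));
    cudaMalloc(&d_bcast, M * N * sizeof(float));
    cudaMemcpy(d_in, h_in, M * N * sizeof(float), cudaMemcpyHostToDevice);

    reduce_rows<<<(M + 255) / 256, 256>>>(d_in, d_sum, M, N);
    broadcast_rows<<<(M * N + 255) / 256, 256>>>(d_sum, d_bcast, M, N);

    cudaMemcpy(h_sum, d_sum, M * sizeof(float), cudaMemcpyDeviceToHost);
    for (int i = 0; i < M; ++i) printf("row %d sum = %.1f\n", i, h_sum[i]);

    cudaFree(d_in); cudaFree(d_sum); cudaFree(d_bcast);
    return 0;
}

Generalizing this to arbitrary tensor dimensions and reduction axes, while keeping memory accesses coalesced, is exactly the problem the paper addresses.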

Publisher

Wiley

Subject

Computer science, Software engineering; Computer science, Theory and methods

Source

Concurrency and Computation: Practice and Experience

DOI

10.1002/cpe.4691
