Publication: TiDA: High-level programming abstractions for data locality management
dc.contributor.coauthor | Nguyen, Tan | |
dc.contributor.coauthor | Zhang, Weiqun | |
dc.contributor.coauthor | Michelogiannakis, George | |
dc.contributor.coauthor | Almgren, Ann | |
dc.contributor.coauthor | Shalf, John | |
dc.contributor.department | N/A | |
dc.contributor.department | Department of Computer Engineering | |
dc.contributor.department | N/A | |
dc.contributor.kuauthor | Farooqi, Muhammad Nufail | |
dc.contributor.kuauthor | Erten, Didem Unat | |
dc.contributor.kuauthor | Bastem, Burak | |
dc.contributor.kuprofile | PhD Student | |
dc.contributor.kuprofile | Faculty Member | |
dc.contributor.kuprofile | Master Student | |
dc.contributor.other | Department of Computer Engineering | |
dc.contributor.schoolcollegeinstitute | Graduate School of Sciences and Engineering | |
dc.contributor.schoolcollegeinstitute | College of Engineering | |
dc.contributor.schoolcollegeinstitute | Graduate School of Sciences and Engineering | |
dc.contributor.yokid | N/A | |
dc.contributor.yokid | 219275 | |
dc.contributor.yokid | N/A | |
dc.date.accessioned | 2024-11-09T23:12:32Z | |
dc.date.issued | 2016 | |
dc.description.abstract | The high energy cost of data movement compared to computation gives paramount importance to data locality management in programs. Managing data locality manually is not a trivial task and also complicates programming. Tiling is a well-known approach that provides both data locality and parallelism in an application. However, there is no standard programming construct to express tiling at the application level. We have developed a multicore programming model, TiDA, based on tiling and implemented the model as C++ and Fortran libraries. The proposed programming model has three high-level abstractions: tiles, regions, and the tile iterator. These abstractions in the library hide the details of data decomposition, cache locality optimizations, and memory affinity management in the application. In this paper, we unveil the internals of the library and demonstrate the performance and programmability advantages of the model on five applications on multiple NUMA nodes. The library achieves up to 2.10x speedup over OpenMP in a single compute node for simple kernels, and up to 22x improvement over a single thread for a more complex combustion proxy application (SMC) on 24 cores. The MPI+TiDA implementation of geometric multigrid demonstrates a 30.9% performance improvement over MPI+OpenMP when scaling to 3072 cores (excluding MPI communication overheads, 8.5% otherwise). | |
dc.description.indexedby | WoS | |
dc.description.indexedby | Scopus | |
dc.description.openaccess | NO | |
dc.description.volume | 9697 | |
dc.identifier.doi | 10.1007/978-3-319-41321-1_7 | |
dc.identifier.eissn | 1611-3349 | |
dc.identifier.isbn | 978-3-319-41321-1 | |
dc.identifier.isbn | 978-3-319-41320-4 | |
dc.identifier.issn | 0302-9743 | |
dc.identifier.scopus | 2-s2.0-84977591727 | |
dc.identifier.uri | http://dx.doi.org/10.1007/978-3-319-41321-1_7 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14288/9833 | |
dc.identifier.wos | 386513900007 | |
dc.language | English | |
dc.publisher | Springer International Publishing AG | |
dc.source | High Performance Computing | |
dc.subject | Computer science | |
dc.subject | Theory methods | |
dc.title | TiDA: High-level programming abstractions for data locality management | |
dc.type | Conference proceeding | |
dspace.entity.type | Publication | |
local.contributor.authorid | 0000-0002-1609-5847 | |
local.contributor.authorid | 0000-0002-2351-0771 | |
local.contributor.authorid | N/A | |
local.contributor.kuauthor | Farooqi, Muhammad Nufail | |
local.contributor.kuauthor | Erten, Didem Unat | |
local.contributor.kuauthor | Bastem, Burak | |
relation.isOrgUnitOfPublication | 89352e43-bf09-4ef4-82f6-6f9d0174ebae | |
relation.isOrgUnitOfPublication.latestForDiscovery | 89352e43-bf09-4ef4-82f6-6f9d0174ebae |
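
Illustrative note on the abstract: the abstract describes tiling as a way to obtain data locality by decomposing a region into tiles and iterating over them. The sketch below shows that general idea in plain C++ on a matrix transpose. It is not the TiDA API: the `Tile` struct and the `make_tiles` and `transpose_tiled` functions are hypothetical names introduced only for this illustration, and the sketch does not cover parallelism or NUMA memory affinity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type for illustration only; NOT part of the TiDA library.
// A tile is a rectangular sub-block of a 2D region, stored as half-open ranges.
struct Tile {
    std::size_t i0, i1;  // rows    [i0, i1)
    std::size_t j0, j1;  // columns [j0, j1)
};

// Decompose an ny-by-nx region into tiles of at most ty-by-tx cells.
// Tiles on the upper edges are clipped to the region bounds.
std::vector<Tile> make_tiles(std::size_t ny, std::size_t nx,
                             std::size_t ty, std::size_t tx) {
    assert(ty > 0 && tx > 0);
    std::vector<Tile> tiles;
    for (std::size_t i = 0; i < ny; i += ty) {
        for (std::size_t j = 0; j < nx; j += tx) {
            tiles.push_back({i, std::min(i + ty, ny), j, std::min(j + tx, nx)});
        }
    }
    return tiles;
}

// Transpose a row-major ny-by-nx matrix one tile at a time.
// Working on a small block keeps both the rows being read and the
// columns being written within a limited memory footprint.
std::vector<double> transpose_tiled(const std::vector<double>& in,
                                    std::size_t ny, std::size_t nx,
                                    std::size_t ty, std::size_t tx) {
    assert(in.size() == ny * nx);
    std::vector<double> out(nx * ny);
    for (const Tile& t : make_tiles(ny, nx, ty, tx)) {
        for (std::size_t i = t.i0; i < t.i1; ++i) {
            for (std::size_t j = t.j0; j < t.j1; ++j) {
                out[j * ny + i] = in[i * nx + j];
            }
        }
    }
    return out;
}
```

For example, `transpose_tiled(in, 2, 3, 2, 2)` splits a 2-by-3 matrix into one 2-by-2 tile and one clipped 2-by-1 tile. Because the tiles are independent, the loop over tiles is also the natural unit to distribute across threads, which is the link between locality and parallelism that the abstract mentions.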