Publication:
Data decomposition for parallel K-means clustering

Placeholder

Departments

School / College / Institute

Program

KU-Authors

KU Authors

Co-Authors

Publication Date

Language

Embargo Status

Journal Title

Journal ISSN

Volume Title

Alternative Title

Abstract

Developing fast algorithms for clustering has been an important area of research in data mining and other fields. K-means is one of the widely used clustering algorithms. In this work, we have developed and evaluated parallelization of k-means method for low-dimensional data on message passing computers. Three different data decomposition schemes and their impact on the pruning of distance calculations in tree-based k-means algorithm have been studied. Random pattern decomposition has good load balancing but fails to prune distance calculations effectively. Compact spatial decomposition of patterns based on space filling curves outperforms random pattern decomposition even though it has load imbalance problem. In both cases, parallel tree-based k-means clustering runs significantly faster than the traditional parallel k-means.

Source

Publisher

Springer-Verlag Berlin

Subject

Computer science, Artificial intelligence, Theory methods, Mathematics, Applied mathematics

Citation

Has Part

Source

Parallel Processing and Applied Mathematics

Book Series Title

Edition

DOI

item.page.datauri

Link

Rights

Copyrights Note

Endorsement

Review

Supplemented By

Referenced By

0

Views

0

Downloads