Publication: Data decomposition for parallel K-means clustering
Embargo Status
N/A
Abstract
Developing fast clustering algorithms has been an important area of research in data mining and other fields. K-means is one of the most widely used clustering algorithms. In this work, we develop and evaluate a parallelization of the k-means method for low-dimensional data on message-passing computers. We study three data decomposition schemes and their impact on the pruning of distance calculations in the tree-based k-means algorithm. Random pattern decomposition achieves good load balance but fails to prune distance calculations effectively. Compact spatial decomposition of patterns based on space-filling curves outperforms random pattern decomposition even though it suffers from load imbalance. In both cases, parallel tree-based k-means clustering runs significantly faster than the traditional parallel k-means.
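To make the contrast between random and compact spatial decomposition concrete, the following is a minimal sketch, not the authors' implementation. It assumes 2-D patterns and uses a Morton (Z-order) curve as the space-filling curve; the abstract does not say which curve the paper uses. All function names (`morton_key`, `split_blocks`, `random_decomposition`, `spatial_decomposition`, `bbox_area`) are hypothetical. Each block stands for the patterns assigned to one processor.

```python
import random


def morton_key(ix, iy, bits=16):
    """Interleave the bits of two non-negative integers (Z-order key)."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (2 * b)
        key |= ((iy >> b) & 1) << (2 * b + 1)
    return key


def split_blocks(items, nprocs):
    """Cut a list into nprocs contiguous blocks of near-equal length."""
    n = len(items)
    bounds = [i * n // nprocs for i in range(nprocs + 1)]
    return [items[bounds[i]:bounds[i + 1]] for i in range(nprocs)]


def random_decomposition(points, nprocs, seed=0):
    """Shuffle the patterns, then deal them out in equal-sized blocks."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    return split_blocks(shuffled, nprocs)


def spatial_decomposition(points, nprocs, bits=16):
    """Sort patterns along a Morton curve, then cut the curve into blocks.

    Assumption: Morton order stands in for whichever space-filling
    curve the paper actually uses.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    xmin, ymin = min(xs), min(ys)
    xspan = (max(xs) - xmin) or 1.0
    yspan = (max(ys) - ymin) or 1.0
    scale = (1 << bits) - 1

    def key(p):
        # Quantize each coordinate to an integer grid cell before interleaving.
        ix = int((p[0] - xmin) / xspan * scale)
        iy = int((p[1] - ymin) / yspan * scale)
        return morton_key(ix, iy, bits)

    return split_blocks(sorted(points, key=key), nprocs)


def bbox_area(block):
    """Area of the axis-aligned bounding box of a block of 2-D patterns."""
    xs = [p[0] for p in block]
    ys = [p[1] for p in block]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


if __name__ == "__main__":
    rng = random.Random(1)
    pts = [(rng.random(), rng.random()) for _ in range(4000)]
    for name, decomp in (("random", random_decomposition),
                         ("spatial", spatial_decomposition)):
        blocks = decomp(pts, 4)
        print(name, [round(bbox_area(b), 3) for b in blocks])
```

Blocks cut from the curve tend to cover small, compact regions, while random blocks each span nearly the whole data space. A tree-based k-means can discard distant centroids more readily when a processor's patterns are spatially compact. The sketch balances pattern counts only, which need not balance the work that remains after pruning.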
Publisher
Springer-Verlag Berlin
Subject
Computer science, Artificial intelligence, Theory methods, Mathematics, Applied mathematics
Source
Parallel Processing and Applied Mathematics
Rights
N/A
