TY - GEN
T1 - Speeding up k-means algorithm by GPUs
AU - Li, You
AU - Zhao, Kaiyong
AU - CHU, Xiaowen
AU - LIU, Jiming
N1 - Copyright:
Copyright 2010 Elsevier B.V., All rights reserved.
PY - 2010
Y1 - 2010
N2 - Cluster analysis plays a critical role in a wide variety of applications, but it is now facing the computational challenge due to the continuously increasing data volume. Parallel computing is one of the most promising solutions to overcoming the computational challenge. In this paper, we target at parallelizing k-Means, which is one of the most popular clustering algorithms, by using the widely available Graphics Processing Units (GPUs). Different from existing GPU-based k-Means algorithms, we observe that data dimension is an important factor that should be taken into consideration when parallelizing k-Means on GPUs. In particular, we use two different strategies for low-dimensional data sets and high-dimensional data sets respectively, in order to make the best use of the power of GPUs. For low-dimensional data sets, we exploit GPU on-chip registers to significantly decrease data access latency. For highdimensional data sets, we design a novel algorithm which simulates matrix multiplication and exploits GPU on-chip registers and also on-chip shared memory to achieve high compute-to-memory-access ratio. As a result, our GPU-based k-Means algorithm is three to eight times faster than the best reported GPU-based algorithm.
AB - Cluster analysis plays a critical role in a wide variety of applications, but it is now facing the computational challenge due to the continuously increasing data volume. Parallel computing is one of the most promising solutions to overcoming the computational challenge. In this paper, we target at parallelizing k-Means, which is one of the most popular clustering algorithms, by using the widely available Graphics Processing Units (GPUs). Different from existing GPU-based k-Means algorithms, we observe that data dimension is an important factor that should be taken into consideration when parallelizing k-Means on GPUs. In particular, we use two different strategies for low-dimensional data sets and high-dimensional data sets respectively, in order to make the best use of the power of GPUs. For low-dimensional data sets, we exploit GPU on-chip registers to significantly decrease data access latency. For highdimensional data sets, we design a novel algorithm which simulates matrix multiplication and exploits GPU on-chip registers and also on-chip shared memory to achieve high compute-to-memory-access ratio. As a result, our GPU-based k-Means algorithm is three to eight times faster than the best reported GPU-based algorithm.
KW - Cluster
KW - CUDA
KW - GPGPU
KW - K-means
UR - http://www.scopus.com/inward/record.url?scp=78249233591&partnerID=8YFLogxK
U2 - 10.1109/CIT.2010.60
DO - 10.1109/CIT.2010.60
M3 - Conference proceeding
AN - SCOPUS:78249233591
SN - 9780769541082
T3 - Proceedings - 10th IEEE International Conference on Computer and Information Technology, CIT-2010, 7th IEEE International Conference on Embedded Software and Systems, ICESS-2010, ScalCom-2010
SP - 115
EP - 122
BT - Proceedings - 10th IEEE International Conference on Computer and Information Technology, CIT-2010, 7th IEEE International Conference on Embedded Software and Systems, ICESS-2010, ScalCom-2010
T2 - 10th IEEE International Conference on Computer and Information Technology, CIT-2010, 7th IEEE International Conference on Embedded Software and Systems, ICESS-2010, 10th IEEE Int. Conf. Scalable Computing and Communications, ScalCom-2010
Y2 - 29 June 2010 through 1 July 2010
ER -