An effective GPU implementation of breadth-first search

Lijuan Luo, Martin Wong, Wen-mei Hwu

Research output: Chapter in book/report/conference proceedingConference proceedingpeer-review

178 Citations (Scopus)


Breadth-first search (BFS) has wide applications in electronic design automation (EDA) as well as in other fields. Researchers have tried to accelerate BFS on the GPU, but the two published works are both asymptotically slower than the fastest CPU implementation. In this paper, we present a new GPU implementation of BFS that uses a hierarchical queue management technique and a three-layer kernel arrangement strategy. It guarantees the same computational complexity as the fastest sequential version and can achieve up to 10 times speedup.

Original languageEnglish
Title of host publication47th ACM/IEEE Design Automation Conference - Proceedings 2010
PublisherAssociation for Computing Machinery (ACM)
Number of pages4
ISBN (Electronic)9781450300025
ISBN (Print)9781424466771
Publication statusPublished - 15 Jun 2010
Event47th ACM/IEEE Design Automation Conference, DAC 2010 - Anaheim, United States
Duration: 13 Jun 201018 Jun 2010 (Conference website) (Conference programme) (Conference proceedings) (Conference proceedings)

Publication series

NameACM/IEEE Design Automation Conference - Proceedings
ISSN (Print)0738-100X


Conference47th ACM/IEEE Design Automation Conference, DAC 2010
Country/TerritoryUnited States
Internet address

Scopus Subject Areas

  • Computer Science Applications
  • Control and Systems Engineering
  • Electrical and Electronic Engineering
  • Modelling and Simulation

User-Defined Keywords

  • CUDA
  • GPU computing
  • BFS


Dive into the research topics of 'An effective GPU implementation of breadth-first search'. Together they form a unique fingerprint.

Cite this