TY - JOUR
T1 - Cpp-Taskflow
T2 - A General-Purpose Parallel Task Programming System at Scale
AU - Huang, Tsung-Wei
AU - Lin, Yibo
AU - Lin, Chun-Xun
AU - Guo, Guannan
AU - Wong, Martin D. F.
N1 - This work was supported in part by NSF under Grant CCF-1718883, and in part by DARPA under Grant FA-650-18-2-7843. A preliminary version of this paper was presented at the 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS’19), Rio de Janeiro, Brazil, May 2019 [1]. This article was recommended by Associate Editor S. Held. (Corresponding author: Tsung-Wei Huang.) Tsung-Wei Huang is with the Department of Electrical and Computer Engineering, University of Utah, Salt Lake City, UT 84112 USA.
PY - 2021/8
Y1 - 2021/8
N2 - This article introduces Cpp-Taskflow, a high-performance parallel task programming system, to streamline the building of large and complex parallel applications. Cpp-Taskflow leverages the power of modern C++ and task-based approaches to enable efficient implementations of parallel decomposition strategies. Our programming model can quickly handle not only traditional loop-level parallelism but also irregular patterns, such as graph algorithms and dynamic control flows. Compared with existing libraries, Cpp-Taskflow is more cost-efficient in performance scaling and software integration. We have evaluated Cpp-Taskflow on both micro-benchmarks and large-scale design automation problems of million-scale tasking. In a particular timing analysis workload, Cpp-Taskflow ran 2× faster than OpenMP while using half as many lines of code. We have also shown that Cpp-Taskflow achieved up to a 47.81% speed-up with 28.5% less code over the industrial-strength library, Intel Threading Building Blocks, on a detailed placement problem.
AB - This article introduces Cpp-Taskflow, a high-performance parallel task programming system, to streamline the building of large and complex parallel applications. Cpp-Taskflow leverages the power of modern C++ and task-based approaches to enable efficient implementations of parallel decomposition strategies. Our programming model can quickly handle not only traditional loop-level parallelism but also irregular patterns, such as graph algorithms and dynamic control flows. Compared with existing libraries, Cpp-Taskflow is more cost-efficient in performance scaling and software integration. We have evaluated Cpp-Taskflow on both micro-benchmarks and large-scale design automation problems of million-scale tasking. In a particular timing analysis workload, Cpp-Taskflow ran 2× faster than OpenMP while using half as many lines of code. We have also shown that Cpp-Taskflow achieved up to a 47.81% speed-up with 28.5% less code over the industrial-strength library, Intel Threading Building Blocks, on a detailed placement problem.
KW - Computer-aided design (CAD)
KW - parallel programming systems
KW - task parallelism
UR - https://www.scopus.com/record/display.uri?eid=2-s2.0-85091691543&origin=resultslist&sort=plf-f&src=s&sid=b47dc2ca5688950a5beec243e89c2e8f&sot=b&sdt=b&s=DOI%2810.1109%2Ftcad.2020.3025075%29&sl=40&sessionSearchId=b47dc2ca5688950a5beec243e89c2e8f
U2 - 10.1109/TCAD.2020.3025075
DO - 10.1109/TCAD.2020.3025075
M3 - Journal article
SN - 0278-0070
VL - 40
SP - 1687
EP - 1700
JO - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
JF - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
IS - 8
ER -