Recent years have seen rapid growth in data-driven distributed systems such as Hadoop MapReduce, Spark, and Dryad. However, their counterparts for high-performance or compute-intensive applications, including large-scale optimization, modeling, and simulation, are still nascent. In this paper, we introduce DtCraft, a modern C++-based distributed execution engine that streamlines the development of high-performance parallel applications. Users need no understanding of distributed computing and can focus on high-level development, leaving difficult details such as concurrency control, workload distribution, and fault tolerance to be handled transparently by our system. We have evaluated DtCraft on both micro-benchmarks and large-scale optimization problems, and have shown promising performance from single multicore machines to clusters of computers. On a particular semiconductor design problem, we achieved a 30× speedup with 40 nodes and required 15× less development effort than a hand-crafted implementation.
|IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
|Published - Jun 2019