MyBenchmark: generating databases for query workloads

Eric Lo*, Nick Cheng, Wilfred W.K. Lin, Wing Kai Hon, Koon Kau CHOI

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

19 Citations (Scopus)


To evaluate the performance of database applications and database management systems (DBMSs), we usually execute workloads of queries on generated databases of different sizes and then benchmark various measures such as response time and throughput. This paper introduces MyBenchmark, a parallel data generation tool that takes a set of queries as input and generates database instances. Users of MyBenchmark can control the characteristics of the generated data as well as the characteristics of the resulting workload. Applications of MyBenchmark include DBMS testing, database application testing, and application-driven benchmarking. In this paper, we present the architecture and the implementation algorithms of MyBenchmark. Experimental results show that MyBenchmark is able to generate workload-aware databases for a variety of workloads, including query workloads extracted from the TPC-C, TPC-E, TPC-H, and TPC-W benchmarks.

Original language: English
Pages (from-to): 895-913
Number of pages: 19
Journal: VLDB Journal
Issue number: 6
Publication status: Published - 15 Nov 2014

Scopus Subject Areas

  • Information Systems
  • Hardware and Architecture

User-Defined Keywords

  • Benchmarking
  • Data Generation
  • Performance
  • Query Processing

