TY - GEN
T1 - Catapult
T2 - 2019 International Conference on Management of Data, SIGMOD 2019
AU - Huang, Kai
AU - Chua, Huey Eng
AU - Bhowmick, Sourav S.
AU - Choi, Koon Kau
AU - Zhou, Shuigeng
N1 - Funding Information:
K. Huang, S. S. Bhowmick and H. E. Chua are supported by the AcRF Tier-2 Grant MOE2015-T2-1-040. B. Choi is supported by HK-RGC GRF 12201315 and 12232716. S. Zhou is supported by National Natural Science Foundation of China (NSFC) under grant No. U1636205.
PY - 2019/6/25
Y1 - 2019/6/25
N2 - Visual graph query interfaces (a.k.a. GUIs) widen the reach of graph querying frameworks across different users by enabling non-programmers to use them. Consequently, several commercial and academic frameworks for querying a large collection of small- or medium-sized data graphs (e.g., chemical compounds) provide such visual interfaces. The majority of these interfaces expose a fixed set of canned patterns (i.e., small subgraph patterns) to expedite query formulation by enabling pattern-at-a-time rather than edge-at-a-time construction. Canned patterns to be displayed on a GUI are typically selected manually based on domain knowledge. However, manual generation of canned patterns is labour-intensive. Furthermore, these patterns may not sufficiently cover the underlying data graphs to expedite visual formulation of a wide range of subgraph queries. In this paper, we present a generic and extensible framework called Catapult to address these limitations. Catapult takes a data-driven approach to automatically select canned patterns, thereby taking a concrete step towards the vision of data-driven construction of visual query interfaces. Specifically, it first clusters the underlying data graphs based on their topological similarities and then summarizes each cluster to create a cluster summary graph (CSG). The canned patterns within a user-specified pattern budget are then generated from these CSGs by maximizing coverage and diversity and minimizing the cognitive load of the patterns. An experimental study with real-world datasets and visual graph interfaces demonstrates the superiority of Catapult over traditional techniques.
AB - Visual graph query interfaces (a.k.a. GUIs) widen the reach of graph querying frameworks across different users by enabling non-programmers to use them. Consequently, several commercial and academic frameworks for querying a large collection of small- or medium-sized data graphs (e.g., chemical compounds) provide such visual interfaces. The majority of these interfaces expose a fixed set of canned patterns (i.e., small subgraph patterns) to expedite query formulation by enabling pattern-at-a-time rather than edge-at-a-time construction. Canned patterns to be displayed on a GUI are typically selected manually based on domain knowledge. However, manual generation of canned patterns is labour-intensive. Furthermore, these patterns may not sufficiently cover the underlying data graphs to expedite visual formulation of a wide range of subgraph queries. In this paper, we present a generic and extensible framework called Catapult to address these limitations. Catapult takes a data-driven approach to automatically select canned patterns, thereby taking a concrete step towards the vision of data-driven construction of visual query interfaces. Specifically, it first clusters the underlying data graphs based on their topological similarities and then summarizes each cluster to create a cluster summary graph (CSG). The canned patterns within a user-specified pattern budget are then generated from these CSGs by maximizing coverage and diversity and minimizing the cognitive load of the patterns. An experimental study with real-world datasets and visual graph interfaces demonstrates the superiority of Catapult over traditional techniques.
UR - http://www.scopus.com/inward/record.url?scp=85069489214&partnerID=8YFLogxK
U2 - 10.1145/3299869.3300072
DO - 10.1145/3299869.3300072
M3 - Conference proceeding
AN - SCOPUS:85069489214
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 900
EP - 917
BT - SIGMOD 2019 - Proceedings of the 2019 International Conference on Management of Data
PB - Association for Computing Machinery (ACM)
Y2 - 30 June 2019 through 5 July 2019
ER -