Heterogeneous information networks (HINs), which are typed graphs with labeled nodes and edges, have attracted tremendous interest from academia and industry. Given two HIN nodes $s$ and $t$, and a natural number $k$, we study the discovery of the $k$ most important meta paths in real time, which can be used to support friend search, product recommendation, anomaly detection, and graph clustering. In this work, we argue that the shortest path between $s$ and $t$ is not necessarily the most important path. We therefore combine several ranking functions, based on frequency and rarity, into a unified importance function for the meta paths between $s$ and $t$. Although this importance function captures more information, finding the top-$k$ meta paths with it is very time-consuming. We therefore integrate the importance function into a multi-step framework that can efficiently filter out some impossible meta paths between $s$ and $t$. In addition, we combine a bidirectional search algorithm with this framework to further improve efficiency. Experiments on different datasets show that our proposed method outperforms state-of-the-art algorithms in terms of effectiveness, with reasonable response time.
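To make the notion of a meta path between $s$ and $t$ concrete, the following is a minimal sketch in Python. It is not the method proposed in the paper: the toy graph, the names `NODE_TYPE`, `EDGES`, and `top_k_meta_paths`, the `max_len` bound, and the exhaustive depth-first enumeration are all assumptions made for illustration, and plain instance frequency stands in for the unified importance function, which the abstract does not specify.

```python
from collections import Counter

# Toy HIN (hypothetical data): each node has a type, each edge has a label.
NODE_TYPE = {
    "alice": "Author", "bob": "Author",
    "p1": "Paper", "p2": "Paper", "p3": "Paper",
    "kdd": "Venue",
}
EDGES = [
    ("alice", "writes", "p1"), ("bob", "writes", "p1"),
    ("alice", "writes", "p2"), ("bob", "writes", "p2"),
    ("alice", "writes", "p3"),
    ("p3", "published_in", "kdd"), ("p1", "published_in", "kdd"),
]


def build_adjacency(edges):
    """Treat every labeled edge as traversable in both directions."""
    adj = {}
    for u, label, v in edges:
        adj.setdefault(u, []).append((label, v))
        adj.setdefault(v, []).append((label, u))
    return adj


def top_k_meta_paths(s, t, k, max_len=4):
    """Rank the meta paths between s and t by number of path instances.

    A meta path is recorded as the alternating sequence of node types and
    edge labels along a simple path. Frequency is only a stand-in score.
    """
    adj = build_adjacency(EDGES)
    counts = Counter()

    def dfs(node, visited, meta):
        if node == t:
            counts[tuple(meta)] += 1  # one more instance of this meta path
            return
        if (len(meta) - 1) // 2 >= max_len:  # number of edges used so far
            return
        for label, nxt in adj.get(node, []):
            if nxt not in visited:
                dfs(nxt, visited | {nxt}, meta + [label, NODE_TYPE[nxt]])

    dfs(s, {s}, [NODE_TYPE[s]])
    return counts.most_common(k)


if __name__ == "__main__":
    for meta, freq in top_k_meta_paths("alice", "bob", k=2):
        print(freq, " -> ".join(meta))
```

In this toy graph the shortest meta path (Author, writes, Paper, writes, Author) also happens to be the most frequent one. A score that additionally rewards rarity could rank the longer path through the venue higher, which is the kind of case the abstract's argument about shortest paths refers to.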
|Number of pages|14|
|Journal|IEEE Transactions on Knowledge and Data Engineering|
|Early online date|10 Nov 2020|
|Publication status|Published - 1 Sept 2022|
- Heterogeneous information networks
- meta path