## Project Details

### Description

Blockchains and smart contracts have enabled various platforms of cryptocurrencies, e.g., [CYC+19, WLC21], and have gained unprecedented attention from a wide audience. The opportunities of cryptocurrencies come with pressing challenges, and the most notable one is the money laundering problem [CZF04,CR17,HST+19,MK11,MBB13,WDC+19,YLW+19]. The police force may have some suspicions on various kind of direct or indirect transfers of large funds between two groups of users. This proposal proposes a query processing approach to address the efficient identification of such transfers in cryptocurrencies.

Transactions of cryptocurrencies can be readily modeled through transaction networks. Network data (a.k.a graph data) has appeared in a variety of applications, including blockchain networks [WLC21] and data from e-payment systems [CYC+19]. While there have been some data mining

analyses on understanding transaction networks, there are two significant gaps in finding suspicious flows from the networks. First, the flows of our interests are related to a targeted group of users but not the entire network. Second, the maximum flow problems are long-standing

fundamental graph problems whose solutions have high time complexities in terms of the network size and hence, cannot be directly adopted. An efficient maximum flow algorithm on large networks is not yet known.

In the first task of this project, we formalize and investigate flows in transaction networks. First, the flows studied in this proposal are different from classical ones. In particular, funds must be transferred from earlier transactions to later transactions. Moreover, users may use multiple

addresses to transfer and receive their funds. The second task is to define a new query, called 𝑘 ST densest flow (kSTFlow), and to analyze its hardness. Specifically, the input of kSTFlow is a transaction graph G, two subsets of nodes S and T of G, a size k, a normalization factor , a time

window , and the answer of kSTFlow is the subsets of S and T, S’ and T’, such that the flow from S’ to T’ is densest and the total size of S’ and T’ is larger than or equal to k, during . We plan to further define the variations of kSTFlow. Third, as kSTFlow is new, we plan to study a series of

novel techniques for its performance optimization. It includes query decomposition, graph compression, and approximation algorithms. Finally, we study the performance of our solutions for kSTFlow through an experimental study and a case study using real-world datasets. In the long

term, this research introduces kSTFlow as a new primitive of graph databases and releases the addresses found from kSTFlow.

Transactions of cryptocurrencies can be readily modeled through transaction networks. Network data (a.k.a graph data) has appeared in a variety of applications, including blockchain networks [WLC21] and data from e-payment systems [CYC+19]. While there have been some data mining

analyses on understanding transaction networks, there are two significant gaps in finding suspicious flows from the networks. First, the flows of our interests are related to a targeted group of users but not the entire network. Second, the maximum flow problems are long-standing

fundamental graph problems whose solutions have high time complexities in terms of the network size and hence, cannot be directly adopted. An efficient maximum flow algorithm on large networks is not yet known.

In the first task of this project, we formalize and investigate flows in transaction networks. First, the flows studied in this proposal are different from classical ones. In particular, funds must be transferred from earlier transactions to later transactions. Moreover, users may use multiple

addresses to transfer and receive their funds. The second task is to define a new query, called 𝑘 ST densest flow (kSTFlow), and to analyze its hardness. Specifically, the input of kSTFlow is a transaction graph G, two subsets of nodes S and T of G, a size k, a normalization factor , a time

window , and the answer of kSTFlow is the subsets of S and T, S’ and T’, such that the flow from S’ to T’ is densest and the total size of S’ and T’ is larger than or equal to k, during . We plan to further define the variations of kSTFlow. Third, as kSTFlow is new, we plan to study a series of

novel techniques for its performance optimization. It includes query decomposition, graph compression, and approximation algorithms. Finally, we study the performance of our solutions for kSTFlow through an experimental study and a case study using real-world datasets. In the long

term, this research introduces kSTFlow as a new primitive of graph databases and releases the addresses found from kSTFlow.

Status | Not started |
---|---|

Effective start/end date | 1/01/24 → 31/12/26 |

## Fingerprint

Explore the research topics touched on by this project. These labels are generated based on the underlying awards/grants. Together they form a unique fingerprint.