When Learned Index Meets Blockchain: Design, Algorithms, and Performance Evaluation

Project: Research project

Project Details

Description

Blockchain has been widely adopted for decentralized applications in recent years. It enables mutually untrusting peers to collectively maintain a verifiable distributed database. However, to preserve integrity and support data provenance, every blockchain node must store and maintain the entire ledger state, which incurs high storage costs and prolongs data search time.

This project aims to explore emerging learned index technologies to optimize blockchain system performance. Recent studies on learned indexes have shown their advantages over traditional indexes such as the B+-tree and the R-tree. At its core, a learned index replaces the directing keys in each index node with a learned model that predicts where a key is stored, which reduces storage overhead and improves search efficiency. However, existing learned indexes cannot be applied directly to blockchain systems because of the following challenges. First, they do not support data authentication and provenance, which are essential to blockchain systems. Second, blockchain uses long hash strings as indexing keys, whereas existing learned indexes assume numerical keys. Third, existing learned indexes target read-optimized, in-memory databases, whereas blockchain systems feature frequent state updates and disk-based storage.
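As a rough illustration of the general learned index idea described above (not the design proposed in this project), the following minimal sketch fits a single linear model that maps a numeric key to its position in a sorted array, then corrects the prediction with a bounded local search. The class name `LinearLearnedIndex`, the single-model layout, and the least-squares fit are assumptions chosen for brevity; the sketch also assumes a non-empty, in-memory list of numeric keys.

```python
from bisect import bisect_left


class LinearLearnedIndex:
    """Toy learned index over a sorted list of numeric keys (illustrative only)."""

    def __init__(self, keys):
        # Assumes `keys` is non-empty and contains plain numbers.
        self.keys = sorted(keys)
        n = len(self.keys)
        # Least-squares fit of: position ~ slope * key + intercept.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var_k = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (p - mean_p) for p, k in enumerate(self.keys))
        self.slope = cov / var_k if var_k else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # The worst-case prediction error bounds the search window.
        self.max_err = max(
            abs(p - self._predict(k)) for p, k in enumerate(self.keys)
        )

    def _predict(self, key):
        # The model replaces the directing keys of a tree node.
        return round(self.slope * key + self.intercept)

    def lookup(self, key):
        """Return the position of `key` in the sorted array, or None."""
        n = len(self.keys)
        guess = self._predict(key)
        # Clamp the window [lo, hi) to the array bounds.
        lo = min(max(0, guess - self.max_err), n)
        hi = max(lo, min(n, guess + self.max_err + 1))
        i = bisect_left(self.keys, key, lo, hi)
        return i if i < n and self.keys[i] == key else None


index = LinearLearnedIndex([3, 8, 15, 16, 23, 42, 100])
print(index.lookup(23), index.lookup(5))
```

The model stores only two coefficients and an error bound instead of per-node keys, which is where the storage saving comes from. Note that this sketch has none of the properties listed as challenges: it offers no authentication, assumes numeric keys, and is read-only and in-memory.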

To address these challenges, in this project we propose a novel column-based Merkle learned index for blockchain systems. More specifically, we plan to design (1) a two-level column-based Merkle index that supports efficient data authentication and provenance; (2) two specially designed structures and learned models, each tailored to one of the two index levels; and (3) a multivariate linear regression model to learn the distribution of hash string keys. Besides formulating the basic design, we also plan to investigate efficient algorithms for searching and maintaining the proposed learned index, and to develop several optimization techniques that strike a balance between storage and search performance. Finally, we will develop a proof-of-concept prototype system and evaluate its performance with real-world blockchain workloads to assess the practicality of the proposed solutions.
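To make the data authentication requirement concrete, here is a minimal sketch of a plain binary Merkle tree with membership proofs, the generic building block that a Merkle index extends. It is not the column-based, two-level structure proposed above, whose details are not given here. The function names (`build_levels`, `prove`, `verify`), the use of SHA-256, and the rule of duplicating the last node on odd-sized levels are all assumptions made for illustration.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash helper; SHA-256 is an assumption, not a stated project choice."""
    return hashlib.sha256(data).digest()


def _pad(level):
    # Simplifying assumption: duplicate the last node on odd-sized levels.
    return level + [level[-1]] if len(level) % 2 else level


def build_levels(leaves):
    """Return every tree level, from the leaf hashes up to the root."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        padded = _pad(level)
        level = [h(padded[i] + padded[i + 1]) for i in range(0, len(padded), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        padded = _pad(level)
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((padded[sibling], sibling < index))
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the root from a leaf and its proof, then compare."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


leaves = [b"state-a", b"state-b", b"state-c"]
levels = build_levels(leaves)
root = levels[-1][0]
print(verify(b"state-c", prove(levels, 2), root))
```

A client that holds only the root can check a returned record using a proof whose size is logarithmic in the number of leaves. A learned index used in a blockchain would need to preserve this kind of verifiability while replacing tree navigation with model predictions.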

With our rich research experience in blockchain data management and query processing, we expect the outcome of this project to accelerate the growth and adoption of blockchain technologies and decentralized services in the pertinent industries.
Status: Active
Effective start/end date: 1/01/23 → …

UN Sustainable Development Goals

In 2015, UN member states agreed to 17 global Sustainable Development Goals (SDGs) to end poverty, protect the planet and ensure prosperity for all. This project contributes towards the following SDG(s):

  • SDG 9 - Industry, Innovation, and Infrastructure
