LEGO-GraphRAG: Modularizing Graph-based Retrieval-Augmented Generation for Design Space Exploration

  • Yukun Cao (Co-first author)
  • Zengyi Gao (Co-first author)
  • Zhiyang Li
  • Xike Xie*
  • S. Kevin Zhou
  • Jianliang Xu

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

1 Citation (Scopus)

Abstract

GraphRAG integrates (knowledge) graphs with large language models (LLMs) to improve reasoning accuracy and contextual relevance. Despite its promising applications and strong relevance to multiple research communities, such as databases and natural language processing, GraphRAG currently lacks modular workflow analysis, systematic solution frameworks, and insightful empirical studies. To bridge these gaps, we propose LEGO-GraphRAG, a modular framework that enables: 1) fine-grained decomposition of the GraphRAG workflow, 2) systematic classification of existing techniques and implemented GraphRAG instances, and 3) creation of new GraphRAG instances. Our framework facilitates comprehensive empirical studies of GraphRAG on large-scale real-world graphs and diverse query sets, revealing insights into balancing reasoning quality, runtime efficiency, and token or GPU cost that are essential for building advanced GraphRAG systems.

Original language: English
Pages (from-to): 3269-3283
Number of pages: 15
Journal: Proceedings of the VLDB Endowment
Volume: 18
Issue number: 10
Publication status: Published - Jun 2025
Event: 51st International Conference on Very Large Data Bases, VLDB 2025 - London, United Kingdom
Duration: 1 Sept 2025 → 5 Sept 2025
https://vldb.org/2025/
