TY - JOUR
T1 - LEGO-GraphRAG: Modularizing Graph-based Retrieval-Augmented Generation for Design Space Exploration
T2 - 51st International Conference on Very Large Data Bases, VLDB 2025
AU - Cao, Yukun
AU - Gao, Zengyi
AU - Li, Zhiyang
AU - Xie, Xike
AU - Zhou, S. Kevin
AU - Xu, Jianliang
N1 - This work was supported in part by the National Natural Science Foundation of China under Grants 62472400, 62072428, and 62271465, in part by the Suzhou Basic Research Program under Grant SYG202338, and in part by the HK RGC grants 12202024, R1015-23, and C1043-24GF. Yukun Cao and Zengyi Gao contributed equally to this work. Xike Xie is the corresponding author.
Publisher Copyright:
© 2025, VLDB Endowment. All rights reserved.
PY - 2025/6
Y1 - 2025/6
N2 - GraphRAG integrates (knowledge) graphs with large language models (LLMs) to improve reasoning accuracy and contextual relevance. Despite its promising applications and strong relevance to multiple research communities, such as databases and natural language processing, GraphRAG currently lacks modular workflow analysis, systematic solution frameworks, and insightful empirical studies. To bridge these gaps, we propose LEGO-GraphRAG, a modular framework that enables: 1) fine-grained decomposition of the GraphRAG workflow, 2) systematic classification of existing techniques and implemented GraphRAG instances, and 3) creation of new GraphRAG instances. Our framework facilitates comprehensive empirical studies of GraphRAG on large-scale real-world graphs and diverse query sets, revealing insights into balancing reasoning quality, runtime efficiency, and token or GPU cost that are essential for building advanced GraphRAG systems.
UR - https://www.vldb.org/pvldb/volumes/18/paper/LEGO-GraphRAG%3A%20Modularizing%20Graph-based%20Retrieval-Augmented%20Generation%20for%20Design%20Space%20Exploration
UR - https://www.scopus.com/pages/publications/105021403178
U2 - 10.14778/3748191.3748194
DO - 10.14778/3748191.3748194
M3 - Conference article
AN - SCOPUS:105021403178
SN - 2150-8097
VL - 18
SP - 3269
EP - 3283
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
IS - 10
Y2 - 1 September 2025 through 5 September 2025
ER -