Official implementation from Papers with Code · Repository link is mentioned in the paper metadata
- Stars
- 393
- Last push
- Mar 4, 2026 (45d ago)
Risk flags
- No CI pipeline detected
- No tagged releases
- No Docker setup
Zhishang Xiang, Chuanjie Wu, Qinggang Zhang, Shengyuan Chen, Zijin Hong, Xiao Huang, Jinsong Su
Graph retrieval-augmented generation (GraphRAG) has emerged as a powerful paradigm for enhancing large language models (LLMs) with external knowledge. It leverages graphs to model the hierarchical structure between specific concepts, enabling more coherent and effective knowledge retrieval for accurate reasoning. Despite its conceptual promise, recent studies report that GraphRAG frequently underperforms vanilla RAG on many real-world tasks. This raises a critical question: Is GraphRAG really effective, and in which scenarios do graph structures provide measurable benefits for RAG systems? To address this, we propose GraphRAG-Bench, a comprehensive benchmark designed to evaluate GraphRAG models on both hierarchical knowledge retrieval and deep contextual reasoning. GraphRAG-Bench features a comprehensive dataset with tasks of increasing difficulty, covering fact retrieval, complex reasoning, contextual summarization, and creative generation, and a systematic evaluation across the entire pipeline, from graph construction and knowledge retrieval to final generation. Leveraging this novel benchmark, we systematically investigate the conditions when GraphRAG surpasses traditional RAG and the underlying reasons for its success, offering guidelines for its practical application. All related resources and analyses are collected for the community at https://github.com/GraphRAG-Bench/GraphRAG-Benchmark.
Audit each benchmark finding before selecting an implementation path. Evidence refs map to the disclosure section below.
| Task | Dataset | Metric | Value | Source | Evidence refs |
|---|---|---|---|---|---|
| Generation | Medical | Accuracy | 65.65 | paper-derived | No explicit refs |
graphrag-bench/graphrag-benchmark is the strongest maintained implementation based on ranking signals. License is declared (MIT). Dependency/environment manifests are present.
Evidence graph: 4 refs, 4 links.
Utility signals: depth 90/100, grounding 95/100, status high.
Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.
The official repo of GraphRAG-Bench for evaluating GraphRAG models. "When to use Graphs in RAG: A Comprehensive Analysis for Graph Retrieval-Augmented Generation". (ICLR'26)
Dependencies pinned, manual setup needed
Quick start
git clone https://github.com/graphrag-bench/graphrag-benchmark.git
cd graphrag-benchmark
pip install -r requirements.txt
No additional official repositories detected.
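The listing describes the repository's dependencies as pinned. A minimal sketch for checking that claim after cloning is shown here; it assumes a pip-style `requirements.txt` in the current directory, and the helper name `unpinned_requirements` is hypothetical, not part of the repository. It only recognizes exact `==` pins and treats anything else (ranges, bare names, URLs) as unpinned.

```python
import re
from pathlib import Path

# Matches an exact pin such as "numpy==1.26.4" or "pkg[extra]==1.0".
PINNED = re.compile(r"^[A-Za-z0-9_.\-\[\],]+\s*==\s*[^\s;]+")


def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that are not pinned with '=='.

    Hypothetical helper for illustration; not part of the repository.
    """
    loose = []
    for raw in text.splitlines():
        # Drop comments: '#' at line start or preceded by whitespace.
        line = re.sub(r"(^|\s)#.*$", "", raw).strip()
        # Skip blank lines and pip options such as "-r other.txt" or "-e .".
        if not line or line.startswith("-"):
            continue
        if not PINNED.match(line):
            loose.append(line)
    return loose


if __name__ == "__main__":
    path = Path("requirements.txt")  # assumed location: repository root
    if path.exists():
        for line in unpinned_requirements(path.read_text(encoding="utf-8")):
            print("not pinned:", line)
```

Run from the cloned repository root, the script prints nothing if every entry uses an exact version pin.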
Tasks
Retrieval / indexing
Methods
Transformer, Retrieval-augmented generation
Domains
Natural Language Processing, Information Retrieval