Skip to content

SocraticKG: Knowledge Graph Construction via QA-Driven Fact Extraction

Sanghyeok Choi, Woosang Jeon, Kyuseok Yang, Taehyeong Kim

2026-01-15T02:26:51Z

Abstract

Constructing Knowledge Graphs (KGs) from unstructured text provides a structured framework for knowledge representation and reasoning, yet current LLM-based approaches struggle with a fundamental trade-off: factual coverage often leads to relational fragmentation, while premature consolidation causes information loss. To address this, we propose SocraticKG, an automated KG construction method that introduces question-answer pairs as a structured intermediate representation to systematically unfold document-level semantics prior to triple extraction. By employing 5W1H-guided QA expansion, SocraticKG captures contextual dependencies and implicit relational links typically lost in direct KG extraction pipelines, providing explicit grounding in the source document that helps mitigate implicit reasoning errors. Evaluation on the MINE benchmark and HotpotQA downstream task demonstrates that our approach effectively addresses the coverage-connectivity trade-off, achieving superior factual retention and structural cohesion while supporting complex multi-hop reasoning.

Full analysis loading… Code implementations, benchmark data, and reproduction guides are being assembled. Please check back shortly.

Browse all papers

Need human evaluators for your AI research? Scale annotation with expert AI Trainers.