TrustGeoGen: Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving
Daocheng Fu, Jianlong Chen, Renqiu Xia, Zijun Chen, Qi Liu, +10 more
Abstract
Geometric problem solving (GPS) requires precise multimodal understanding and rigorous, step-by-step logical reasoning. However, developing capable Multimodal Large Language Models (MLLMs) for GPS is heavily bottlenecked by the scarcity of high-quality, verifiable data. Existing data acquisition paradigms either suffer from modality incompleteness and unverified logical gaps ("leaps-of-faith"), or rely on formal engines that generate rigid, structurally homogeneous data, failing to produce high-difficulty problems or foster genuine natural-language reasoning. To overcome these limitations, we introduce TrustGeoGen, an autonomous and formalized geometric data generation engine. TrustGeoGen strictly guarantees reasoning trustworthiness through formal verification while generating multimodally integrated data, including premises, visual diagrams, and solutions. To systematically scale problem difficulty, we incorporates difficulty-aware filtering and iterative bootstrapping mechanism. Furthermore, we propose "connection thinking" to bridge the semantic gap between rigid formal logic and fluent human-like reasoning, ensuring coherent logical transitions. We also introduce the GeoExplore family of sampling algorithms to extract diverse problem-solving trajectories based on various thinking templates. Extensive experiments demonstrate that training models on our synthesized dataset, GeoTrust, substantially enhances deep geometric reasoning capabilities and yields significant performance gains across out-of-distribution (OOD) benchmarks, including GeoQA, Geometry3K, and OlympiadBench.Our code and data can be found at https://github.com/InternScience/TrustGeoGen
Full analysis loading… Code implementations, benchmark data, and reproduction guides are being assembled. Please check back shortly.
Need human evaluators for your AI research? Scale annotation with expert AI Trainers.