What is the best open-source implementation of "ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates"?

The best-maintained implementation is gen-verse/reasonflux, with 533 stars on GitHub. Confidence: high. Reproducibility: limited.

Are there pretrained models available for "ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates"?

Yes, one Hugging Face model was found. The top result is Gen-Verse/ReasonFlux-F1, with 275 downloads.

What framework is used to implement "ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates"?

The primary implementation uses PyTorch.

ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates

How reproducible is "ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates"?

Estimated time to first reproduction: a few days. Risk flags: license metadata missing, no CI workflows detected, dependency manifest missing. Start with gen-verse/reasonflux and validate the setup instructions in its README.

Published: Feb 1, 2025

Best maintained implementation now

Evidence: Direct

Domain fit: AI-core

Verified repos: 1

Top repo stars: 533

Core AI workload signals detected from paper context and implementation/artifact evidence.

Framework: PyTorch

Time to first repro: a few days

3 risk flags

Technical details

Canonical key: arxiv-2502.06772

Cache status: Fresh

Generated at: May 1, 2026, 5:54 PM

Artifact coverage: direct

implementation starting point

Benchmarks: thin evidence

Results & Benchmarks

Freshness tier: hot

Direct + Inferred Evidence

Natural language processing

MATH

pass@1

Source: paper fulltext

Benchmark evidence drill-down

1 finding

Audit each benchmark finding before selecting an implementation path. Evidence refs map to the disclosure section below.

Task	Dataset	Metric	Value	Source	Evidence refs
Natural language processing	MATH	pass@1	1	paper-derived	No explicit refs
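The table reports pass@1 without defining the metric or its scale. As a sketch, assuming the figure follows the widely used pass@k definition (the paper's exact evaluation protocol is not described on this page), pass@1 for a problem is the fraction of sampled answers that are correct, averaged over all problems. The function names below are illustrative and do not come from the paper or the repository.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate for one problem: n samples, c of them correct."""
    if n - c < k:
        return 1.0  # every size-k draw contains at least one correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_1(per_problem):
    """Average pass@1 over (n, c) pairs, one pair per problem."""
    return sum(pass_at_k(n, c, 1) for n, c in per_problem) / len(per_problem)
```

On this 0 to 1 scale, a value of 1 would mean every problem was solved on the first attempt, so it is worth checking the paper to confirm whether the reported value is a fraction, a percentage, or an extraction error.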

ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates is the primary contribution described in this paper.

Use This Implementation Because…

Confidence: high

gen-verse/reasonflux is the strongest maintained implementation based on ranking signals.

Reproduction Risks

License metadata missing
No CI workflows detected
Dependency manifest is missing
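These three flags can be re-checked against a local clone before committing time to a reproduction. The following is a minimal sketch, not the scanner this page uses: the license file names and the `.github/workflows` location are common conventions assumed for illustration, and `risk_flags` is a hypothetical helper.

```python
from pathlib import Path

# Assumed conventions; the page does not say which paths its scanner inspects.
LICENSE_NAMES = ("LICENSE", "LICENSE.md", "LICENSE.txt", "COPYING")
MANIFEST_NAMES = ("requirements.txt", "environment.yml", "pyproject.toml")


def risk_flags(repo_dir):
    """Return the risk flags that apply to a local checkout."""
    root = Path(repo_dir)
    flags = []
    if not any((root / name).is_file() for name in LICENSE_NAMES):
        flags.append("License metadata missing")
    workflows = root / ".github" / "workflows"
    has_ci = workflows.is_dir() and any(
        p.suffix in (".yml", ".yaml") for p in workflows.iterdir()
    )
    if not has_ci:
        flags.append("No CI workflows detected")
    if not any((root / name).is_file() for name in MANIFEST_NAMES):
        flags.append("Dependency manifest is missing")
    return flags
```

Calling `risk_flags("path/to/clone")` on a fresh checkout gives a quick way to see whether any of the flags have been resolved since this page was generated.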

Hardware Notes

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

Evidence disclosure

Evidence graph: 4 refs, 4 links.

Utility signals: depth 100/100, grounding 95/100, status high.

Implementation Comparison

Top 3 paths

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

gen-verse/reasonflux

best maintained

Maintenance: Stale risk

Confidence: High

Reproducibility: Limited

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 533
Last push: Sep 27, 2025 (216d ago)

Risk flags

No CI pipeline detected
No tagged releases
No Docker setup

yangling0818/buffer-of-thought-llm

alternative

Maintenance: Stale risk

Confidence: Low

Reproducibility: Moderate

Strong overlap with paper title keywords · Community adoption signal (676 stars)

Stars: 676
Last push: Jun 28, 2025 (307d ago)

Dependencies

Risk flags

No CI pipeline detected
No tagged releases
No Docker setup

yangling0818/supercorrect-llm

alternative

Maintenance: Stale

Confidence: Low

Reproducibility: Limited

Strong overlap with paper title keywords · Community adoption signal (89 stars)

Stars: 89
Last push: Mar 23, 2025 (404d ago)

Dependencies

Risk flags

No push in 12+ months
No CI pipeline detected
No tagged releases
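The "Nd ago" figures and the "No push in 12+ months" flag in the comparison above follow from plain date arithmetic against the May 1, 2026 check date reported on this page. A minimal sketch, assuming "12+ months" means 365 days or more (the page does not state the exact threshold):

```python
from datetime import date

CHECKED_ON = date(2026, 5, 1)  # check date reported on this page

# Last-push dates copied from the comparison above.
LAST_PUSH = {
    "gen-verse/reasonflux": date(2025, 9, 27),
    "yangling0818/buffer-of-thought-llm": date(2025, 6, 28),
    "yangling0818/supercorrect-llm": date(2025, 3, 23),
}

STALE_AFTER_DAYS = 365  # assumption: "12+ months" treated as 365+ days


def days_since_push(last_push, checked_on=CHECKED_ON):
    """Number of whole days between the last push and the check date."""
    return (checked_on - last_push).days


def is_stale(last_push, checked_on=CHECKED_ON):
    """True when the last push is at least STALE_AFTER_DAYS old."""
    return days_since_push(last_push, checked_on) >= STALE_AFTER_DAYS


for repo, pushed in LAST_PUSH.items():
    print(f"{repo}: {days_since_push(pushed)}d ago, stale={is_stale(pushed)}")
```

This reproduces the 216, 307, and 404 day figures, and only the 404-day repository crosses the assumed threshold.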

Best implementation now

gen-verse/reasonflux

Confidence: High

Reproducibility: Limited

[NeurIPS 2025 Spotlight] LLM post-training suite — featuring ReasonFlux, ReasonFlux-PRM, and ReasonFlux-Coder.

Stars: 533

Forks: 37

Last push: Sep 27, 2025

Official implementation from Papers with Code

Repository link is mentioned in the paper metadata

Partial overlap with paper title keywords

Community adoption signal (533 stars)

License: not detected

CI: not detected

Deps: not detected

Docker: not detected

Selected gen-verse/reasonflux as the strongest maintained implementation for new work.
Repository activity is within the last 24 months.

Reproduction readiness

Major Work

Time to first repro: a few days

Last checked: May 1, 2026

No dependency manifest — manual reconstruction required

· gen-verse/reasonflux has no requirements.txt, environment.yml, pyproject.toml, or Dockerfile.
· You will need to reverse-engineer dependencies from import statements in the source code.
· Last push was 216 days ago.
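One way to approach the reverse-engineering step is to parse every Python file in a local clone and collect the modules it imports. The sketch below uses only the standard library; `top_level_imports` and `third_party_candidates` are hypothetical helper names, and the filtering of local modules is a heuristic, not part of the repository.

```python
import ast
import sys
from pathlib import Path


def top_level_imports(source):
    """Return the set of top-level module names imported by Python source text."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports such as "from . import utils"
            names.add(node.module.split(".")[0])
    return names


def third_party_candidates(repo_dir):
    """Collect imports across a checkout, dropping stdlib and local modules."""
    files = list(Path(repo_dir).rglob("*.py"))
    # Heuristic: file stems and directory names are treated as local modules.
    local = {p.stem for p in files} | {p.parent.name for p in files}
    found = set()
    for path in files:
        try:
            found |= top_level_imports(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that do not parse
    # sys.stdlib_module_names exists on Python 3.10+; older versions keep stdlib names.
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    return sorted(found - stdlib - local)
```

The result is a list of import names, not installable package names: some differ from their PyPI distribution (for example `yaml` is provided by PyYAML and `sklearn` by scikit-learn), and versions still have to be pinned by hand.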