Results & Benchmarks
Benchmark data is not yet available for this paper.
Hardware Requirements
- Expect multi-day setup and compute time for a meaningful reproduction, based on current guidance.
Best Implementation
Challenge LLMs to Reason About Reasoning: A Benchmark to Unveil Cognitive Depth in LLMs
Stars: 52 | Forks: 2 | Last push: Jul 2024 | License: MIT
License: ✓ | CI: – | Deps: – | Docker: –
- Selected dvlab-research/diaggsm8k as the strongest maintained implementation for new work.
- Repository activity is within the last 24 months.
- Official repository is preserved separately as historical context.
Reproduction Path
1. Start with dvlab-research/diaggsm8k and validate setup instructions in README.
2. Reproduce the baseline result with the provided defaults before modifying hyperparameters.
3. Log exact dependency versions and runtime environment for reproducibility.
- Time to first repro: a few days
- No CI workflows detected
- Dependency manifest is missing
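Because no dependency manifest was detected, the logging step in the reproduction path may need to be done by hand. The following is a minimal sketch, not part of the repository: it uses only the Python standard library to record the interpreter, OS, and installed package versions. The function names and the `environment_snapshot.json` filename are hypothetical choices made for illustration.

```python
import json
import platform
import sys
from importlib import metadata


def capture_environment() -> dict:
    """Collect interpreter, OS, and installed-package versions."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata lacks a name
            packages[name] = dist.version
    return {
        "python": sys.version.split()[0],
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "machine": platform.machine(),
        # Sort case-insensitively so successive snapshots diff cleanly.
        "packages": dict(sorted(packages.items(), key=lambda kv: kv[0].lower())),
    }


def write_snapshot(path: str = "environment_snapshot.json") -> dict:
    """Write the snapshot to a JSON file (hypothetical default name)."""
    snapshot = capture_environment()
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(snapshot, fh, indent=2)
    return snapshot
```

Calling `write_snapshot()` once before the baseline run and again after any dependency change leaves two JSON files that can be compared directly. GPU driver and accelerator library versions are not covered by this sketch and would need to be recorded separately.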
Additional Implementations
Official
No additional official repositories detected.
Community
- JIA-Lab-research/MR-GSM8K (Confidence: low)
  Challenge LLMs to Reason About Reasoning: A Benchmark to Unveil Cognitive Depth in LLMs
  Stars: 52 | Forks: 2 | Last push: Jul 2024 | License: MIT
Hugging Face Artifacts
No direct paper-linked artifacts were found. Showing strongest curated related artifacts.
Curated Related