FunReason: Enhancing Large Language Models' Function Calling via Self-Refinement Multiscale Loss and Automated Data Refinement

Q: What is the best open-source implementation of "FunReason: Enhancing Large Language Models' Function Calling via Self-Refinement Multiscale Loss and Automated Data Refinement"?

The best maintained implementation is bingguanghao/funreason with 55 stars on GitHub. Confidence: high. Reproducibility: Limited.

Q: How reproducible is "FunReason: Enhancing Large Language Models' Function Calling via Self-Refinement Multiscale Loss and Automated Data Refinement"?

Estimated time to first reproduction: a few days. Risk flags: license metadata missing, no CI workflows detected, dependency manifest missing. Start with bingguanghao/funreason and validate the setup instructions in its README.

Q: Are there pretrained models available for "FunReason: Enhancing Large Language Models' Function Calling via Self-Refinement Multiscale Loss and Automated Data Refinement"?

Yes, one Hugging Face model was found. The top result is Bingguang/FunReason with 8 downloads.

Q: What framework is used to implement "FunReason: Enhancing Large Language Models' Function Calling via Self-Refinement Multiscale Loss and Automated Data Refinement"?

No framework was detected for the primary implementation.

Published: May 1, 2025

Best maintained implementation now

Evidence: Direct

Domain fit: AI-core

Verified repos: 2

Top repo stars: 55

Core AI workload signals detected from paper context and implementation/artifact evidence.

Framework: none

Time to first repro: a few days

3 risk flags


Technical details

Canonical key: arxiv-2505.20192

Cache status: Fresh

Generated at: Apr 30, 2026, 9:18 PM

Artifact coverage: direct

HF provider: ok (token)

PWC source used: Yes

LLM status: not_generated

LLM model: n/a

LLM generated: Unknown

LLM content type: n/a

HF policy: hf-relevance-v27

implementation starting point

Benchmarks: thin evidence


Results & Benchmarks

Freshness tier: hot

Direct + Inferred Evidence

Some benchmark signal exists in the extracted evidence, but it is not structured strongly enough yet for a confident benchmark decision.

FunReason: Enhancing Large Language Models' Function Calling via Self-Refinement Multiscale Loss and Automated Data Refinement focuses on agentic tool use.

Use This Implementation Because…

Confidence: high

bingguanghao/funreason is the strongest maintained implementation based on ranking signals.


Reproduction Risks

License metadata missing
No CI workflows detected
Dependency manifest is missing

Hardware Notes

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

Evidence disclosure

Evidence graph: 4 refs, 4 links.

Utility signals: depth 100/100, grounding 95/100, status high.

Implementation Comparison

Top 3 paths

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

bingguanghao/funreason

best maintained

Maintenance: Recently updated

Confidence: High

Reproducibility: Limited

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 55
Last push: Nov 24, 2025 (158d ago)

Risk flags

No CI pipeline detected
No tagged releases
No Docker setup

inclusionai/aworld

alternative

Maintenance: Active

Confidence: Low

Reproducibility: Moderate

Community adoption signal (1,187 stars)

Stars: 1,187
Last push: Apr 30, 2026 (1d ago)

CI · Releases

Risk flags

No Docker setup
Dependency manifest missing
Low confidence match

inclusionAI/AWorld-RL

alternative

Maintenance: Active

Confidence: Low

Reproducibility: Limited

Matched via arXiv identifier search · Community adoption signal (103 stars)

Stars: 103
Last push: Apr 16, 2026 (15d ago)

Risk flags

No CI pipeline detected
No tagged releases
No Docker setup
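
The relative ages shown for the three repositories above (158d, 1d, 15d) can be checked with simple date arithmetic. The sketch below assumes the page computed them against May 1, 2026; that reference date is an inference from the numbers, not something the page states, and the helper name `days_since` is purely illustrative.

```python
from datetime import date

# Assumption: ages are measured against May 1, 2026. The page lists
# "Generated at: Apr 30, 2026, 9:18 PM", so the one-day shift may come
# from a timezone difference; the page itself does not say.
REFERENCE_DATE = date(2026, 5, 1)

# Last-push dates exactly as listed in the comparison cards.
LAST_PUSH = {
    "bingguanghao/funreason": date(2025, 11, 24),
    "inclusionai/aworld": date(2026, 4, 30),
    "inclusionAI/AWorld-RL": date(2026, 4, 16),
}


def days_since(last_push: date, today: date = REFERENCE_DATE) -> int:
    """Return the number of whole days between last_push and today."""
    return (today - last_push).days


# Map each repository to its age in days under the assumed reference date.
AGES = {repo: days_since(pushed) for repo, pushed in LAST_PUSH.items()}
```

Under that assumption `AGES` reproduces the listed values of 158, 1, and 15 days, which suggests the three cards share a single reference date.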

Best implementation now

bingguanghao/funreason

Confidence: High

Reproducibility: Limited

This is the official repository of the paper "BalanceSFT: Improving LLM Function Calling with Balanced Training Signals and Data Hardness"

Stars: 55

Forks: 0

Last push: Nov 24, 2025

Official implementation from Papers with Code

Repository link is mentioned in the paper metadata

Strong overlap with paper title keywords

Community adoption signal (55 stars)

License: not detected

CI: not detected

Deps: not detected

Docker: not detected

Selected bingguanghao/funreason as the strongest maintained implementation for new work.
Repository activity is within the last 24 months.

Reproduction readiness

Major Work

Time to first repro: a few days

Last checked: Apr 30, 2026

Hardware requirements

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

No dependency manifest — manual reconstruction required

· bingguanghao/funreason has no requirements.txt, environment.yml, pyproject.toml, or Dockerfile.
· You will need to reverse-engineer dependencies from import statements in the source code.


Additional implementations

Official

No additional official repositories detected.

Community

BingguangHao/BalanceSFT
Confidence: Medium

This is the official repository of the paper "BalanceSFT: Improving LLM Function Calling with Balanced Training Signals and Data Hardness"

Stars: 55

Last push: Nov 24, 2025