Q: What is the best open-source implementation of "Seal-Tools: Self-Instruct Tool Learning Dataset for Agent Tuning and Detailed Benchmark"?

The best maintained implementation is fairyshine/seal-tools with 54 stars on GitHub. Confidence: high. Reproducibility: Limited.

Seal-Tools: Self-Instruct Tool Learning Dataset for Agent Tuning and Detailed Benchmark

Q: How reproducible is "Seal-Tools: Self-Instruct Tool Learning Dataset for Agent Tuning and Detailed Benchmark"?

Estimated time to first reproduction: a few days. Risk flags: no CI workflows detected; dependency manifest is missing. Start with fairyshine/seal-tools and validate the setup instructions in its README.

Q: What framework is used to implement "Seal-Tools: Self-Instruct Tool Learning Dataset for Agent Tuning and Detailed Benchmark"?

No specific framework was detected for the primary implementation.

Published: May 1, 2024

Best maintained implementation now

Evidence: Direct

Domain fit: AI-adjacent

Verified repos: 2

Top repo stars: 54

Paper appears method- or tooling-adjacent to AI workflows with partial ecosystem coverage.

Framework: none

Time to first repro: a few days

2 risk flags

Technical details

Canonical key: arxiv-2405.08355

Cache status: Fresh

Generated at: May 2, 2026, 5:32 AM

Artifact coverage: direct

HF provider: ok (token)

PWC source used: Yes

LLM status: not_generated

LLM model: n/a

LLM generated: Unknown

LLM content type: n/a

HF policy: hf-relevance-v27

implementation starting point

Benchmarks: thin evidence

Results & Benchmarks

Freshness tier: hot

Direct + Inferred Evidence

Benchmark evidence drill-down

5 findings

Audit each benchmark finding before selecting an implementation path. Evidence refs map to the disclosure section below.

Task	Dataset	Metric	Value	Source	Evidence refs
Agentic tool use	ChatGPT (gpt-3.5-turbo-0613)	Format ACC	96.16	paper-derived	No explicit refs
Agentic tool use	78.74	Format ACC	68.63	paper-derived	No explicit refs
Agentic tool use	GPT4 (gpt-4-0613)	Format ACC	97.12	paper-derived	No explicit refs
Agentic tool use	81.65	Format ACC	80.52	paper-derived	No explicit refs
Agentic tool use	89.27	Format ACC	72.37	paper-derived	No explicit refs

Seal-Tools: Self-Instruct Tool Learning Dataset for Agent Tuning and Detailed Benchmark focuses on agentic tool use.

Use This Implementation Because…

Confidence: high

fairyshine/seal-tools is the strongest maintained implementation based on ranking signals. License is declared (Apache-2.0).

Reproduction Risks

No CI workflows detected
Dependency manifest is missing
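
The two flags above can be re-checked against a local clone. The following is a minimal sketch, not the tool that produced this page: the function name `audit_repo` is hypothetical, the CI check assumes GitHub Actions workflows under `.github/workflows`, and the manifest list is the one named later on this page (requirements.txt, environment.yml, pyproject.toml, Dockerfile).

```python
from pathlib import Path

# Manifest file names taken from this page's own list; extend as needed.
MANIFEST_NAMES = ("requirements.txt", "environment.yml", "pyproject.toml", "Dockerfile")


def audit_repo(repo_dir):
    """Return the risk flags that apply to a local clone at repo_dir."""
    root = Path(repo_dir)

    # Assumption: "CI" means at least one GitHub Actions workflow file.
    workflows = root / ".github" / "workflows"
    has_ci = workflows.is_dir() and any(
        p.suffix in (".yml", ".yaml") for p in workflows.iterdir()
    )

    # Only the repository root is inspected for manifests.
    manifests = [name for name in MANIFEST_NAMES if (root / name).is_file()]

    flags = []
    if not has_ci:
        flags.append("No CI workflows detected")
    if not manifests:
        flags.append("Dependency manifest is missing")
    return flags
```

Calling `audit_repo("path/to/clone")` returns an empty list when neither flag applies. Other CI systems (for example a `.gitlab-ci.yml` file) are not covered by this sketch.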

Hardware Notes

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

Evidence disclosure

Evidence graph: 3 refs, 3 links.

Utility signals: depth 100/100, grounding 85/100, status high.

Implementation Comparison

Top 3 paths

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

fairyshine/seal-tools

best maintained

Maintenance: Stale

Confidence: High

Reproducibility: Limited

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 54
Last push: Nov 5, 2024 (544d ago)

Risk flags

No push in 12+ months
No CI pipeline detected
No tagged releases

MadeAgents/Hammer

alternative

Maintenance: Stale risk

Confidence: Low

Reproducibility: Moderate

Community adoption signal (115 stars)

Stars: 115
Last push: Jun 13, 2025 (324d ago)

Dependencies

Risk flags

No CI pipeline detected
No tagged releases
No Docker setup

fairyshine/Seal-Tools

alternative

Maintenance: Stale

Confidence: Medium

Reproducibility: Limited

Matched via arXiv identifier search · Strong overlap with paper title keywords

Stars: 54
Last push: Nov 5, 2024 (544d ago)

Risk flags

No push in 12+ months
No CI pipeline detected
No tagged releases

Best implementation now

fairyshine/seal-tools

Confidence: High

Reproducibility: Limited

The source code and dataset mentioned in the paper Seal-Tools: Self-Instruct Tool Learning Dataset for Agent Tuning and Detailed Benchmark.

Stars: 54

Forks: 5

Last push: Nov 5, 2024

License: Apache-2.0

Official implementation from Papers with Code

Repository link is mentioned in the paper metadata

Strong overlap with paper title keywords

Community adoption signal (54 stars)

License ✓

CI –

Deps –

Docker –

Selected fairyshine/seal-tools as the strongest maintained implementation for new work.
Repository activity is within the last 24 months.

Reproduction readiness

Major Work

Time to first repro: a few days

Last checked: May 2, 2026

Hardware requirements

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

No dependency manifest — manual reconstruction required

· fairyshine/seal-tools has no requirements.txt, environment.yml, pyproject.toml, or Dockerfile.
· You will need to reverse-engineer dependencies from import statements in the source code.
· Last push was 544 days ago.
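
One way to start reconstructing dependencies from import statements is to parse every Python file in a local clone and list the top-level modules it imports. This is a rough sketch under stated assumptions: the function names `collect_imports` and `third_party_candidates` are hypothetical, the repository is assumed to be Python source, and `sys.stdlib_module_names` is only available on Python 3.10 or later (on older versions the standard-library filter is skipped).

```python
import ast
import sys
from pathlib import Path


def collect_imports(repo_dir):
    """Return sorted top-level module names imported anywhere under repo_dir."""
    found = set()
    for path in Path(repo_dir).rglob("*.py"):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8", errors="ignore"))
        except (SyntaxError, ValueError):
            continue  # skip files that do not parse on this interpreter
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                for alias in node.names:
                    found.add(alias.name.split(".")[0])
            elif isinstance(node, ast.ImportFrom):
                # level > 0 marks a relative import, which is local code.
                if node.level == 0 and node.module:
                    found.add(node.module.split(".")[0])
    return sorted(found)


def third_party_candidates(repo_dir):
    """Drop standard-library and in-repo modules from the import list."""
    root = Path(repo_dir)
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    local = {p.stem for p in root.rglob("*.py")}
    local |= {p.name for p in root.rglob("*") if p.is_dir()}
    return [m for m in collect_imports(root) if m not in stdlib and m not in local]
```

The result is a list of candidates, not a finished requirements file: import names do not always match package names on PyPI (for example, `yaml` is provided by PyYAML), and versions still have to be chosen by hand, ideally ones released before the last push date.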

Additional implementations

Official

No additional official repositories detected.

Community

fairyshine/Seal-Tools
Confidence: Medium

The source code and dataset mentioned in the paper Seal-Tools: Self-Instruct Tool Learning Dataset for Agent Tuning and Detailed Benchmark.

Stars: 54

Last push: Nov 5, 2024

License: Apache-2.0