What is the best open-source implementation of "Long-context LLMs Struggle with Long In-context Learning"?

The best-maintained implementation is tiger-ai-lab/longiclbench, with 114 stars on GitHub. Confidence: high. Reproducibility: Moderate.

How reproducible is "Long-context LLMs Struggle with Long In-context Learning"?

Estimated time to first reproduction: a few hours. Risk flags: No CI workflows detected. Start with tiger-ai-lab/longiclbench and validate setup instructions in README.

Are there pretrained models available for "Long-context LLMs Struggle with Long In-context Learning"?

Yes, 2 Hugging Face models found. The top result is KnutJaegersberg/2-bit-LLMs with 754 downloads.

What framework is used to implement "Long-context LLMs Struggle with Long In-context Learning"?

The primary implementation uses PyTorch.

Long-context LLMs Struggle with Long In-context Learning

Published: Apr 1, 2024

Best maintained implementation now

Evidence: Direct

Domain fit: AI-adjacent

Verified repos: 2

Top repo stars: 114

Paper appears method- or tooling-adjacent to AI workflows with partial ecosystem coverage.

Framework: PyTorch

Time to first repro: a few hours

1 risk flag

arXiv PDF

Technical details

Canonical key: arxiv-2404.02060

Cache status: Stale (SWR served)

Generated at: Jun 18, 2026, 1:53 AM

Artifact coverage: direct

HF provider: ok (token)

PWC source used: Yes

HF policy: hf-relevance-v27

implementation starting point

Benchmarks: thin evidence

Results & Benchmarks

Freshness tier: cold

Direct + Inferred Evidence

Some benchmark signal exists in the extracted evidence, but it is not structured strongly enough yet for a confident benchmark decision.

The paper's primary contribution is a benchmark for long in-context learning, released as LongICLBench.

Use This Implementation Because…

Confidence: high

tiger-ai-lab/longiclbench is the strongest maintained implementation based on ranking signals. License is declared (MIT). Dependency/environment manifests are present.

Reproduction Risks

No CI workflows detected

Evidence disclosure

Evidence graph: 4 refs, 4 links.

Utility signals: depth 90/100, grounding 95/100, status high.

Implementation Comparison

Top 3 paths

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

tiger-ai-lab/longiclbench

best maintained

Maintenance: Stale

Confidence: High

Reproducibility: Moderate

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 114
Last push: Feb 20, 2025 (485d ago)

Dependencies

Risk flags

No push in 12+ months
No CI pipeline detected
No tagged releases
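
The "No push in 12+ months" flag can be checked locally with a plain date comparison. The following is a minimal sketch, assuming a 365-day cutoff (the listing does not state the exact threshold it uses); the function names `days_since_push` and `is_stale` are hypothetical and not part of any tool mentioned here.

```python
from datetime import date


def days_since_push(last_push: date, checked_on: date) -> int:
    """Whole days between the last push and the day the check ran."""
    return (checked_on - last_push).days


def is_stale(last_push: date, checked_on: date, threshold_days: int = 365) -> bool:
    """True when there has been no push for more than threshold_days.

    The 365-day default is an assumption standing in for "12+ months".
    """
    return days_since_push(last_push, checked_on) > threshold_days


if __name__ == "__main__":
    last_push = date(2025, 2, 20)   # "Last push" value from the listing
    checked_on = date(2026, 6, 18)  # "Last checked" value from the listing
    print("stale:", is_stale(last_push, checked_on))
```

A day count computed this way can differ slightly from the one shown in the listing, depending on when the listing's own check actually ran.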

hhhuang/cag

alternative

Maintenance: Stale

Confidence: Low

Reproducibility: Moderate

Community adoption signal (1496 stars)

Stars: 1,496
Last push: May 26, 2025 (390d ago)

Dockerfile · Dependencies

Risk flags

No push in 12+ months
No CI pipeline detected
No tagged releases

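TIGER-AI-Lab/LongICLBench (same repository as tiger-ai-lab/longiclbench; GitHub owner and repository names are case-insensitive)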

alternative

Maintenance: Stale

Confidence: Medium

Reproducibility: Moderate

Matched via arXiv identifier search · Strong overlap with paper title keywords

Stars: 114
Last push: Feb 20, 2025 (485d ago)

Dependencies

Risk flags

No push in 12+ months
No CI pipeline detected
No tagged releases
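
Two of the three paths above differ only in letter case, and GitHub treats owner and repository names case-insensitively, so they resolve to the same repository. A minimal sketch of deduplicating such a list before comparing candidates follows; `normalize_slug` and `dedupe_repos` are hypothetical helper names used only for illustration.

```python
def normalize_slug(slug: str) -> str:
    """Lower-case an "owner/repo" slug so case variants compare equal."""
    return slug.strip().lower()


def dedupe_repos(slugs: list[str]) -> list[str]:
    """Keep the first spelling of each repository, preserving input order."""
    seen: set[str] = set()
    unique: list[str] = []
    for slug in slugs:
        key = normalize_slug(slug)
        if key not in seen:
            seen.add(key)
            unique.append(slug)
    return unique


if __name__ == "__main__":
    listed = [
        "tiger-ai-lab/longiclbench",
        "hhhuang/cag",
        "TIGER-AI-Lab/LongICLBench",
    ]
    print(dedupe_repos(listed))
```

After deduplication the list holds two distinct repositories, which would be consistent with the "Verified repos: 2" figure reported earlier.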

Best implementation now

tiger-ai-lab/longiclbench

Confidence: High

Reproducibility: Moderate

Code and Data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR2025]

Stars: 114

Forks: 8

Last push: Feb 20, 2025

License: MIT

Official implementation from Papers with Code

Repository link is mentioned in the paper metadata

Strong overlap with paper title keywords

Community adoption signal (114 stars)

License ✓

CI –

Deps ✓

Docker –

Selected tiger-ai-lab/longiclbench as the strongest maintained implementation for new work.
Includes dependency/environment manifest signals.
Repository activity is within the last 24 months.

Reproduction readiness

Setup Required

Time to first repro: a few hours

Last checked: Jun 18, 2026

Dependencies pinned, manual setup needed

· tiger-ai-lab/longiclbench has requirements.txt but requires manual environment setup.
· Last push was 485 days ago — expect possible dependency version conflicts.
· No Dockerfile — you will set up the environment manually.
· No CI pipeline — test coverage is unknown.
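
Because the environment has to be set up by hand and version conflicts are possible, installing into an isolated virtual environment limits the impact on other projects. The sketch below uses only the Python standard library; the helper names and the `.venv` directory are assumptions for illustration, not something the repository prescribes.

```python
import sys
import venv
from pathlib import Path


def venv_python(env_dir: Path) -> Path:
    """Path to the interpreter inside a virtual environment."""
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def pip_install_command(env_dir: Path, requirements: Path) -> list[str]:
    """Command that installs a requirements file with the environment's own pip."""
    return [
        str(venv_python(env_dir)),
        "-m", "pip", "install",
        "-r", str(requirements),
    ]


def create_env(env_dir: Path) -> None:
    """Create the virtual environment, bootstrapping pip into it."""
    venv.EnvBuilder(with_pip=True).create(env_dir)
```

From the repository root, one would call `create_env(Path(".venv"))` and then pass `pip_install_command(Path(".venv"), Path("requirements.txt"))` to `subprocess.run(..., check=True)`. If the install fails on an outdated pin, the environment can be deleted and recreated without touching the system Python.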

Quick start

git clone https://github.com/tiger-ai-lab/longiclbench.git
cd longiclbench
pip install -r requirements.txt
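
After installation, a quick check that the expected packages actually resolved can save time before launching a long run. This is a minimal sketch using the standard-library `importlib.metadata`; it checks only `torch` (PyTorch's distribution name), since PyTorch is the one framework named in this listing, and the helper names are hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report(packages: list[str]) -> dict[str, Optional[str]]:
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    # Extend this list with other entries from requirements.txt as needed.
    for name, version in report(["torch"]).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Run it with the same interpreter that performed the install, otherwise it reports on a different environment.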

Additional implementations

Official

No additional official repositories detected.

Community

TIGER-AI-Lab/LongICLBench
Confidence: Medium

Code and Data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR2025]

Stars: 114

Last push: Feb 20, 2025

License: MIT