How reproducible is "GrepSeek: Training Search Agents for Direct Corpus Interaction"?

Estimated time to first reproduction: a few days. Risk flag: adjacent implementations are not paper-verified. No maintained, paper-verified implementation was found; start with the closest related repositories below.

Are there pretrained models available for "GrepSeek: Training Search Agents for Direct Corpus Interaction"?

Yes, one Hugging Face model was found: alireza7/GrepSeek-Qwen3.5-9B-GRPO, with 148 downloads.

GrepSeek: Training Search Agents for Direct Corpus Interaction

Alireza Salemi, Chang Zeng, Atharva Nijasure, Jui-Hui Chung, Razieh Rahimi, Fernando Diaz, Hamed Zamani

Published: May 28, 2026

No direct paper-linked artifacts found; showing strongest related artifacts

Evidence: Curated Related

Domain fit: AI-core

Verified repos: 0

Core AI workload signals detected from paper context and implementation/artifact evidence.

Time to first repro: a few days

1 risk flag

Large Language Model (LLM) search agents have shown strong promise for knowledge-intensive language tasks through multiple rounds of reasoning and information retrieval. Most existing systems access information using a retriever that takes a keyword or natural language query and returns a ranked list of documents using an index of pre-computed document representations. In this work, we explore a complementary perspective in which the search agent treats the corpus itself as the search environment and finds evidence by issuing executable shell commands. We introduce GrepSeek, an optimized direct corpus interaction (DCI) search agent that trains a compact search agent to find, filter, and compose evidence from large text corpora. To address the instability of learning behavior directly with reinforcement learning on large corpora, we propose a two-stage training pipeline. First, we construct a cold-start dataset using an answer-aware Tutor and answer-blind Planner to generate verified, causally grounded search trajectories. Second, we refine the initialized policy with Group Relative Policy Optimization (GRPO), allowing the agent to improve its task-oriented search behavior through direct interaction with the corpus. To make DCI practical at scale, we further use a semantics-preserving sharded-parallel execution engine that accelerates shell-based retrieval by up to $7.6\times$ while preserving byte-exact equivalence with sequential execution of the shell command. Experiments across seven open-domain question answering benchmarks show that GrepSeek achieves the strongest overall token-level $F_1$ and Exact Match. Our analysis also highlights the limitations of purely lexical interaction on queries with substantial surface-form variation, suggesting DCI as a practical and competitive method for search agents that can complement existing retrieval paradigms in the real world.
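The abstract's claim of byte-exact equivalence between sharded-parallel and sequential execution can be illustrated with a minimal sketch. This is not the paper's engine: it assumes a pure-Python regex filter as a stand-in for a grep-style shell command, and the names `grep_lines`, `split_shards`, and `sharded_grep` are hypothetical. It only shows why a line-wise filter over contiguous shards, concatenated in shard order, reproduces the sequential output.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def grep_lines(lines, pattern):
    """Sequential reference: keep the lines that match the regex, in order."""
    rx = re.compile(pattern)
    return [ln for ln in lines if rx.search(ln)]


def split_shards(lines, n_shards):
    """Cut the corpus into contiguous shards so shard order equals corpus order."""
    size = max(1, -(-len(lines) // n_shards))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def sharded_grep(lines, pattern, n_shards=4):
    """Filter each shard in parallel, then concatenate results in shard order."""
    shards = split_shards(lines, n_shards)
    with ThreadPoolExecutor(max_workers=n_shards) as pool:
        # Executor.map returns results in input order, preserving line order.
        parts = list(pool.map(lambda shard: grep_lines(shard, pattern), shards))
    return [ln for part in parts for ln in part]


# Toy corpus (made-up lines, stored as bytes so equality is byte-exact).
corpus = [
    b"Marie Curie won the Nobel Prize in Physics in 1903.\n",
    b"The Eiffel Tower is located in Paris.\n",
    b"Curie later won the Nobel Prize in Chemistry.\n",
    b"Mount Everest is the highest mountain on Earth.\n",
    b"The Nobel Prizes are awarded in Stockholm and Oslo.\n",
]

sequential = b"".join(grep_lines(corpus, rb"Nobel"))
parallel = b"".join(sharded_grep(corpus, rb"Nobel", n_shards=3))
print(parallel == sequential)
```

The equivalence holds here because each output line depends on exactly one input line. Commands whose output depends on the whole stream (for example `sort`, `head`, or `wc`) would need a merge step after the shards finish; how the paper's engine handles such commands is not stated in the abstract.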

Technical details

Canonical key: arxiv-2605.29307

Cache status: Fresh

Generated at: Jun 19, 2026, 9:14 PM

Artifact coverage: curated_related

Benchmarks: thin evidence

Results & Benchmarks

Freshness tier: hot

Direct + Inferred Evidence

Some benchmark signal exists in the extracted evidence, but it is not structured strongly enough yet for a confident benchmark decision.

Implementation Evidence Summary

Confidence: medium

ventr1c/Awesome-RL-based-Agentic-Search-Papers is the closest maintained adjacent repository; it matches the contextual method/domain keyword "reinforcement learning". It is not paper-verified, so validate its algorithm and evaluation setup against the paper before trusting any reported metrics. Community adoption signal: 263 GitHub stars.
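When checking an adjacent reinforcement-learning codebase against the paper, one concrete thing to compare is the group-relative advantage at the core of GRPO. The sketch below shows the commonly used form, in which each rollout's reward is standardized against the other rollouts sampled for the same question. It is an illustration under stated assumptions, not the paper's training code: the function name, the use of the population standard deviation, and the `eps` value are all assumptions, and the paper's exact reward and normalization may differ.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Standardize each rollout's reward against its own group of rollouts.

    rewards: scalar rewards for several rollouts of the same question.
    eps: small constant (an assumed value) that avoids division by zero
         when every rollout in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four hypothetical rollouts for one question, rewarded 1.0 if correct.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)
```

Rollouts that score above the group mean receive positive advantages and those below receive negative ones; a group in which every rollout gets the same reward yields all-zero advantages and therefore contributes no learning signal.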

Reproduction Risks

Adjacent implementations are not paper-verified: the recommended repository is related to the paper but has not been verified against it.

Hardware Notes

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

Evidence disclosure

Evidence graph: 4 refs, 4 links.

Utility signals: depth 100/100, grounding 95/100, status high.

Implementation Comparison

Top 1 path

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

alirezasalemi7/grepseek

Alternative

Maintenance: Active

Confidence: Low

Reproducibility: Limited

Strong overlap with paper title keywords · Community adoption signal (44 stars)

Stars: 44
Last push: May 29, 2026 (22d ago)

Risk flags

No CI pipeline detected
No tagged releases
No Docker setup

Implementation Status

No verified maintained repo

There is no verified maintained implementation yet. Use this baseline plan to decide whether to prototype now or defer.

No maintained paper-verified implementation was found; start with the closest related repositories below.
Compare repo methods against the paper equations/algorithm before trusting metrics.
Create a minimal baseline implementation from the paper and use adjacent repos as references.
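For the metric-checking step in the plan above, a small reference scorer helps when comparing numbers across repositories. The abstract reports token-level $F_1$ and Exact Match; the sketch below implements the widely used SQuAD-style versions of both. The normalization (lowercasing, stripping punctuation and English articles) is an assumption here, since the abstract does not specify it, and the function names are illustrative.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction, gold):
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The Eiffel Tower.", "eiffel tower"))
print(token_f1("Paris, France", "Paris"))
```

When a question has several gold answers, the usual convention is to take the maximum score over them; whether a given repository does this is worth confirming before comparing its numbers with the paper's.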

Best available artifact: alireza7/GrepSeek-Qwen3.5-9B-GRPO

Reproduction readiness

No verified repo

Last checked: Jun 19, 2026

No verified implementation available

No maintained, paper-verified repository has been identified for this paper. Check the adjacent implementations below or the Hugging Face artifact listed above.

Closest related implementations

These are not paper-verified. Use them as reference points when no direct implementation is available.

ventr1c/Awesome-RL-based-Agentic-Search-Papers

Adjacent

Confidence: Medium

Stars: 263

Matches contextual method/domain keyword: reinforcement learning