Guardrails Beat Guidance: A Large-Scale Study of Rules, Skills, and Persistent Configuration for Coding Agents

Q: What is the best open-source implementation of "Guardrails Beat Guidance: A Large-Scale Study of Rules, Skills, and Persistent Configuration for Coding Agents"?

The best-maintained implementation is yzhao062/agent-style, with 522 stars on GitHub. Confidence: medium. Reproducibility: limited.

Q: How reproducible is "Guardrails Beat Guidance: A Large-Scale Study of Rules, Skills, and Persistent Configuration for Coding Agents"?

Estimated time to first reproduction: a few days. Risk flags: license metadata missing, dependency manifest missing. Start with yzhao062/agent-style and validate the setup instructions in its README.

Xing Zhang, Guanghui Wang, Yanwei Cui, Wei Qiu, Ziyuan Li, Bing Zhu, Peiyang He

Published: Apr 13, 2026

Best maintained implementation now

Evidence: Direct

Domain fit: AI-adjacent

Verified repos: 1

Top repo stars: 522

The paper appears to be method- or tooling-adjacent to AI workflows, with partial ecosystem coverage.

Time to first repro: a few days

2 risk flags

Random rules improve a coding agent's task performance as much as expert-curated ones (both +13.8 pp on a discriminative subset of SWE-bench Verified), and in our data every individually beneficial rule is a negative constraint ("do not refactor unrelated code"), while every individually harmful one is a positive directive ("follow code style"). We arrive at these findings through the first large-scale controlled study of agent rule files (CLAUDE.md, .cursorrules, and the broader family of agent skills, plugin manifests, and persona definitions): we scrape 679 rule files (25,532 rules) from GitHub and conduct over 5,000 agent runs of Claude Code with Claude Opus 4.6 on SWE-bench Verified. Three patterns emerge. (i) Rule polarity cleanly separates beneficial from harmful rules; we read this through the lens of potential-based reward shaping (PBRS). (ii) Performance gains are largely content-independent: random, shuffled, mismatched-domain, and unconverted-format rule files all match curated rules, pointing to a context priming mechanism. (iii) Individual rules often appear harmful in isolation yet do not visibly accumulate damage in ensemble: pass rates remain stable across rule counts from 0 to 50. These findings expose a hidden reliability risk in the rapidly growing ecosystem of community-authored rules and skills, and they yield a clear principle for safer agent configuration: constrain what agents must not do, rather than prescribing what they should.
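The polarity distinction in the abstract (negative constraints versus positive directives) can be illustrated with a rough sketch. This is a keyword heuristic written for this page, not the paper's classification procedure; the cue list and the names `NEGATIVE_CUES`, `rule_polarity`, and `pp_delta` are all hypothetical.

```python
import re

# Hypothetical cue list for prohibitions; an assumption, not taken from the paper.
NEGATIVE_CUES = re.compile(
    r"\b(do not|don't|never|avoid|must not|should not|shouldn't|cannot|can't)\b",
    re.IGNORECASE,
)


def rule_polarity(rule: str) -> str:
    """Label a rule 'negative' if it contains a prohibition cue, else 'positive'."""
    return "negative" if NEGATIVE_CUES.search(rule) else "positive"


def pp_delta(baseline_rate: float, treated_rate: float) -> float:
    """Difference between two pass rates (fractions in [0, 1]) in percentage points."""
    return round((treated_rate - baseline_rate) * 100, 1)


# The two example rules quoted in the abstract.
examples = ["do not refactor unrelated code", "follow code style"]
labels = {rule: rule_polarity(rule) for rule in examples}
```

Under this heuristic the first quoted rule is labelled negative and the second positive. Real rule files contain mixed or implicit phrasing that a keyword match will misclassify, so treat the output as a first pass only.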

Technical details

Canonical key: arxiv-2604.11088

Generated at: Jun 22, 2026, 4:37 AM

Artifact coverage: direct

implementation starting point

Benchmarks: missing

Results & Benchmarks

Freshness tier: hot

Direct + Inferred Evidence

No concrete benchmark grounding is available yet. Treat the page as context or an implementation starting point only.

Use This Implementation Because…

Confidence: medium

yzhao062/agent-style is the best available implementation candidate based on ranking signals, but recommendation confidence is not yet high. CI workflows are present.

Reproduction Risks

License metadata missing
Dependency manifest is missing
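Both flags can be re-checked against a local clone before committing time to a reproduction. The sketch below is a minimal check, assuming the repository has already been cloned to `repo_root`; the manifest names are the ones this page lists, while the license file names are common conventions assumed here, and `risk_flags` is a hypothetical helper.

```python
from pathlib import Path

# Manifest names as listed on this page; license names are assumed conventions.
MANIFEST_NAMES = ("requirements.txt", "environment.yml", "pyproject.toml", "Dockerfile")
LICENSE_NAMES = ("LICENSE", "LICENSE.md", "LICENSE.txt", "COPYING")


def risk_flags(repo_root: str) -> list[str]:
    """Return the risk flags that apply to the top level of a local clone."""
    root = Path(repo_root)
    flags = []
    if not any((root / name).is_file() for name in LICENSE_NAMES):
        flags.append("License metadata missing")
    if not any((root / name).is_file() for name in MANIFEST_NAMES):
        flags.append("Dependency manifest is missing")
    return flags
```

The check only looks at the repository root, so a manifest kept in a subdirectory would still be reported as missing.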

Hardware Notes

Based on current guidance, expect multi-day setup and compute for a meaningful reproduction.

Evidence disclosure

Evidence graph: 3 refs, 3 links.

Utility signals: depth 65/100, grounding 75/100, status medium.

Implementation Comparison

Top 1 path

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

yzhao062/agent-style

best maintained

Maintenance: Active

Confidence: Medium

Reproducibility: Limited

Matched via arXiv identifier search · Strong overlap with paper title keywords

Stars: 522
Last push: Jun 13, 2026

CI · Releases

Risk flags

No Docker setup
Dependency manifest missing

Best implementation now

yzhao062/agent-style

Confidence: Medium

Reproducibility: Limited

21 writing rules for AI coding and writing agents. Drop-in for Claude Code, Codex, Copilot, Cursor, and Aider, so their output reads like a tech pro.

Stars: 522

Forks: 29

Last push: Jun 13, 2026

Matched via arXiv identifier search

Strong overlap with paper title keywords

Community adoption signal (522 stars)

License: not detected

CI: present

Deps: not detected

Docker: not detected

Selected yzhao062/agent-style as the strongest maintained implementation for new work.
Includes CI workflow signals.
Repository activity is within the last 24 months.

Reproduction readiness

Major Work

Time to first repro: a few days

Last checked: Jun 22, 2026

No dependency manifest — manual reconstruction required

· yzhao062/agent-style has no requirements.txt, environment.yml, pyproject.toml, or Dockerfile.
· You will need to reverse-engineer dependencies from import statements in the source code.

No benchmark numbers could be verified. You will not be able to validate reproduction correctness against published numbers.