Q: What framework is used to implement "A Closer Look at Advantage-Filtered Behavioral Cloning in High-Noise Datasets"?

The primary implementation uses PyTorch.

A Closer Look at Advantage-Filtered Behavioral Cloning in High-Noise Datasets

Q: How reproducible is "A Closer Look at Advantage-Filtered Behavioral Cloning in High-Noise Datasets"?

Estimated time to first reproduction: a few days. Risk flags: Adjacent implementations are not paper-verified. No maintained paper-verified implementation was found; start with the closest related repositories below.

Jake Grigsby, Yanjun Qi

Published: Oct 10, 2021

Historical official implementation (not recommended for new builds)

Evidence: Historical

Domain fit: AI-adjacent

Verified repos: 3

Top repo stars: 105

Paper appears method- or tooling-adjacent to AI workflows with partial ecosystem coverage.

Framework: PyTorch

Time to first repro: a few days

1 risk flag

arXiv PDF

Recent Offline Reinforcement Learning methods have succeeded in learning high-performance policies from fixed datasets of experience. A particularly effective approach learns to first identify and then mimic optimal decision-making strategies. Our work evaluates this method's ability to scale to vast datasets consisting almost entirely of sub-optimal noise. A thorough investigation on a custom benchmark helps identify several key challenges involved in learning from high-noise datasets. We re-purpose prioritized experience sampling to locate expert-level demonstrations among millions of low-performance samples. This modification enables offline agents to learn state-of-the-art policies in benchmark tasks using datasets where expert actions are outnumbered nearly 65:1.
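The idea in the abstract, filtering behavioral cloning by advantage and sampling high-advantage transitions more often, can be illustrated with a small standard-library sketch. Everything here is an assumption made for illustration: the synthetic advantage values, the priority rule (clipped advantage plus a small floor `eps`), and the function names are hypothetical and are not taken from the paper or from any of the listed repositories.

```python
import random

def binary_filter(advantage):
    """Weight for the cloning loss: 1.0 if the action looks better than average, else 0.0."""
    return 1.0 if advantage > 0.0 else 0.0

def sampling_probs(advantages, eps=1e-3):
    """Assumed priority rule: clipped advantage plus a small floor, normalized to sum to 1."""
    priorities = [max(a, 0.0) + eps for a in advantages]
    total = sum(priorities)
    return [p / total for p in priorities]

def expert_fraction(is_expert, probs, batch_size, rng):
    """Draw one batch with the given probabilities and report the share of expert samples."""
    indices = rng.choices(range(len(is_expert)), weights=probs, k=batch_size)
    return sum(is_expert[i] for i in indices) / batch_size

rng = random.Random(0)

# Synthetic dataset with a 65:1 noise-to-expert ratio (values are made up).
n_expert, n_noise = 100, 6500
is_expert = [1] * n_expert + [0] * n_noise
advantages = (
    [rng.uniform(0.5, 1.0) for _ in range(n_expert)]      # expert actions: positive advantage
    + [rng.uniform(-1.0, -0.1) for _ in range(n_noise)]   # noise actions: negative advantage
)

uniform = [1.0 / len(advantages)] * len(advantages)
prioritized = sampling_probs(advantages)

uniform_frac = expert_fraction(is_expert, uniform, 256, rng)
prioritized_frac = expert_fraction(is_expert, prioritized, 256, rng)

print(f"expert share, uniform sampling:     {uniform_frac:.3f}")
print(f"expert share, prioritized sampling: {prioritized_frac:.3f}")
```

Under uniform sampling only about 1 in 66 batch elements is an expert action, so a binary filter zeroes out nearly the whole batch. Weighting the sampler by advantage is one way to keep most of each batch useful; in practice the advantages would come from a learned critic rather than being known in advance.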

Technical details

Canonical key: arxiv-2110.04698

Cache status: Fresh

Generated at: Jun 12, 2026, 4:38 AM

Artifact coverage: sparse

HF provider: ok (token)

PWC source used: Yes

LLM status: not_generated

LLM model: n/a

LLM generated: Unknown

LLM content type: n/a

HF policy: hf-relevance-v27

implementation starting point

Benchmarks: missing


Results & Benchmarks

Freshness tier: cold

Direct + Inferred Evidence

No concrete benchmark grounding is available yet. Treat the page as context or an implementation starting point only.


Use This Implementation Because…

Confidence: low

jakegrigsby/super_sac is the closest adjacent implementation (listed as an official implementation on Papers with Code). It is not paper-verified; validate the algorithm and evaluation setup against the paper before trusting reported metrics. Community adoption signal: 42 GitHub stars.


Reproduction Risks

Adjacent implementations are not paper-verified
Recommended repository is adjacent and not paper-verified.
Adjacent implementation match confidence is low.
No direct maintained implementation is currently verified.

Hardware Notes

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

Evidence disclosure

Evidence graph: 3 refs, 3 links.

Utility signals: depth 65/100, grounding 75/100, status medium.

Implementation Comparison

Top 3 paths

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

jakegrigsby/deep_control

historical official

Maintenance: Stale

Confidence: High

Reproducibility: Limited

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 105
Last push: Dec 31, 2021 (1625d ago)


Risk flags

No push in 12+ months
No CI pipeline detected
No tagged releases

jakegrigsby/super_sac

alternative

Maintenance: Stale

Confidence: High

Reproducibility: Limited

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 42
Last push: Mar 15, 2024 (819d ago)


Risk flags

No push in 12+ months
No CI pipeline detected
No tagged releases

jakegrigsby/cc-afbc

alternative

Maintenance: Stale

Confidence: High

Reproducibility: Limited

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 4
Last push: Dec 5, 2021 (1650d ago)

Risk flags

No push in 12+ months
No CI pipeline detected
No tagged releases

Best implementation now

Only a historical official implementation is available.

Use with caution for new projects; verify against current tooling and maintained community alternatives.

jakegrigsby/deep_control

Historical official

Stars: 105

Last push: Dec 31, 2021

Only historical official repository was found: jakegrigsby/deep_control.
No maintained paper-verified implementation met reliability thresholds.

Reproduction readiness

Setup Required

Time to first repro: a few days

Last checked: Jun 12, 2026

Hardware requirements

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

Dependencies pinned, manual setup needed

· jakegrigsby/deep_control has requirements.txt but requires manual environment setup.
· Last push was 1625 days ago — expect possible dependency version conflicts.
· No Dockerfile — you will set up the environment manually.
· No CI pipeline — test coverage is unknown.


Quick start

git clone https://github.com/jakegrigsby/deep_control.git
cd deep_control
pip install -r requirements.txt
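Because the repository has not been pushed to in years and has no Dockerfile, it may help to confirm what the environment contains after installation. The following is a minimal sketch, not part of the repository: the module names `torch`, `numpy`, and `gym` are assumptions about what `requirements.txt` pulls in and should be replaced with the packages it actually lists.

```python
import importlib.util
import sys

def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None

def environment_report(module_names):
    """Collect the Python version and an installed/missing flag per module."""
    report = {"python": "%d.%d" % sys.version_info[:2]}
    for name in module_names:
        report[name] = is_installed(name)
    return report

# Hypothetical module list; adjust it to match requirements.txt.
report = environment_report(["torch", "numpy", "gym"])
for key, value in report.items():
    print(f"{key}: {value}")
```

A `False` entry points to a failed or skipped install, which is worth resolving before spending compute on training runs.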

No benchmark numbers could be verified. You will not be able to validate reproduction correctness against published numbers.

Closest related implementations

These are not paper-verified. Use them as reference points when no direct implementation is available.

jakegrigsby/super_sac

Adjacent

Confidence: Low

Stars: 42

Official implementation from Papers with Code

Additional implementations

Official

jakegrigsby/super_sac
Confidence: High

A general model-free off-policy actor-critic implementation. Continuous and Discrete Soft Actor-Critic with multimodal observations, data augmentation, offline learning and behavioral cloning.

Stars: 42

Forks: 5

Last push: Mar 15, 2024

jakegrigsby/cc-afbc
Confidence: High

Advantage-Filtered Behavioral Cloning for Offline Continuous Control

Stars: 4

Forks: 1

Last push: Dec 5, 2021

License: MIT