Q: What is the best open-source implementation of "DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs"?

The best-maintained implementation is maximecb/gym-miniworld, with 770 stars on GitHub. Confidence: high. Reproducibility: moderate.

Q: Are there pretrained models available for "DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs"?

Yes, one Hugging Face model was found: AdityaaXD/Multi-Agent_Reinforcement_Learning_Trading_System_Models, with 39 downloads.

DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs

Q: How reproducible is "DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs"?

Estimated time to first reproduction: a few days. Risk flag: the dependency manifest is missing. Start with maximecb/gym-miniworld and validate the setup instructions in its README.

Q: What framework is used to implement "DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs"?

The primary implementation uses PyTorch.

Published: Oct 1, 2020

Best maintained implementation now

Evidence: Direct

Domain fit: AI-core

Verified repos: 1

Top repo stars: 770

Core AI workload signals detected from paper context and implementation/artifact evidence.

Framework: PyTorch

Time to first repro: a few days

1 risk flag


Technical details

Canonical key: arxiv-2010.08891

Cache status: Stale (SWR served)

Generated at: Jun 19, 2026, 2:58 AM

Artifact coverage: direct

HF provider: ok (token)

PWC source used: Yes


HF policy: hf-relevance-v27

implementation starting point

Benchmarks: missing


Results & Benchmarks

Freshness tier: cold

Direct + Inferred Evidence

No concrete benchmark grounding is available yet. Treat the page as context or an implementation starting point only.

DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs presents an offline reinforcement learning method that, as its title states, works by solving derived non-parametric MDPs.

Use This Implementation Because…

Confidence: high

maximecb/gym-miniworld is the strongest maintained implementation based on ranking signals. CI workflows are present. License is declared (Apache-2.0).


Reproduction Risks

Dependency manifest is missing

Hardware Notes

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

Evidence disclosure

Evidence graph: 4 refs, 4 links.

Utility signals: depth 60/100, grounding 85/100, status medium.

Implementation Comparison

Top 1 paths

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

maximecb/gym-miniworld

best maintained

Maintenance: Recently updated

Confidence: High

Reproducibility: Moderate

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 770
Last push: Mar 2, 2026

CI Releases

Risk flags

No Docker setup
Dependency manifest missing

Best implementation now

maximecb/gym-miniworld

Confidence: High

Reproducibility: Moderate

Simple and easily configurable 3D FPS-game-like environments for reinforcement learning

Stars: 770

Forks: 144

Last push: Mar 2, 2026

License: Apache-2.0

Official implementation from Papers with Code

Repository link is mentioned in the paper metadata

Partial overlap with paper title keywords

Community adoption signal (770 stars)

License ✓

CI ✓

Deps –

Docker –

Selected maximecb/gym-miniworld as the strongest maintained implementation for new work.
Includes CI workflow signals.
Repository activity is within the last 24 months.

Reproduction readiness

Major Work

Time to first repro: a few days

Last checked: Jun 19, 2026


No dependency manifest — manual reconstruction required

· maximecb/gym-miniworld has no requirements.txt, environment.yml, pyproject.toml, or Dockerfile.
· You will need to reverse-engineer dependencies from import statements in the source code.
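
The manual reconstruction described above can be partly automated. The following is a minimal sketch, not part of the repository: it walks a local checkout, parses each Python file with the standard-library `ast` module, and lists imported top-level modules that are neither in the standard library nor local to the checkout. The function names `top_level_imports` and `third_party_candidates` are hypothetical, and the sketch assumes Python 3.10 or later (for `sys.stdlib_module_names`).

```python
import ast
import sys
from pathlib import Path


def top_level_imports(source: str) -> set[str]:
    """Return the top-level module names imported by a piece of Python source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 marks a relative import (e.g. "from . import x"); skip it
            if node.level == 0 and node.module:
                modules.add(node.module.split(".")[0])
    return modules


def third_party_candidates(repo_root: str) -> list[str]:
    """List imported modules that are neither stdlib nor local to the checkout."""
    root = Path(repo_root)
    found = set()
    for path in root.rglob("*.py"):
        try:
            found |= top_level_imports(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be read or parsed
    # Assumption: anything named like a top-level file or directory is local code
    local = {p.stem for p in root.glob("*.py")}
    local |= {p.name for p in root.iterdir() if p.is_dir()}
    return sorted(found - sys.stdlib_module_names - local)
```

Calling `third_party_candidates("path/to/checkout")` (a placeholder path) returns a sorted list of candidate module names. Treat the result as a starting point only: import names do not always match package names on PyPI (for example, `cv2` is installed as `opencv-python`), and the scan recovers no version constraints.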


No benchmark numbers could be verified. You will not be able to validate reproduction correctness against published numbers.