What is the best open-source implementation of "Bridging the Gap Between Value and Policy Based Reinforcement Learning"?

The best-maintained implementation is tensorflow/models, which has 77,663 stars on GitHub. Confidence: High. Reproducibility: Moderate.

What framework is used to implement "Bridging the Gap Between Value and Policy Based Reinforcement Learning"?

The primary implementation uses TensorFlow (tf).

Bridging the Gap Between Value and Policy Based Reinforcement Learning

How reproducible is "Bridging the Gap Between Value and Policy Based Reinforcement Learning"?

Estimated time to first reproduction: a few days. Risk flags: Dependency manifest is missing. Start with tensorflow/models and validate setup instructions in README.

Published: Feb 1, 2017

Best maintained implementation now

Evidence: Direct

Domain fit: AI-core

Verified repos: 1

Top repo stars: 77,663

Core AI workload signals detected from paper context and implementation/artifact evidence.

Framework: TensorFlow (tf)

Time to first repro: a few days

1 risk flag

Technical details

Canonical key: arxiv-1702.08892

Cache status: Fresh

Generated at: Jun 17, 2026, 9:26 AM

Artifact coverage: direct

HF provider: ok (token)

PWC source used: Yes

LLM status: not_generated

LLM model: n/a

LLM generated: Unknown

LLM content type: n/a

HF policy: hf-relevance-v27

implementation starting point

Benchmarks: missing

Results & Benchmarks

Freshness tier: cold

Direct + Inferred Evidence

No concrete benchmark grounding is available yet. Treat the page as context or an implementation starting point only.

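Bridging the Gap Between Value and Policy Based Reinforcement Learning presents Path Consistency Learning (PCL), a reinforcement learning method built on a connection between softmax value consistency and policy optimality under entropy regularization.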

Use This Implementation Because…

Confidence: high

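tensorflow/models is the strongest maintained implementation based on ranking signals. CI workflows are present. A license is detected, but its identifier is reported as NOASSERTION, meaning it was not matched to a known license; check the terms in the repository directly.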

Reproduction Risks

Dependency manifest is missing

Hardware Notes

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

Evidence disclosure

Evidence graph: 3 refs, 3 links.

Utility signals: depth 60/100, grounding 75/100, status medium.

Implementation Comparison

Top 2 paths

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

tensorflow/models

best maintained

Maintenance: Active

Confidence: High

Reproducibility: Moderate

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 77,663
Last push: Jun 11, 2026 (7d ago)

CI · Releases

Risk flags

No Docker setup
Dependency manifest missing

UBCMOCCA/DeepReinforcementLearningPapers

alternative

Maintenance: Stale

Confidence: Low

Reproducibility: Limited

Matched via arXiv identifier search · Repository appears stale (>24 months since last push)

Stars: 10
Last push: Nov 16, 2018 (2771d ago)

Risk flags

No push in 12+ months
No CI pipeline detected
No tagged releases

Best implementation now

tensorflow/models

Confidence: High

Reproducibility: Moderate

Models and examples built with TensorFlow

Stars: 77,663

Forks: 45,027

Last push: Jun 11, 2026

License: NOASSERTION

Official implementation from Papers with Code

Repository link is mentioned in the paper metadata

Community adoption signal (77,663 stars)

License ✓

CI ✓

Deps –

Docker –

Selected tensorflow/models as the strongest maintained implementation for new work.
Includes CI workflow signals.
Repository activity is within the last 24 months.

Reproduction readiness

Major Work

Time to first repro: a few days

Last checked: Jun 17, 2026

No dependency manifest — manual reconstruction required

· tensorflow/models has no requirements.txt, environment.yml, pyproject.toml, or Dockerfile.
· You will need to reverse-engineer dependencies from import statements in the source code.
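
One way to start that reconstruction is to scan the source tree for import statements. The sketch below is illustrative only and is not part of tensorflow/models: the function names top_level_imports and third_party_imports are hypothetical, and it assumes Python 3.10 or newer so that sys.stdlib_module_names is available for filtering out the standard library.

```python
import ast
import sys
from pathlib import Path


def top_level_imports(source: str) -> set[str]:
    """Return the top-level module names imported by one Python source string."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b.c" depends on the top-level package "a".
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # level > 0 marks a relative import ("from . import x"); skip it.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


def third_party_imports(root: str) -> list[str]:
    """Collect imported names under root that are neither stdlib nor local."""
    root_path = Path(root)
    found: set[str] = set()
    for path in root_path.rglob("*.py"):
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
            found |= top_level_imports(text)
        except (SyntaxError, ValueError):
            # Files that do not parse (for example Python 2 code) are skipped.
            continue
    # Empty fallback on Python < 3.10, where stdlib names are not filtered.
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    # Treat any name matching a file or directory in the tree as local code.
    local = {p.stem for p in root_path.rglob("*.py")}
    local |= {p.name for p in root_path.rglob("*") if p.is_dir()}
    return sorted(found - set(stdlib) - local)


if __name__ == "__main__":
    # Usage: python scan_imports.py path/to/checkout/subdirectory
    for name in third_party_imports(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(name)
```

The output is a list of import names, not a dependency manifest. Import names do not always match package names on PyPI, and the scan cannot recover version pins, so each entry still has to be mapped to a package and version by hand.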

No benchmark numbers could be verified. You will not be able to validate reproduction correctness against published numbers.