What is the best open-source implementation of "RLDX-1 Technical Report"?

The best maintained implementation is RLWRLD/RLDX-1 with 122 stars on GitHub. Confidence: medium. Reproducibility: Strong.

How reproducible is "RLDX-1 Technical Report"?

Estimated time to first reproduction: a few hours. No risk flags identified. Start with RLWRLD/RLDX-1 and validate the setup instructions in its README.

RLDX-1 Technical Report

Dongyoung Kim, Huiwon Jang, Myungkyu Koo, Suhyeok Jang, Taeyoung Kim, Beomjun Kim, Byungjun Yoon, Changsung Jang, Daewon Choi, Dongsu Han, Donguk Lee, Heeseung Kwon, Hojin Jeon, Jaehyun Kang, Jaekyoung Bae, Jihyuk Lee, Jimin Lee, John Won, Joonwoo Ahn, Junhyeong Park, Junyoung Sung, Kyungmin Lee, Minseong Han, Minsung Yoon, Sejune Joo, Seonil Son, Seungcheol Park, Seunggeun Cho, Seungjun Moon, Seungku Kim, Yonghoon Dong, Yongjin Cho, Youngchan Kim, Chang Hwan Kim, Dohyeon Kim, Heecheol Kim, Heewon Lee, Hensen Ahn, Hyungkyu Ryu, Hyunsoo Choi, Hyunsoo Shin, Jaeheon Jung, Jaewoo Kim, Jinwook Kim, Joochul Chang, Joonsoo Kim, Junghun Park, Jungwoo Park, Junho Cho, Junhyeok Park, Junwon Lee, Kangwook Lee, Kwanghoon Kim, Kyoungwhan Choe, Manoj Bhadu, Nayoung Oh, Sangjun Kim, Sangwoo Kim, Seunghoon Shim, Seunghyun Kim, Seungjun Lee, Seungyup Ka, Sungryol Yang, Wook Jung, Yashu Shukla, Yeonjae Lee, Yeonwoo Bae, Jinwoo Shin

Published: May 5, 2026

Best maintained implementation now

Evidence: Direct

Domain fit: AI-core

Verified repos: 1

Top repo stars: 122

Core AI workload signals detected from paper context and implementation/artifact evidence.

Time to first repro: a few hours

No risk flags

While Vision-Language-Action models (VLAs) have shown remarkable progress toward human-like generalist robotic policies through the versatile intelligence (i.e. broad scene understanding and language-conditioned generalization) inherited from pre-trained Vision-Language Models, they still struggle with complex real-world tasks requiring broader functional capabilities (e.g. motion awareness, long-term memory, and physical sensing). To address this, we introduce RLDX-1, a general-purpose robotic policy for dexterous manipulation built on the Multi-Stream Action Transformer (MSAT), an architecture that unifies these capabilities by integrating heterogeneous modalities through modality-specific streams with cross-modal joint self-attention. RLDX-1 further combines this architecture with system-level design choices, including data synthesis for rare manipulation scenarios, learning procedures specialized for human-like manipulation, and inference optimizations for real-time deployment. Through empirical evaluation, we show that RLDX-1 consistently outperforms recent frontier VLAs (e.g. $π_{0.5}$ and GR00T N1.6) across both simulation benchmarks and real-world tasks that require broad functional capabilities beyond general versatility. In particular, RLDX-1 shows superiority in ALLEX humanoid tasks by achieving success rates of 86.8% while $π_{0.5}$ and GR00T N1.6 achieve around 40%, highlighting the ability of RLDX-1 to control a high-DoF humanoid robot under diverse functional demands. Together, these results position RLDX-1 as a promising step toward reliable VLAs for complex, contact-rich, and dynamic real-world dexterous manipulation.
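The abstract describes MSAT only at a high level: modality-specific streams joined by cross-modal self-attention. The following is a minimal, illustrative sketch of that general idea in pure Python, not the authors' implementation. It assumes a single attention head, no normalization or feed-forward layers, and hypothetical names throughout (`joint_self_attention`, the `vision` and `action` streams, the toy weights).

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def joint_self_attention(streams, weights):
    """Each stream projects its tokens with its own (Wq, Wk, Wv);
    attention is then computed jointly over all tokens of all streams.

    streams: {name: [token_vector, ...]}
    weights: {name: (Wq, Wk, Wv)}  # hypothetical per-modality parameters
    """
    q, k, v, owner = [], [], [], []
    for name, tokens in streams.items():
        Wq, Wk, Wv = weights[name]  # modality-specific projections
        for t in tokens:
            q.append(matvec(Wq, t))
            k.append(matvec(Wk, t))
            v.append(matvec(Wv, t))
            owner.append(name)

    d = len(q[0])
    out = {name: [] for name in streams}
    for qi, name in zip(q, owner):
        # Every query attends to keys from every modality (joint attention).
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        attn = softmax(scores)
        mixed = [sum(a * vj[c] for a, vj in zip(attn, v)) for c in range(len(v[0]))]
        out[name].append(mixed)  # result is routed back to its own stream
    return out

# Toy example: two vision tokens and one action token, identity projections.
I2 = [[1.0, 0.0], [0.0, 1.0]]
streams = {"vision": [[1.0, 0.0], [0.0, 1.0]], "action": [[1.0, 1.0]]}
weights = {"vision": (I2, I2, I2), "action": (I2, I2, I2)}
out = joint_self_attention(streams, weights)
print({name: len(tokens) for name, tokens in out.items()})
```

Each stream keeps its own token count, while the values it receives mix information from all streams. Consult the paper and the RLWRLD/RLDX-1 repository for the actual architecture.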

Technical details

Canonical key: arxiv-2605.03269

Cache status: Fresh

Generated at: May 10, 2026, 5:27 AM

Artifact coverage: direct

HF provider: ok (token)

PWC source used: No

LLM status: ready

LLM model: openai/gpt-5.1-20251113

LLM generated: May 9, 2026, 5:23 AM

LLM content type: sparse_repro_blueprint

HF policy: hf-relevance-v27

LLM evidence refs: paper.abstract, evidencePack.paperSections[id=paper_table_4], evidencePack.paperSections[id=paper_table_10], researcherSummary.reproductionRisks, guidance.riskFlags, paper.title, summary.hasReliableImplementation

implementation starting point

Benchmarks: thin evidence

Time to repro: a few hours

Results & Benchmarks

Freshness tier: hot

Direct + Inferred Evidence

Some benchmark signal exists in the extracted evidence, but it is not structured strongly enough yet for a confident benchmark decision.

Benchmark signal from claims

Empirical evaluation indicates that RLDX-1 consistently outperforms recent frontier vision-language-action models such as π₀.₅ and GR00T N1.6 on both simulation benchmarks and real-world tasks requiring broader functional capabilities.
On ALLEX humanoid tasks, RLDX-1 attains an 86.8% success rate, whereas competing VLAs π₀.₅ and GR00T N1.6 achieve only around 40% success, demonstrating substantially better high-DoF humanoid control.
On the RoboCasa Kitchen benchmark, selecting layer 18 of the vision-language backbone, as used in RLDX-1, yields a reported success rate of 60.9%, outperforming alternative layer choices such as layers 8 and 28.
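To put the ALLEX numbers in perspective, the short sketch below computes the gap in percentage points and the relative ratio. It treats the baselines' "around 40%" as exactly 40.0, which is an approximation introduced here, so the results are rough. The helper name `success_gap` is illustrative.

```python
def success_gap(ours_pct, baseline_pct):
    """Return (gap in percentage points, ratio of success rates)."""
    points = ours_pct - baseline_pct
    ratio = ours_pct / baseline_pct
    return round(points, 1), round(ratio, 2)

# 86.8% is reported in the abstract; 40.0 stands in for "around 40%".
points, ratio = success_gap(86.8, 40.0)
print(f"gap: {points} points, ratio: {ratio}x")
```

Under that approximation the gap is roughly 47 percentage points, or a little over twice the baseline success rate.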

Use This Implementation Because…

Confidence: medium

RLWRLD/RLDX-1 is the best available implementation candidate based on ranking signals, but recommendation confidence is not yet high. CI workflows are present. License is declared (Apache-2.0).

Reproduction Risks

No repository-level red flags were detected, but paper-specific preprocessing and hyperparameter details may still be under-specified.

Evidence disclosure

Evidence graph: 3 refs, 3 links.

Utility signals: depth 55/100, grounding 75/100, status medium.

Implementation Comparison

Top 1 paths

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

RLWRLD/RLDX-1

best maintained

Maintenance: Active

Confidence: Medium

Reproducibility: Strong

Matched via arXiv identifier search · Strong overlap with paper title keywords

Stars: 122
Last push: May 7, 2026 (3d ago)

CI · Dependencies

Risk flags

No tagged releases
No Docker setup

Best implementation now

RLWRLD/RLDX-1

Confidence: Medium

Reproducibility: Strong

Stars: 122

Forks: 4

Last push: May 7, 2026

License: Apache-2.0

Matched via arXiv identifier search

Strong overlap with paper title keywords

Community adoption signal (122 stars)

License ✓

CI ✓

Deps ✓

Docker –

Selected RLWRLD/RLDX-1 as the strongest maintained implementation for new work.
Includes CI workflow signals.
Includes dependency/environment manifest signals.
Repository activity is within the last 24 months.
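The License, CI, Deps, and Docker signals above can be approximated locally on a cloned repository. The sketch below is one plausible way to check them. How this page actually derives its signals is not stated, so the file patterns and the function name `repo_signals` are assumptions.

```python
import tempfile
from pathlib import Path

def repo_signals(repo_dir):
    """Heuristically detect license, CI, dependency, and Docker files."""
    root = Path(repo_dir)
    workflows = root / ".github" / "workflows"
    dep_files = ("pyproject.toml", "requirements.txt", "setup.py")
    return {
        "license": any(root.glob("LICENSE*")),
        "ci": workflows.is_dir() and any(workflows.glob("*.y*ml")),
        "deps": any((root / name).is_file() for name in dep_files),
        "docker": any(root.glob("Dockerfile*"))
        or (root / "docker-compose.yml").is_file(),
    }

def demo():
    """Build a throwaway directory that mimics the signals listed above."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "LICENSE").write_text("Apache-2.0 placeholder\n")
        (root / "pyproject.toml").write_text("[project]\nname = 'demo'\n")
        (root / ".github" / "workflows").mkdir(parents=True)
        (root / ".github" / "workflows" / "ci.yml").write_text("name: ci\n")
        return repo_signals(root)

print(demo())
```

On a real checkout, call `repo_signals("RLDX-1")` from the directory that contains the clone.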

Reproduction readiness

Ready to Run

Time to first repro: a few hours

Last checked: May 10, 2026

Ready to reproduce

· Clone RLWRLD/RLDX-1 and install dependencies from pyproject.toml.
· CI pipeline detected — automated tests are in place.
· Last updated 3 days ago.

Quick start

git clone https://github.com/RLWRLD/RLDX-1.git
cd RLDX-1
pip install -e .

Hugging Face artifacts

No direct paper-linked artifacts were found. Showing strongest curated related artifacts for faster exploration.

Models

No trustworthy model matches right now.

Datasets

electroglyph/technical

Curated Related

Downloads: 324

Likes: 4

Updated: Dec 27, 2025

devrahulbanjara/ne-en-codeswitching-asr-technical-interview

Curated Related

Downloads: 75

Likes: 3

Updated: Mar 7, 2026