Maintained implementation available | PyTorch | Pretrained models available

RLHF Workflow: From Reward Modeling to Online RLHF

May 1, 2024 | arXiv: 2405.07863

3 repos | 1,524 stars | a few days to reproduce

Abstract

Task	Model	Benchmark	Score
Reinforcement learning	LLaMA-3-8B-it	GSM-8K	79.6
Reinforcement learning	Ours (SFT baseline)	GSM-8K	74.2

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

Recipes to train reward model for RLHF.

Stars: 1.5k | Forks: 108 | Last push: Apr 2025 | License: Apache-2.0

License ✓

CI –

Deps –

Docker –

Selected RLHFlow/RLHF-Reward-Modeling as the strongest maintained implementation for new work.
Repository activity is within the last 24 months.
Official repository is preserved separately as historical context.

1. Start with RLHFlow/RLHF-Reward-Modeling and validate the setup instructions in its README.
2. Reproduce the baseline result with the provided defaults before modifying hyperparameters.
3. Log exact dependency versions and the runtime environment for reproducibility.
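
The dependency-logging step above can be sketched with the Python standard library alone. This is a minimal illustration, not part of either repository: the function name `snapshot_environment` and the package list (`torch`, `transformers`, `accelerate`) are assumptions and should be replaced with whatever the README actually requires.

```python
import json
import platform
import sys
from importlib import metadata


def snapshot_environment(packages):
    """Return a dict describing the interpreter, OS, and package versions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
        "packages": versions,
    }


if __name__ == "__main__":
    # Hypothetical package list; adjust to match the repository's requirements.
    snapshot = snapshot_environment(["torch", "transformers", "accelerate"])
    print(json.dumps(snapshot, indent=2, sort_keys=True))
```

Saving this output next to each run's results makes it possible to tell later whether a metric difference comes from a changed hyperparameter or from a changed library version.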

Time to first repro: a few days | No CI workflows detected | Dependency manifest is missing

No additional official repositories detected.

RLHFlow/Online-RLHF | Confidence: low
A recipe for online RLHF and online iterative DPO.
Stars: 543 | Forks: 48 | Last push: Dec 2024

No direct paper-linked artifacts were found. Showing strongest curated related artifacts.

Curated Related