Q: What framework is used to implement "Training language models to follow instructions with human feedback"?

No framework was detected for the primary implementation.

Training language models to follow instructions with human feedback

Q: How reproducible is "Training language models to follow instructions with human feedback"?

Estimated time to first reproduction: a few days. Risk flags: Adjacent implementations are not paper-verified. No maintained paper-verified implementation was found; start with the closest related repositories below.

Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan Lowe

Published: Mar 4, 2022

Historical official implementation (not recommended for new builds)

Evidence: Historical

Domain fit: AI-core

Verified repos: 1

Top repo stars: 1,257

Core AI workload signals detected from paper context and implementation/artifact evidence.

Framework: none

Time to first repro: a few days

1 risk flag

arXiv PDF

Making language models bigger does not inherently make them better at following a user's intent. For example, large language models can generate outputs that are untruthful, toxic, or simply not helpful to the user. In other words, these models are not aligned with their users. In this paper, we show an avenue for aligning language models with user intent on a wide range of tasks by fine-tuning with human feedback. Starting with a set of labeler-written prompts and prompts submitted through the OpenAI API, we collect a dataset of labeler demonstrations of the desired model behavior, which we use to fine-tune GPT-3 using supervised learning. We then collect a dataset of rankings of model outputs, which we use to further fine-tune this supervised model using reinforcement learning from human feedback. We call the resulting models InstructGPT. In human evaluations on our prompt distribution, outputs from the 1.3B parameter InstructGPT model are preferred to outputs from the 175B GPT-3, despite having 100x fewer parameters. Moreover, InstructGPT models show improvements in truthfulness and reductions in toxic output generation while having minimal performance regressions on public NLP datasets. Even though InstructGPT still makes simple mistakes, our results show that fine-tuning with human feedback is a promising direction for aligning language models with human intent.
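
The abstract mentions a dataset of rankings of model outputs that feeds the reinforcement learning step. One way to turn a ranking into a training signal is a pairwise log-sigmoid comparison loss over scalar reward scores. The sketch below assumes that loss and uses plain Python floats in place of a real reward model; the function name and the example scores are hypothetical and do not come from any repository listed on this page.

```python
import math
from itertools import combinations


def pairwise_ranking_loss(rewards_best_first):
    """Mean of -log(sigmoid(r_preferred - r_rejected)) over all pairs.

    `rewards_best_first` holds hypothetical scalar rewards for the outputs
    of ONE prompt, ordered from the labeler's most preferred output to the
    least preferred one.
    """
    # combinations() keeps input order, so the first item of every pair is
    # the output the labeler ranked higher.
    pairs = list(combinations(rewards_best_first, 2))
    if not pairs:
        raise ValueError("need at least two ranked outputs")

    total = 0.0
    for r_preferred, r_rejected in pairs:
        d = r_preferred - r_rejected
        # -log(sigmoid(d)) == log(1 + exp(-d)), written in a stable form.
        total += max(-d, 0.0) + math.log1p(math.exp(-abs(d)))
    return total / len(pairs)


# Scores that agree with the ranking give a lower loss than scores that
# contradict it.
agrees = pairwise_ranking_loss([2.0, 0.5, -1.0])
contradicts = pairwise_ranking_loss([-1.0, 0.5, 2.0])
```

In a real setup the floats would be outputs of a trainable reward model and the loss would be minimized with gradient descent; this sketch only shows how a ranking is scored.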

Technical details

Canonical key: arxiv-2203.02155

Cache status: Fresh

Generated at: May 9, 2026, 4:07 AM

Artifact coverage: sparse

HF provider: ok (token)

PWC source used: Yes

LLM status: not_generated

LLM model: n/a

LLM generated: Unknown

LLM content type: n/a

HF policy: hf-relevance-v27

implementation starting point

Benchmarks: missing

Results & Benchmarks

Freshness tier: cold

Direct + Inferred Evidence

No concrete benchmark grounding is available yet. Treat the page as context or an implementation starting point only.

Use This Implementation Because…

Confidence: medium

haotian-liu/LLaVA is the closest maintained adjacent implementation (it matches the contextual method/domain keyword "instruction tuning"). It is not paper-verified, so validate the algorithm and evaluation setup against the paper before trusting reported metrics. Community adoption signal: 24,766 GitHub stars.

Reproduction Risks

Adjacent implementations are not paper-verified
Recommended repository is adjacent and not paper-verified.
No direct maintained implementation is currently verified.

Hardware Notes

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

Evidence disclosure

Evidence graph: 3 refs, 3 links.

Utility signals: depth 65/100, grounding 75/100, status medium.

Implementation Comparison

Top 3 paths

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

ggerganov/llama.cpp

alternative

Maintenance: Active

Confidence: Low

Reproducibility: Strong

Community adoption signal (109104 stars)

Stars: 109,104
Last push: May 9, 2026 (1d ago)

CI · Releases · Dependencies

Risk flags

No Docker setup
Low confidence match

ggml-org/llama.cpp (the same repository as ggerganov/llama.cpp, listed under its current owner)

alternative

Maintenance: Active

Confidence: Low

Reproducibility: Strong

Community adoption signal (109104 stars)

Stars: 109,104
Last push: May 9, 2026 (1d ago)

CI · Releases · Dependencies

Risk flags

No Docker setup
Low confidence match

openai/following-instructions-human-feedback

historical official

Maintenance: Stale

Confidence: High

Reproducibility: Limited

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 1,257
Last push: Dec 11, 2022 (1245d ago)

Risk flags

No push in 12+ months
No CI pipeline detected
No tagged releases
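
The "1245d ago" figure and the "No push in 12+ months" flag can be checked with simple date arithmetic. This is a minimal sketch: the function names are hypothetical, and the 365-day threshold is an assumed stand-in for "12 months", not a documented rule of this page.

```python
from datetime import date


def days_since(last_push, checked_on):
    """Whole days between the last push and the date the page was checked."""
    return (checked_on - last_push).days


def is_stale(last_push, checked_on, threshold_days=365):
    """True when the last push is older than the assumed 12-month threshold."""
    return days_since(last_push, checked_on) > threshold_days


# Dates taken from this page: last push Dec 11, 2022; checked May 9, 2026.
age = days_since(date(2022, 12, 11), date(2026, 5, 9))
stale = is_stale(date(2022, 12, 11), date(2026, 5, 9))
```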

Best implementation now

Only a historical official implementation is available.

Use with caution for new projects; verify against current tooling and maintained community alternatives.

openai/following-instructions-human-feedback

Historical official

Stars: 1,257

Last push: Dec 11, 2022

Only a historical official repository was found: openai/following-instructions-human-feedback.
No maintained paper-verified implementation met reliability thresholds.

Reproduction readiness

Major Work

Time to first repro: a few days

Last checked: May 9, 2026

No dependency manifest — manual reconstruction required

· openai/following-instructions-human-feedback has no requirements.txt, environment.yml, pyproject.toml, or Dockerfile.
· You will need to reverse-engineer dependencies from import statements in the source code.
· Last push was 1245 days ago.
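
Reverse-engineering dependencies from import statements can be partly automated. The sketch below walks a local checkout and collects top-level imported module names with the standard `ast` module. It assumes Python 3.10+ (for `sys.stdlib_module_names`), and the function name and the checkout path are hypothetical. It also assumes the checkout contains Python source files at all, which this page does not confirm.

```python
import ast
import sys
from pathlib import Path


def top_level_imports(repo_root):
    """Return sorted non-stdlib top-level module names imported under repo_root."""
    found = set()
    for path in Path(repo_root).rglob("*.py"):
        try:
            source = path.read_text(encoding="utf-8", errors="ignore")
            tree = ast.parse(source)
        except (SyntaxError, ValueError):
            # Skip files that do not parse (e.g. legacy Python 2 sources).
            continue
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                found.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
                # level == 0 excludes relative imports such as "from . import x".
                found.add(node.module.split(".")[0])
    # Available on Python 3.10+; on older versions nothing is filtered out.
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    return sorted(found - set(stdlib))


if __name__ == "__main__":
    # Hypothetical usage: pass the path of a local clone as the first argument.
    for name in top_level_imports(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(name)
```

The output is a starting list, not a requirements file: import names can differ from package names on PyPI (for example `yaml` is provided by PyYAML), the repository's own local modules will also appear, and no version information is recovered.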

No benchmark numbers could be verified. You will not be able to validate reproduction correctness against published numbers.

Closest related implementations

These are not paper-verified. Use them as reference points when no direct implementation is available.

haotian-liu/LLaVA

Adjacent

Confidence: Medium

Stars: 24,766

Matches contextual method/domain keyword: instruction tuning

Additional implementations

No additional verified repositories beyond the primary recommendation.

Possible but unverified matches (8)

These repositories had low-confidence matching signals and are hidden by default.

Showing top 6 by score. 2 additional low-confidence matches are hidden.

ggerganov/llama.cpp

Confidence: Low

Stars: 109,104

ggml-org/llama.cpp

Confidence: Low

Stars: 109,104

hiyouga/llama-efficient-tuning

Confidence: Low

Stars: 71,062

laion-ai/open-assistant

Confidence: Low

Stars: 37,414

tatsu-lab/alpaca_farm

Confidence: Low

Stars: 845

tatsu-lab/linguistic_calibration

Confidence: Low

Stars: 29

Hugging Face artifacts

No trustworthy direct or curated related Hugging Face artifacts were found yet.

Continue with targeted Hugging Face searches derived from the paper title and method context:

Models

arxiv:2203.02155 · Reinforcement learning · Instruction tuning

Datasets

arxiv:2203.02155 · Instruction tuning dataset · Reinforcement learning benchmark

Spaces

arxiv:2203.02155 · Instruction tuning demo · Reinforcement learning gradio

Tip: start with models, then check datasets/spaces if you need evaluation data or demos.

Direct artifact matches are currently sparse. Use targeted Hugging Face searches to quickly locate candidate models, datasets, and demos.

Research context

Tasks

Instruction tuning

Methods

Reinforcement learning

Domains

Natural Language Processing, Large Language Models

Evaluation & Human Feedback Data

Open this paper in HFEPX to review benchmark signals, evaluation modes, and human-feedback protocol context.
