What is the best open-source implementation of "Continual Pre-Training of Large Language Models: How to (re)warm your model?"?

The best-maintained implementation is eleutherai/gpt-neox, with 7,423 stars on GitHub. Confidence: high. Reproducibility: moderate.

Are there pretrained models available for "Continual Pre-Training of Large Language Models: How to (re)warm your model?"?

Yes, 3 Hugging Face models were found. The top result is ostris/zimage_turbo_training_adapter with 49,471 downloads.

What framework is used to implement "Continual Pre-Training of Large Language Models: How to (re)warm your model?"?

The primary implementation uses PyTorch.

Continual Pre-Training of Large Language Models: How to (re)warm your model?

How reproducible is "Continual Pre-Training of Large Language Models: How to (re)warm your model?"?

Estimated time to first reproduction: a few days. Risk flag: the dependency manifest is missing. Start with eleutherai/gpt-neox and validate the setup instructions in its README.

Published: Aug 1, 2023

Best maintained implementation now

Evidence: Direct

Domain fit: AI-core

Verified repos: 1

Top repo stars: 7,423

Framework: PyTorch

Time to first repro: a few days

1 risk flag

Technical details

Canonical key: arxiv-2308.04014

Cache status: Fresh

Generated at: Apr 25, 2026, 5:03 AM

Artifact coverage: direct

HF provider: ok (token)

PWC source used: Yes

LLM status: ready

LLM model: openai/gpt-5.1-20251113

LLM generated: Apr 25, 2026, 5:05 AM

LLM content type: researcher_benchmark_brief

HF policy: hf-relevance-v27

LLM evidence refs: paper.paperSections[id=paper_table_1], paper.paperSections[id=paper_4], researcherSummary.benchmarkSnapshot[0], paper.paperSections[id=paper_caption_2], paper.paperSections[id=paper_caption_3], paper.paperSections[id=paper_caption_4], paper.paperSections[id=paper_caption_5], paper.paperSections[id=paper_caption_6], paper.paperSections[id=paper_12], guidance.riskFlags[0], researcherSummary.reproductionRisks[0], repos[0].fullName, researcherSummary.benchmarkSnapshot[1], researcherSummary.benchmarkSnapshot[2], paper.title, summary.hasReliableImplementation

implementation baseline

Benchmarks: thin evidence

Results & Benchmarks

Direct + Inferred Evidence

Natural language processing · StackExchange · Sampling %: 2.0 (source: paper fulltext)

Natural language processing · Arxiv · Sampling %: 2.5 (source: paper fulltext)

Natural language processing · Wikipedia · Sampling %: 4.5 (source: paper fulltext)

Benchmark evidence drill-down

3 findings

Audit each benchmark finding before selecting an implementation path. Evidence refs map to the disclosure section below.

Task	Dataset	Metric	Value	Source	Evidence refs
Natural language processing	StackExchange	Sampling %	2.0	paper-derived	No explicit refs
Natural language processing	Arxiv	Sampling %	2.5	paper-derived	No explicit refs
Natural language processing	Wikipedia	Sampling %	4.5	paper-derived	No explicit refs
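
To make the sampling percentages above concrete, the following minimal sketch turns them into per-source token budgets. The total budget of one billion tokens and the helper name `token_budget` are hypothetical, chosen only for easy arithmetic; only the three sources listed on this page are included, so the shares do not sum to 100%.

```python
# Sampling percentages as listed in the table above. Only these three sources
# appear on this page; the rest of the data mixture is not shown here.
SAMPLING_PCT = {"StackExchange": 2.0, "Arxiv": 2.5, "Wikipedia": 4.5}


def token_budget(total_tokens: int, sampling_pct: dict[str, float]) -> dict[str, int]:
    """Split a total token budget according to per-source sampling percentages."""
    return {
        name: round(total_tokens * pct / 100.0)
        for name, pct in sampling_pct.items()
    }


# Hypothetical total budget (an assumption, not a figure from the paper).
budgets = token_budget(1_000_000_000, SAMPLING_PCT)

# Combined share of the mixture covered by the three listed sources.
listed_share = sum(SAMPLING_PCT.values())
```

With these inputs the three listed sources together account for 9% of the mixture.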

Continual Pre-Training of Large Language Models: How to (re)warm your model? is the primary contribution described in this paper.
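
The "(re)warm" in the title refers to warming the learning rate up again when pre-training resumes on new data. As a rough illustration only, the sketch below implements a generic linear warmup followed by cosine decay, a common schedule shape for language-model pre-training. The function name and every parameter (`max_lr`, `min_lr`, `warmup_steps`, `total_steps`) are hypothetical placeholders; none of the values are taken from the paper or from eleutherai/gpt-neox.

```python
import math


def warmup_cosine_lr(step: int, max_lr: float, min_lr: float,
                     warmup_steps: int, total_steps: int) -> float:
    """Linear warmup from 0 to max_lr, then cosine decay down to min_lr."""
    if step < warmup_steps:
        # Warmup phase: the learning rate grows linearly with the step count.
        return max_lr * step / warmup_steps
    # Decay phase: progress runs from 0 to 1 over the remaining steps.
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Placeholder values (assumptions for illustration only).
schedule = [
    warmup_cosine_lr(s, max_lr=3e-4, min_lr=3e-5, warmup_steps=100, total_steps=1000)
    for s in range(1001)
]
```

Re-warming, in this picture, amounts to restarting such a schedule from step 0 when training continues on the new dataset instead of keeping the small final learning rate from the previous run.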

Use This Implementation Because…

Confidence: high

eleutherai/gpt-neox is the strongest maintained implementation based on ranking signals. CI workflows are present. License is declared (Apache-2.0).

Reproduction Risks

Dependency manifest is missing

Hardware Notes

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

Evidence disclosure

Evidence graph: 4 refs, 4 links.

Utility signals: depth 95/100, grounding 95/100, status high.

Implementation Comparison

Top 2 paths

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

eleutherai/gpt-neox

best maintained

Maintenance: Active

Confidence: High

Reproducibility: Moderate

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 7,423
Last push: Apr 13, 2026 (12d ago)

CI · Releases

Risk flags

No Docker setup
Dependency manifest missing

openaccess-ai-collective/axolotl

alternative

Maintenance: Active

Confidence: Low

Reproducibility: Strong

Community adoption signal (11,756 stars)

Stars: 11,756
Last push: Apr 24, 2026 (1d ago)

CI · Releases · Dependencies

Risk flags

No Docker setup
Low confidence match

Best implementation now

eleutherai/gpt-neox

Confidence: High

Reproducibility: Moderate

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

Stars: 7,423

Forks: 1,110

Last push: Apr 13, 2026

License: Apache-2.0

Official implementation from Papers with Code

Repository link is mentioned in the paper metadata

Community adoption signal (7,423 stars)

License ✓

CI ✓

Deps –

Docker –

Selected eleutherai/gpt-neox as the strongest maintained implementation for new work.
Includes CI workflow signals.
Repository activity is within the last 24 months.

Reproduction readiness

Major Work

Time to first repro: a few days

Last checked: Apr 25, 2026

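No dependency manifest detected: manual reconstruction may be required

· The automated scan found no requirements.txt, environment.yml, pyproject.toml, or Dockerfile in eleutherai/gpt-neox.
· Check the repository, including its subdirectories, for dependency files first. If none exist, you will need to reverse-engineer dependencies from import statements in the source code.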
