AGD: an Auto-switchable Optimizer using Stepwise Gradient Difference for Preconditioning Matrix

Q: What is the best open-source implementation of "AGD: an Auto-switchable Optimizer using Stepwise Gradient Difference for Preconditioning Matrix"?

The best-maintained implementation is intelligent-machine-learning/dlrover, with 1,641 stars on GitHub. Confidence: high. Reproducibility: moderate.

Q: How reproducible is "AGD: an Auto-switchable Optimizer using Stepwise Gradient Difference for Preconditioning Matrix"?

Estimated time to first reproduction: a few days. Risk flag: the dependency manifest is missing. Start with intelligent-machine-learning/dlrover and validate the setup instructions in its README.

Q: Are there pretrained models available for "AGD: an Auto-switchable Optimizer using Stepwise Gradient Difference for Preconditioning Matrix"?

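One Hugging Face model was matched: gradientai/Llama-3-8B-Instruct-Gradient-1048k, with 25,583 downloads. It is a long-context Llama 3 fine-tune published by Gradient, so the match appears to come from the word "gradient" rather than from any use of the AGD optimizer. Treat it as unrelated unless its model card states otherwise.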

Q: What framework is used to implement "AGD: an Auto-switchable Optimizer using Stepwise Gradient Difference for Preconditioning Matrix"?

The primary implementation uses PyTorch.

Published: Dec 1, 2023

Best maintained implementation now

Evidence: Direct

Domain fit: AI-adjacent

Verified repos: 2

Top repo stars: 1,641

Paper appears method- or tooling-adjacent to AI workflows with partial ecosystem coverage.

Framework: PyTorch

Time to first repro: a few days

1 risk flag

Technical details

Canonical key: arxiv-2312.01658

Generated at: Apr 3, 2026, 12:45 PM

Researcher verdict

Useful paper, but implementation path is weak

implementation starting point

Benchmark trust: thin evidence

This page is best used as a cautious implementation starting point. A concrete repo path exists, but benchmark grounding is still too thin to treat the page as a reliable benchmark reference.

Why this page is still worth reading

A concrete repository path exists via intelligent-machine-learning/dlrover, so this page can act as a practical starting point.
Reproduction risks are surfaced explicitly, which helps decide whether the paper is worth immediate prototyping.

Benchmark trust

Some benchmark signal exists in the extracted evidence, but it is not structured strongly enough yet for a confident benchmark decision.

Use this page as

Use this page to start from the best available repo path, but validate benchmark claims separately before treating it as a trusted baseline.

Results & Benchmarks

Freshness tier: cold

Direct + Inferred Evidence

Task: Auto-switchable Optimizer Stepwise Gradient Difference Preconditioning

IWSLT · Perplexity: 14 · Source: paper fulltext

CIFAR-10 · Top-1 Accuracy: 92.14 · Source: paper fulltext

Benchmark evidence drill-down

2 findings

Audit each benchmark finding before selecting an implementation path. Evidence refs map to the disclosure section below.

Task	Dataset	Metric	Value	Source	Evidence refs
Auto-switchable Optimizer Stepwise Gradient Difference Preconditioning	IWSLT	Perplexity	14	paper-derived	No explicit refs
Auto-switchable Optimizer Stepwise Gradient Difference Preconditioning	CIFAR-10	Top-1 Accuracy	92.14	paper-derived	No explicit refs
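
The two paper-derived values in the table can serve as targets when auditing a reproduction. The following is a minimal sketch of such a check, not part of either repository: the function name `within_tolerance`, the dictionary keys, and the 1% relative tolerance are all assumptions chosen for illustration, and the page does not state which IWSLT split or CIFAR-10 model the numbers refer to.

```python
# Reported values copied from the benchmark table on this page.
# The key names are hypothetical labels, not identifiers from the paper.
REPORTED = {
    ("IWSLT", "perplexity"): 14.0,
    ("CIFAR-10", "top1_accuracy"): 92.14,
}


def within_tolerance(reported, reproduced, rel_tol=0.01):
    """Return True if the reproduced value is within rel_tol of the reported one.

    rel_tol=0.01 (1%) is an arbitrary illustrative threshold.
    """
    return abs(reproduced - reported) <= rel_tol * abs(reported)


def audit(reproduced_results, rel_tol=0.01):
    """Compare a dict of reproduced metrics against REPORTED.

    Metrics that were not reproduced are reported as None rather than failing.
    """
    report = {}
    for key, reported in REPORTED.items():
        value = reproduced_results.get(key)
        report[key] = None if value is None else within_tolerance(reported, value, rel_tol)
    return report
```

Calling `audit({("CIFAR-10", "top1_accuracy"): 91.80})` marks the CIFAR-10 entry as matching at the 1% level and leaves the IWSLT entry as `None` until a perplexity number is available.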

AGD: an Auto-switchable Optimizer using Stepwise Gradient Difference for Preconditioning Matrix is the primary contribution described in this paper.
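
To make the title concrete, here is a minimal sketch of one reading of the idea: keep a momentum estimate of the gradient, build the preconditioner from the squared difference between successive bias-corrected momentum estimates, and clip the denominator from below with a threshold so the update switches between an SGD-like and an adaptive regime. This page does not reproduce the paper's algorithm, so the exact update rule, the bias correction, and every name and default here (`agd_like_step`, `init_state`, `lr`, `beta1`, `beta2`, `delta`) are assumptions to be checked against the paper and the dlrover/atorch source. It uses plain Python lists rather than PyTorch tensors to stay dependency-free.

```python
import math


def init_state(n):
    """Optimizer state for n scalar parameters (hypothetical layout)."""
    return {"t": 0, "m": [0.0] * n, "m_hat": [0.0] * n, "b": [0.0] * n}


def agd_like_step(w, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, delta=1e-5):
    """One AGD-style update on a list of floats; returns the new parameters.

    Assumed form, not verified against the reference implementation.
    """
    t = state["t"] + 1
    # Exponential moving average of the gradient, then bias correction.
    m = [beta1 * mp + (1 - beta1) * g for mp, g in zip(state["m"], grad)]
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    # Stepwise difference of the corrected momentum (the first step has no predecessor).
    if t == 1:
        s = list(m_hat)
    else:
        s = [cur - prev for cur, prev in zip(m_hat, state["m_hat"])]
    # Second-moment estimate built from the differences instead of raw gradients.
    b = [beta2 * bp + (1 - beta2) * si * si for bp, si in zip(state["b"], s)]
    b_hat = [bi / (1 - beta2 ** t) for bi in b]
    # Switch: where sqrt(b_hat) < delta the denominator is the constant delta
    # (a rescaled momentum-SGD step); otherwise the step is adaptive per coordinate.
    new_w = [
        wi - lr * mh / max(math.sqrt(bh), delta)
        for wi, mh, bh in zip(w, m_hat, b_hat)
    ]
    state.update(t=t, m=m, m_hat=m_hat, b=b)
    return new_w


if __name__ == "__main__":
    # Toy usage on f(w) = 0.5 * w**2, whose gradient is w itself.
    w, state = [1.0], init_state(1)
    for _ in range(5):
        w = agd_like_step(w, list(w), state, lr=0.1)
        print(w)
```

For real experiments, use the optimizer class shipped in the repositories listed on this page rather than this sketch; it is meant only to show where the gradient difference and the switching threshold enter the update.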

Use This Implementation Because…

Confidence: high

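intelligent-machine-learning/dlrover is the strongest maintained implementation based on ranking signals. CI workflows are present. The license field is NOASSERTION, which is what GitHub reports when a license file exists but cannot be matched to a standard SPDX identifier, so read the repository's LICENSE file before reuse.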

Reproduction Risks

Dependency manifest is missing

Hardware Notes

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

Evidence disclosure

Evidence graph: 4 refs, 4 links.

Utility signals: depth 95/100, grounding 95/100, status high.

Implementation Comparison

Top 2 paths

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

intelligent-machine-learning/dlrover

best maintained

Maintenance: Active

Confidence: High

Reproducibility: Moderate

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 1,641
Last push: Apr 2, 2026

CI · Releases

Risk flags

No Docker setup
Dependency manifest missing

intelligent-machine-learning/atorch

historical official

Maintenance: Stale risk

Confidence: High

Reproducibility: Moderate

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 59
Last push: Aug 13, 2025

Risk flags

No tagged releases
No Docker setup
Dependency manifest missing
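
The risk flags above can be re-checked against a local clone before committing to a repository. The sketch below is an assumption about what such a check might look like: the page does not say which file names its scanner looks for, so the lists `DEPENDENCY_MANIFESTS` and `DOCKER_FILES` and the function `risk_flags` are hypothetical. It inspects only the top-level directory, so a monorepo that keeps its manifest in a subdirectory would still be flagged.

```python
from pathlib import Path

# Assumed file names; adjust to whatever the project actually uses.
DEPENDENCY_MANIFESTS = (
    "requirements.txt",
    "pyproject.toml",
    "setup.py",
    "setup.cfg",
    "environment.yml",
    "Pipfile",
)
DOCKER_FILES = ("Dockerfile", "docker-compose.yml", "docker-compose.yaml")


def risk_flags(repo_root):
    """Return the risk flags that apply to the top level of a cloned repository."""
    root = Path(repo_root)
    flags = []
    if not any((root / name).is_file() for name in DEPENDENCY_MANIFESTS):
        flags.append("Dependency manifest missing")
    if not any((root / name).is_file() for name in DOCKER_FILES):
        flags.append("No Docker setup")
    return flags
```

Running `risk_flags("path/to/clone")` on both candidates gives a quick way to see whether the flags listed here still hold for the commit being used.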

What is known right now

Concise audit mode

This page is not strong enough for a full AI-written research brief yet, so the summary is reduced to what is evidenced, what is missing, and what to do next.

What is known

Benchmark anchor: Auto-switchable Optimizer Stepwise Gradient Difference Preconditioning on IWSLT using Perplexity.
Implementation candidate: intelligent-machine-learning/dlrover.

What is missing

Benchmark evidence is not yet strong enough to treat the LLM brief as fully researcher-ready.

What to do next

Start with intelligent-machine-learning/dlrover and validate setup instructions in README.
Reproduce the baseline result with the provided defaults before modifying hyperparameters.
Log exact dependency versions and runtime environment for reproducibility.
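
Because the dependency manifest is flagged as missing, recording the environment by hand matters more than usual. A minimal sketch using only the Python standard library follows; the function name `snapshot_environment` and the default package list are assumptions, so extend the list with whatever the chosen repository actually imports.

```python
import json
import platform
from importlib import metadata


def snapshot_environment(packages=("torch", "numpy")):
    """Collect interpreter, OS, and installed package versions as a dict.

    Packages that are not installed are recorded as None instead of raising.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": versions,
    }


if __name__ == "__main__":
    # Print a JSON record that can be stored next to each run's results.
    print(json.dumps(snapshot_environment(), indent=2, sort_keys=True))
```

Saving this output alongside each training log makes it possible to tell later whether a changed result came from the optimizer or from a changed dependency.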

Best implementation now

intelligent-machine-learning/dlrover

Confidence: High

Reproducibility: Moderate

DLRover: An Automatic Distributed Deep Learning System

Stars: 1,641

Forks: 213

Last push: Apr 2, 2026

License: NOASSERTION

Official implementation from Papers with Code

Repository link is mentioned in the paper metadata

Community adoption signal (1641 stars)

Checks: license file present · CI present · dependency manifest missing · Docker setup missing

Selected intelligent-machine-learning/dlrover as the strongest maintained implementation for new work.
Includes CI workflow signals.
Repository activity is within the last 24 months.
Official repository is preserved separately as historical context.

Historical official implementation

Preserved for provenance. Not recommended as the default path for new builds.

intelligent-machine-learning/atorch

Stars: 59

Last push: Aug 13, 2025

Reproduction path

Direct

Follow the direct implementation path

1. Start with intelligent-machine-learning/dlrover and validate setup instructions in README.
2. Reproduce the baseline result with the provided defaults before modifying hyperparameters.
3. Log exact dependency versions and runtime environment for reproducibility.

Framework baselines