What is the best open-source implementation of "NanoFlow: Towards Optimal Large Language Model Serving Throughput"?

The best-maintained implementation is efeslab/Nanoflow, with 965 stars on GitHub. Confidence: high. Reproducibility: limited.

What framework is used to implement "NanoFlow: Towards Optimal Large Language Model Serving Throughput"?

The primary implementation uses PyTorch.

NanoFlow: Towards Optimal Large Language Model Serving Throughput

How reproducible is "NanoFlow: Towards Optimal Large Language Model Serving Throughput"?

Estimated time to first reproduction: a few days. Risk flags: license metadata missing, no CI workflows detected, dependency manifest missing. Start with efeslab/Nanoflow and validate the setup instructions in its README.

Published: Aug 1, 2024

Best maintained implementation now

Evidence: Direct

Domain fit: AI-core

Verified repos: 1

Top repo stars: 965

Core AI workload signals detected from paper context and implementation/artifact evidence.

Framework: PyTorch

Time to first repro: a few days

3 risk flags

Technical details

Canonical key: arxiv-2408.12757

Cache status: Stale (SWR served)

Generated at: Jun 17, 2026, 1:02 AM

Artifact coverage: direct

HF provider: ok (token)

PWC source used: Yes

LLM status: not_generated

LLM model: n/a

LLM generated: Unknown

LLM content type: n/a

HF policy: hf-relevance-v27

implementation starting point

Benchmarks: missing

Results & Benchmarks

Freshness tier: cold

Direct + Inferred Evidence

No concrete benchmark grounding is available yet. Treat the page as context or an implementation starting point only.

The paper's primary contribution is NanoFlow, a throughput-oriented serving framework for LLMs.

Use This Implementation Because…

Confidence: high

efeslab/Nanoflow is the strongest maintained implementation based on ranking signals.

Reproduction Risks

License metadata missing
No CI workflows detected
Dependency manifest is missing
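
The three flags above can be approximated against a local clone of a repository. The following is a minimal sketch, not the method this page uses to compute its flags: the function name `risk_flags` and the constant `MANIFESTS` are hypothetical, the license check only looks for a `LICENSE*` or `COPYING*` file at the repository root, and the CI check only looks for GitHub Actions workflow files.

```python
from pathlib import Path

# Manifest file names taken from the list on this page; extend as needed.
MANIFESTS = ("requirements.txt", "environment.yml", "pyproject.toml")


def risk_flags(repo_root: str) -> list[str]:
    """Return the risk flags that apply to a local clone (hypothetical helper)."""
    root = Path(repo_root)
    flags = []

    # Assumption: a license counts as present if a LICENSE* or COPYING* file exists.
    if not any(root.glob("LICENSE*")) and not any(root.glob("COPYING*")):
        flags.append("License metadata missing")

    # Assumption: CI means GitHub Actions workflows under .github/workflows.
    workflows = root / ".github" / "workflows"
    if not (workflows.is_dir() and any(workflows.glob("*.y*ml"))):
        flags.append("No CI workflows detected")

    # A dependency manifest counts as present if any known file exists at the root.
    if not any((root / name).is_file() for name in MANIFESTS):
        flags.append("Dependency manifest is missing")

    return flags
```

Calling `risk_flags("path/to/clone")` returns the subset of the three flag strings that apply, so an empty list means none of these particular risks were found. Other CI systems and license declarations inside package metadata are not covered by this sketch.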

Hardware Notes

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

Evidence disclosure

Evidence graph: 3 refs, 3 links.

Utility signals: depth 65/100, grounding 75/100, status medium.

Implementation Comparison

Top 2 paths

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

efeslab/Nanoflow

best maintained

Maintenance: Recently updated

Confidence: High

Reproducibility: Limited

Official implementation from Papers with Code · Matched via arXiv identifier search

Stars: 965
Last push: Mar 29, 2026 (81d ago)

Risk flags

No CI pipeline detected
No tagged releases
No Docker setup

microsoft/mscclpp

alternative

Maintenance: Active

Confidence: Low

Reproducibility: Strong

Community adoption signal (532 stars)

Stars: 532
Last push: Jun 17, 2026 (1d ago)

CI · Releases · Dependencies

Risk flags

No Docker setup
Low confidence match

Best implementation now

efeslab/Nanoflow

Confidence: High

Reproducibility: Limited

A throughput-oriented high-performance serving framework for LLMs

Stars: 965

Forks: 49

Last push: Mar 29, 2026

Official implementation from Papers with Code

Matched via arXiv identifier search

Strong overlap with paper title keywords

Community adoption signal (965 stars)

License: not detected

CI: not detected

Deps: not detected

Docker: not detected

Selected efeslab/Nanoflow as the strongest maintained implementation for new work.
Repository activity is within the last 24 months.

Reproduction readiness

Major Work

Time to first repro: a few days

Last checked: Jun 17, 2026

No dependency manifest — manual reconstruction required

· efeslab/Nanoflow has no requirements.txt, environment.yml, pyproject.toml, or Dockerfile.
· You will need to reverse-engineer dependencies from import statements in the source code.

No benchmark numbers could be verified. You will not be able to validate reproduction correctness against published numbers.