Q: What framework is used to implement "d$^2$Cache: Accelerating Diffusion-Based LLMs via Dual Adaptive Caching"?

The framework is inferred to be Hugging Face Transformers; the suggested starting point is the Hugging Face Transformers training guide.

d$^2$Cache: Accelerating Diffusion-Based LLMs via Dual Adaptive Caching

Q: How reproducible is "d$^2$Cache: Accelerating Diffusion-Based LLMs via Dual Adaptive Caching"?

Estimated time to first reproduction: a few days. Risk flags: (1) no repository-level reproducibility signals are currently available; (2) the estimate is based on a paper-only reproduction flow. No verified maintained implementation was found, although the abstract links to code at https://github.com/Kamichanw/d2Cache. Use the paper PDF and citation graph to design a baseline reproduction.

Yuchu Jiang, Yue Cai, Xiangzhong Luo, Jiale Fu, Jiarui Wang, Chonghan Liu, Xu Yang

Published: Sep 27, 2025

No direct implementation yet

Evidence: Inferred

Domain fit: AI-core

Verified repos: 1

Core AI workload signals detected from paper context and implementation/artifact evidence.

Framework: Hugging Face Transformers training guide

Time to first repro: a few days

2 risk flags

Diffusion-based large language models (dLLMs), despite their promising performance, still suffer from inferior inference efficiency. This is because dLLMs rely on bidirectional attention and cannot directly benefit from the standard key-value (KV) cache as autoregressive models (ARMs) do. To tackle this issue, we introduce Dual aDaptive Cache (d$^2$Cache), which is a training-free approximate KV cache framework for accelerating dLLM inference. d$^2$Cache features a two-stage fine-grained selection strategy to identify tokens and adaptively update their KV states at each decoding step, while caching the KV states of the remaining tokens for reuse. Furthermore, d$^2$Cache naturally offers a more reliable decoding alternative, which can enable quasi left-to-right generation and mitigate premature overconfidence in tokens at the end of the sequence. Extensive experimental results on two representative dLLMs (i.e., LLaDA and Dream) demonstrate that d$^2$Cache not only achieves substantial inference speedups, but also yields consistent improvements in generation quality. The code is available at https://github.com/Kamichanw/d2Cache.
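The caching idea described in the abstract (refresh the KV states of a selected subset of tokens at each decoding step and reuse cached states for the rest) can be illustrated with a minimal sketch. This is not the authors' implementation: the `SelectiveKVCache` class, the `project` helper, and the confidence-based `select_positions` rule are hypothetical stand-ins, and the paper's actual two-stage selection strategy is not reproduced here.

```python
import random


def project(x, w):
    """Multiply vector x (list of floats) by matrix w (list of rows)."""
    return [sum(xi * wij for xi, wij in zip(x, col)) for col in zip(*w)]


class SelectiveKVCache:
    """Hypothetical cache: recompute K/V only for selected positions."""

    def __init__(self, w_k, w_v):
        self.w_k, self.w_v = w_k, w_v
        self.keys = {}       # position -> cached key vector
        self.values = {}     # position -> cached value vector
        self.recomputed = 0  # number of per-token K/V recomputations so far

    def update(self, hidden, selected):
        """Refresh K/V for `selected` positions and for positions not yet cached."""
        for pos, h in enumerate(hidden):
            if pos in selected or pos not in self.keys:
                self.keys[pos] = project(h, self.w_k)
                self.values[pos] = project(h, self.w_v)
                self.recomputed += 1
        n = len(hidden)
        return [self.keys[p] for p in range(n)], [self.values[p] for p in range(n)]


def select_positions(confidence, is_masked, budget):
    """Placeholder rule (an assumption, not the paper's): pick the `budget`
    masked positions with the highest confidence."""
    masked = [p for p, m in enumerate(is_masked) if m]
    ranked = sorted(masked, key=lambda p: confidence[p], reverse=True)
    return set(ranked[:budget])


random.seed(0)
d, n = 4, 6  # toy hidden size and sequence length
w_k = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]
w_v = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]
hidden = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]

cache = SelectiveKVCache(w_k, w_v)
cache.update(hidden, selected=set())  # first step: every position is computed

# Next decoding step: hidden states change, but only two positions are refreshed.
is_masked = [False, False, True, True, True, True]
confidence = [1.0, 1.0, 0.2, 0.9, 0.5, 0.7]
selected = select_positions(confidence, is_masked, budget=2)
hidden = [[v + 0.1 for v in h] for h in hidden]
keys, values = cache.update(hidden, selected)

print(sorted(selected))   # [3, 5]
print(cache.recomputed)   # 8 = 6 initial + 2 refreshed
```

In this toy run, positions outside `selected` keep stale K/V states from the previous step. That staleness is the approximation an approximate KV cache trades for speed.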

Technical details

Canonical key: arxiv-2509.23094

Cache status: Fresh

Generated at: Apr 15, 2026, 2:07 PM

Artifact coverage: sparse

HF provider: ok (token)

PWC source used: No

LLM status: not_generated

LLM model: n/a

LLM generated: Unknown

LLM content type: n/a

HF policy: hf-relevance-v27

context only

Benchmarks: thin evidence


Results & Benchmarks

Freshness tier: hot

Direct + Inferred Evidence

Benchmark evidence drill-down

2 findings

Audit each benchmark finding before selecting an implementation path. Evidence refs map to the disclosure section below.

Task	Dataset	Metric	Value	Source	Evidence refs
Generation	= 1024	Score ↑	68.46	paper-derived	No explicit refs
Generation	Fast dLLM	Score ↑	19.39	paper-derived	No explicit refs

Implementation Evidence Summary

Confidence: low

Recommendation evidence is currently too limited for a maintained-repo choice. Use Implementation Status and Reproduction Path for a practical baseline plan.

Reproduction Risks

Estimate is based on paper-only reproduction flow

Hardware Notes

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

Evidence disclosure

Evidence graph: 2 refs, 1 link.

Utility signals: depth 95/100, grounding 68/100, status medium.

Implementation Comparison

Top path

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

Kamichanw/d2Cache

alternative

Maintenance: Recently updated

Confidence: Medium

Reproducibility: Moderate

Matched via arXiv identifier search · Strong overlap with paper title keywords

Stars: 109
Last push: Mar 11, 2026 (35d ago)

Risk flags

No CI pipeline detected
No tagged releases
No Docker setup

Implementation Status

No verified maintained repo

There is no verified maintained implementation yet. Use this baseline plan to decide whether to prototype now or defer.

No direct maintained implementation was found. Use the paper PDF and citation graph to design a baseline reproduction.
Start from this likely method family: Diffusion.
Track assumptions and missing details in an experiment log before coding.

Reproduction readiness

No Repo

Time to first repro: days

Last checked: Apr 15, 2026

No verified implementation available

· No maintained repository has been identified for this paper. Check adjacent implementations or HF artifacts below.

Framework baselines

Hugging Face Transformers training guide
Modern transformer training baseline.
PyTorch nn.Transformer docs
Reference transformer building block implementation.
Hugging Face Diffusers training guide
Practical baseline for diffusion model reproduction.

Additional implementations

Official

No additional official repositories detected.

Community

Kamichanw/d2Cache
Confidence: Medium

[ICLR'26] Official code of paper "d2Cache: Accelerating Diffusion-based LLMs via Dual Adaptive Caching"

Stars: 109

Last push: Mar 11, 2026

License: Apache-2.0

Hugging Face artifacts

No trustworthy direct or curated related Hugging Face artifacts were found yet.

Continue with targeted Hugging Face searches derived from the paper title and method context:

Models

arxiv:2509.23094 Diffusion-Based LLMs

Datasets

arxiv:2509.23094 Diffusion-Based dataset Diffusion benchmark

Spaces

arxiv:2509.23094 Diffusion-Based demo Diffusion gradio

Tip: start with models, then check datasets/spaces if you need evaluation data or demos.

Direct artifact matches are currently sparse. Use targeted Hugging Face searches to quickly locate candidate models, datasets, and demos.
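The targeted searches listed above can be turned into browser links with a short script. This is a rough sketch: the query strings are copied from this page, but the URL layout (`https://huggingface.co/<kind>?search=...`) and the helper name `hf_search_url` are assumptions, not a documented API, so adjust them if the site uses a different query parameter.

```python
from urllib.parse import urlencode

# Query strings copied from the page; the URL layout below is an assumption.
QUERIES = {
    "models": "arxiv:2509.23094 Diffusion-Based LLMs",
    "datasets": "arxiv:2509.23094 Diffusion-Based dataset Diffusion benchmark",
    "spaces": "arxiv:2509.23094 Diffusion-Based demo Diffusion gradio",
}


def hf_search_url(kind, query, base="https://huggingface.co"):
    """Build a search URL for one artifact kind (assumed `search` parameter)."""
    return f"{base}/{kind}?{urlencode({'search': query})}"


urls = {kind: hf_search_url(kind, q) for kind, q in QUERIES.items()}
for kind, url in urls.items():
    print(f"{kind}: {url}")
```

Following the tip above, open the models link first, then the datasets and spaces links if evaluation data or demos are needed.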

Research context

Tasks

Generation

Methods

Transformer, Diffusion

Domains

Natural Language Processing
