What is the best open-source implementation of "CodePDE: An Inference Framework for LLM-driven PDE Solver Generation"?

The best maintained implementation is lithiumda/codepde with 71 stars on GitHub. Confidence: high. Reproducibility: Limited.

Are there pretrained models available for "CodePDE: An Inference Framework for LLM-driven PDE Solver Generation"?

No paper-linked pretrained models were found. 3 related Hugging Face models are listed instead; the first is deepseek-ai/deepseek-llm-7b-chat with 140,167 downloads.

What framework is used to implement "CodePDE: An Inference Framework for LLM-driven PDE Solver Generation"?

The primary implementation uses JAX.

CodePDE: An Inference Framework for LLM-driven PDE Solver Generation

How reproducible is "CodePDE: An Inference Framework for LLM-driven PDE Solver Generation"?

Estimated time to first reproduction: a few hours. Risk flags: License metadata missing, No CI workflows detected. Start with lithiumda/codepde and validate setup instructions in README.

Shanda Li, Tanya Marwah, Junhong Shen, Weiwei Sun, Andrej Risteski, Yiming Yang, Ameet Talwalkar

Published: May 13, 2025

Best maintained implementation now

Evidence: Direct

Domain fit: AI-core

Verified repos: 1

Top repo stars: 71

Core AI workload signals detected from paper context and implementation/artifact evidence.

Framework: JAX

Time to first repro: a few hours

2 risk flags

Partial differential equations (PDEs) are fundamental to modeling physical systems, yet solving them remains a complex challenge. Traditional numerical solvers rely on expert knowledge to implement and are computationally expensive, while neural-network-based solvers require large training datasets and often lack interpretability. In this work, we frame PDE solving as a code generation task and introduce CodePDE, the first inference framework for generating PDE solvers using large language models (LLMs). With CodePDE, we present a thorough evaluation on critical capacities of LLM for PDE solving: reasoning, debugging, self-refinement, and test-time scaling. CodePDE shows that, with advanced inference-time algorithms and scaling strategies, LLMs can achieve strong performance across a range of representative PDE problems. We also identify novel insights into LLM-driven solver generation, such as trade-offs between solver reliability and sophistication, design principles for LLM-powered PDE solving agents, and failure modes for LLM on hard tasks. These insights offer guidance for building more capable and reliable LLM-based scientific engines.

Technical details

Canonical key: arxiv-2505.08783

Cache status: Fresh

Generated at: Mar 14, 2026, 10:09 AM

Artifact coverage: direct

HF provider: ok (token)

PWC source used: Yes

LLM status: ready

LLM model: openai/gpt-5.1-20251113

LLM generated: Mar 13, 2026, 5:06 PM

LLM content type: researcher_benchmark_brief

HF policy: hf-relevance-v27

LLM evidence refs: paper.abstract, evidencePack.paperSections[id=paper_table_1], paper.title, evidencePack.paperSections[id=paper_table_2], evidencePack.paperSections[id=paper_caption_7], guidance.riskFlags[0], guidance.riskFlags[1], repos[0].fullName, researcherSummary.reproductionRisks, evidencePack.paperSections[id=paper_caption_3], evidencePack.paperSections[id=paper_table_3], evidencePack.paperSections[id=paper_table_4], evidencePack.paperSections[id=paper_table_5], evidencePack.paperSections[id=paper_caption_17], summary.hasReliableImplementation

Researcher verdict

Recommended implementation path available

implementation baseline

Benchmark trust: thin evidence

This page has some benchmark evidence, though it is thin, and a concrete implementation recommendation anchored on lithiumda/codepde. Use it as an implementation baseline, then validate benchmark parity before adapting it.

Why this page is still worth reading

A concrete repository path exists via lithiumda/codepde, so this page can act as a practical starting point.
Reproduction risks are surfaced explicitly, which helps decide whether the paper is worth immediate prototyping.

Benchmark trust

Some benchmark signal exists in the extracted evidence, but it is not structured strongly enough yet for a confident benchmark decision.

Use this page as

Start here when you need the most practical implementation path quickly.

Results & Benchmarks

Freshness tier: hot

Direct + Inferred Evidence

Benchmark signal from claims

The authors conduct a broad evaluation of LLM-driven PDE solvers under the CodePDE framework across multiple representative PDE problems, assessing reasoning, debugging, self-refinement, and test-time scaling behaviors.

Use This Implementation Because…

Confidence: high

lithiumda/codepde is the only verified repository and is listed as the official implementation. Dependency/environment manifests are present.

Reproduction Risks

License metadata missing
No CI workflows detected

Evidence disclosure

Evidence graph: 4 refs, 4 links.

Utility signals: depth 60/100, grounding 85/100, status medium.

Implementation Comparison

Top 1 path

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

lithiumda/codepde

best maintained

Maintenance: Active

Confidence: High

Reproducibility: Limited

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 71
Last push: Feb 12, 2026 (30d ago)

Dependencies

Risk flags

No CI pipeline detected
No tagged releases
No Docker setup

Paper summary

AI-generated

AI-generated summary grounded in paper metadata and artifact signals.

The paper introduces CodePDE, an inference-time framework that treats PDE solving as a code generation task to automatically generate PDE solvers using large language models. This page includes benchmark evidence for PDE solving via code generation on the CodePDE benchmark suite (Advection, Burgers, Reaction-Diffusion, Compressible Navier-Stokes (CNS), Darcy). Reproduction guidance focuses on implementation viability and concrete risk controls.

Key contributions

The paper introduces CodePDE, an inference-time framework that treats PDE solving as a code generation task to automatically generate PDE solvers using large language models.
CodePDE is designed to unlock four critical LLM capabilities for PDE solving—chain-of-thought reasoning, autonomous code repair and debugging, best-of-n test-time sampling, and feedback-driven solver refinement.
The authors conduct a broad evaluation of LLM-driven PDE solvers under the CodePDE framework across multiple representative PDE problems, assessing reasoning, debugging, self-refinement, and test-time scaling behaviors.
The study reports that LLM-generated PDE solvers under CodePDE exhibit notable failure modes on harder PDE tasks, indicating reliability limitations despite strong average performance.
The authors identify a trade-off between solver reliability and sophistication in LLM-generated PDE solvers, suggesting that more advanced solver designs do not always yield more dependable behavior.
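The four capabilities listed above can be combined into a single inference loop. The following is a minimal sketch under stated assumptions, not the repository's code: `propose_solver`, `run_and_score`, `debug`, and `refine` are hypothetical stand-ins for an LLM call, a sandboxed solver run, a repair prompt, and a feedback prompt. The toy "solvers" are plain numbers scored against a fixed target so the example runs without any model.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attempt:
    code: float    # stand-in for generated solver source (hypothetical)
    error: float   # score from the evaluation run; lower is better


TARGET = 3.0  # toy "true solution", used only to score candidates


def propose_solver(rng: random.Random) -> float:
    """Hypothetical LLM call: sample one candidate solver."""
    return rng.uniform(-10.0, 10.0)


def run_and_score(code: float) -> Optional[float]:
    """Hypothetical sandbox run: return an error score, or None on a crash."""
    if code < -8.0:  # pretend these candidates raise at runtime
        return None
    return abs(code - TARGET)


def debug(code: float) -> float:
    """Hypothetical repair step applied to a crashing candidate."""
    return max(code, -8.0)


def refine(code: float, error: float) -> float:
    """Hypothetical feedback step: nudge the candidate using its score."""
    return code + 0.5 * (TARGET - code)


def best_of_n(n: int, debug_rounds: int, refine_rounds: int, seed: int = 0) -> Attempt:
    rng = random.Random(seed)
    attempts = []
    for _ in range(n):                      # best-of-n sampling
        code = propose_solver(rng)
        error = run_and_score(code)
        for _ in range(debug_rounds):       # repair only if the run crashed
            if error is not None:
                break
            code = debug(code)
            error = run_and_score(code)
        if error is not None:
            attempts.append(Attempt(code, error))
    if not attempts:
        raise RuntimeError("no candidate ran successfully")
    best = min(attempts, key=lambda a: a.error)
    for _ in range(refine_rounds):          # feedback-driven refinement
        candidate = refine(best.code, best.error)
        cand_err = run_and_score(candidate)
        if cand_err is not None and cand_err < best.error:
            best = Attempt(candidate, cand_err)
    return best
```

Calling `best_of_n(8, debug_rounds=2, refine_rounds=3)` returns the lowest-error attempt. A refined candidate is kept only when it scores better than the current best, so refinement cannot make the selected result worse in this sketch.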

Implementation guidance

Use lithiumda/codepde first: it is the only verified repository, it is listed as the official implementation, and it includes dependency/environment manifests. Start with the repo setup path, then validate benchmark reproduction before adaptation.

Reproducibility notes

LLM-generated PDE solvers can fail on harder PDE tasks even when they perform well on simpler problems, leading to unreliable convergence or inaccurate solutions.
More sophisticated or complex LLM-generated solver designs may reduce reliability, reflecting a trade-off between advanced numerical strategies and consistent performance.
Absence of automated CI testing in the CodePDE repository increases the risk of silent breaks or regressions when dependencies or code are updated.
Lack of explicit license metadata in the repository can hinder downstream adoption or redistribution and may complicate compliant reuse of the implementation.
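The first two notes suggest checking each generated solver's output before trusting it. A minimal sketch of such a check follows, assuming a reference solution is available on the same grid. The normalized RMSE metric, the tolerance, and the function names are illustrative choices, not taken from this page or the repository.

```python
import math
from typing import Sequence


def nrmse(pred: Sequence[float], ref: Sequence[float]) -> float:
    """Normalized RMSE: ||pred - ref||_2 / ||ref||_2 (illustrative metric)."""
    if len(pred) != len(ref):
        raise ValueError("prediction and reference must have the same length")
    ref_norm = math.sqrt(sum(r * r for r in ref))
    if ref_norm == 0.0:
        raise ValueError("reference norm is zero; nRMSE is undefined")
    diff_norm = math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)))
    return diff_norm / ref_norm


def check_solution(pred: Sequence[float], ref: Sequence[float], tol: float = 1e-2) -> str:
    """Classify a solver output as 'diverged', 'inaccurate', or 'ok'."""
    if not all(math.isfinite(p) for p in pred):
        return "diverged"  # NaN or inf usually signals an unstable scheme
    return "ok" if nrmse(pred, ref) <= tol else "inaccurate"
```

Separating "diverged" from "inaccurate" keeps the two failure types named in the notes distinguishable when results are tallied across PDE families.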

Best implementation now

lithiumda/codepde

Confidence: High

Reproducibility: Limited

[TMLR] CodePDE: An Inference Framework for LLM-driven PDE Solver Generation

Stars: 71

Forks: 10

Last push: Feb 12, 2026

Official implementation from Papers with Code

Repository link is mentioned in the paper metadata

Strong overlap with paper title keywords

Community adoption signal (71 stars)

License: not detected

CI: not detected

Deps: present

Docker: not detected

Selected lithiumda/codepde as the strongest maintained implementation for new work.
Includes dependency/environment manifest signals.
Repository activity is recent (last push Feb 12, 2026).

Reproduction path

Direct

Follow the direct implementation path

1. Start with lithiumda/codepde and validate setup instructions in README.
2. Reproduce the baseline result with the provided defaults before modifying hyperparameters.
3. Log exact dependency versions and runtime environment for reproducibility.
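Step 3 can be handled with a short script run alongside each experiment. The sketch below uses only the Python standard library. The package names passed in are assumptions for illustration and should be replaced with the dependencies the repository actually declares.

```python
import json
import platform
import sys
from importlib import metadata


def snapshot_environment(packages):
    """Collect interpreter, OS, and installed package versions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": versions,
    }


if __name__ == "__main__":
    # Assumed package list; substitute the repo's real dependencies.
    print(json.dumps(snapshot_environment(["jax", "numpy", "scipy"]), indent=2))
```

Saving the printed JSON next to each run's outputs makes it possible to tell later whether a changed result came from the code or from the environment.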

Time to first repro: a few hours

License metadata missing

No CI workflows detected

Hugging Face artifacts

No direct paper-linked artifacts were found. Showing strongest curated related artifacts for faster exploration.

Models

deepseek-ai/deepseek-llm-7b-chat

Curated Related

Downloads: 140,167

Likes: 217

llm-jp/llm-jp-3-3.7b-instruct

Curated Related

Downloads: 810,489

Likes: 13

deepseek-ai/deepseek-llm-7b-base

Curated Related

Downloads: 34,353

Likes: 137

Datasets

tokyotech-llm/swallow-code-v2

Curated Related

Downloads: 5,595

Likes: 33

Updated: Nov 8, 2025

tokyotech-llm/swallow-math

Curated Related

Downloads: 1,740

Likes: 43

Updated: Mar 1, 2026