What is the best open-source implementation of "Text-to-Text Pre-Training for Data-to-Text Tasks"?

The best-maintained implementation is google-research-datasets/ToTTo, with 466 stars on GitHub. Confidence: high. Reproducibility: limited.

How reproducible is "Text-to-Text Pre-Training for Data-to-Text Tasks"?

Estimated time to first reproduction: a few days. Risk flags: license metadata missing, no CI workflows detected, dependency manifest missing. Start with google-research-datasets/ToTTo and validate the setup instructions in its README.

Are there pretrained models available for "Text-to-Text Pre-Training for Data-to-Text Tasks"?

Yes, 3 Hugging Face models were found. The top result is ostris/zimage_turbo_training_adapter with 59,064 downloads.

What framework is used to implement "Text-to-Text Pre-Training for Data-to-Text Tasks"?

No framework was detected for the primary implementation.

Text-to-Text Pre-Training for Data-to-Text Tasks

Published: May 1, 2020

Best maintained implementation now

Evidence: Direct

Domain fit: AI-adjacent

Verified repos: 1

Top repo stars: 466

Paper appears method- or tooling-adjacent to AI workflows with partial ecosystem coverage.

Framework: none

Time to first repro: a few days

3 risk flags

Technical details

Canonical key: arxiv-2005.10433

Cache status: Stale (SWR served)

Generated at: Jun 18, 2026, 6:27 PM

Artifact coverage: direct

HF provider: ok (token)

PWC source used: Yes

LLM status: not_generated

LLM model: n/a

LLM generated: Unknown

LLM content type: n/a

HF policy: hf-relevance-v27

implementation starting point

Benchmarks: thin evidence

Results & Benchmarks

Freshness tier: cold

Direct + Inferred Evidence

Data-to-text Tasks

SC-GPT2

BLEU

30.8

Source: paper fulltext

Data-to-text Tasks

T5-Small

BLEU

34.6

Source: paper fulltext

Benchmark evidence drill-down

2 findings

Audit each benchmark finding before selecting an implementation path. Evidence refs map to the disclosure section below.

Task	Model	Metric	Value	Source	Evidence refs
Data-to-text Tasks	SC-GPT2	BLEU	30.8	paper-derived	No explicit refs
Data-to-text Tasks	T5-Small	BLEU	34.6	paper-derived	No explicit refs

Text-to-Text Pre-Training for Data-to-Text Tasks is the primary contribution described in this paper.

Use This Implementation Because…

Confidence: high

google-research-datasets/ToTTo is the strongest maintained implementation based on ranking signals.

Reproduction Risks

License metadata missing
No CI workflows detected
Dependency manifest is missing
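
The three flags above can be re-checked against a local clone before committing to a reproduction. The following is a minimal sketch, not the logic this page actually uses: the `CHECKS` mapping, the file patterns, and the `risk_flags` helper are all assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical mapping: each risk flag is cleared if any pattern matches a file.
# The patterns are illustrative assumptions, not the report's real detection rules.
CHECKS = {
    "License metadata missing": ["LICENSE", "LICENSE.*", "COPYING"],
    "No CI workflows detected": [".github/workflows/*.yml", ".github/workflows/*.yaml"],
    "Dependency manifest is missing": ["requirements.txt", "environment.yml", "pyproject.toml"],
}


def risk_flags(repo_root):
    """Return the flags that still apply to the repository at repo_root."""
    root = Path(repo_root)
    flags = []
    for flag, patterns in CHECKS.items():
        # A flag is cleared as soon as one pattern matches at least one path.
        cleared = any(any(root.glob(pattern)) for pattern in patterns)
        if not cleared:
            flags.append(flag)
    return flags
```

Calling `risk_flags("path/to/clone")` on a fresh checkout returns the subset of flags that still hold, which makes it easy to confirm whether the page's stale cache still reflects the repository.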

Hardware Notes

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

Evidence disclosure

Evidence graph: 4 refs, 4 links.

Utility signals: depth 100/100, grounding 95/100, status high.

Implementation Comparison

Top 2 paths

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

google-research-datasets/ToTTo

best maintained

Maintenance: Stale

Confidence: High

Reproducibility: Limited

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 466
Last push: Sep 11, 2024 (647d ago)

Risk flags

No push in 12+ months
No CI pipeline detected
No tagged releases

shark-nlp/cont

alternative

Maintenance: Stale

Confidence: Low

Reproducibility: Limited

Strong overlap with paper title keywords · Community adoption signal (152 stars)

Stars: 152
Last push: May 10, 2023 (1137d ago)

Risk flags

No push in 12+ months
No CI pipeline detected
No tagged releases

Best implementation now

google-research-datasets/ToTTo

Confidence: High

Reproducibility: Limited

ToTTo is an open-domain English table-to-text dataset with over 120,000 training examples that proposes a controlled generation task: given a Wikipedia table and a set of highlighted table cells, produce a one-sentence description. We hope it can serve as a useful research benchmark for high-precision conditional text generation.
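
To make the task description concrete, here is a minimal sketch of how a table with highlighted cells could be flattened into a single input string for a text-to-text model. The example record, its field names (`page_title`, `table`, `highlighted_cells`), and the `key: value | ...` output format are simplified assumptions for illustration; the actual ToTTo records are richer, and the paper's own linearization may differ.

```python
def linearize(example):
    """Flatten the highlighted cells of a toy table into one source string.

    Assumes the first table row holds column headers and that
    highlighted_cells is a list of [row_index, column_index] pairs.
    """
    parts = [f"page_title: {example['page_title']}"]
    header = example["table"][0]
    for row_idx, col_idx in example["highlighted_cells"]:
        value = example["table"][row_idx][col_idx]
        parts.append(f"{header[col_idx]}: {value}")
    return " | ".join(parts)


# Hypothetical record, invented for this sketch (not taken from the dataset).
example = {
    "page_title": "Example Player",
    "table": [
        ["Year", "Team", "Goals"],
        ["1998", "Rovers", "12"],
        ["1999", "United", "7"],
    ],
    "highlighted_cells": [[1, 0], [1, 2]],
}

source_text = linearize(example)
```

The resulting `source_text` would be paired with the reference one-sentence description as the target when fine-tuning a sequence-to-sequence model.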

Stars: 466

Forks: 37

Last push: Sep 11, 2024

Official implementation from Papers with Code

Repository link is mentioned in the paper metadata

Strong overlap with paper title keywords

Community adoption signal (466 stars)

License: not detected

CI: not detected

Deps: not detected

Docker: not detected

Selected google-research-datasets/ToTTo as the strongest maintained implementation for new work.
Repository activity is within the last 24 months.

Reproduction readiness

Major Work

Time to first repro: a few days

Last checked: Jun 18, 2026

No dependency manifest — manual reconstruction required

· google-research-datasets/ToTTo has no requirements.txt, environment.yml, pyproject.toml, or Dockerfile.
· You will need to reverse-engineer dependencies from import statements in the source code.
· Last push was 647 days ago.
