What is the best open-source implementation of "MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine"?

The best maintained implementation is zrrskywalker/mavis with 155 stars on GitHub. Confidence: high. Reproducibility: Limited.

What framework is used to implement "MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine"?

No framework was detected for the primary implementation.

MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine

How reproducible is "MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine"?

Estimated time to first reproduction: a few days. Risk flags: no CI workflows detected; dependency manifest is missing. Start with zrrskywalker/mavis and validate the setup instructions in its README.

Published: Jul 1, 2024

Best maintained implementation now

Evidence: Direct

Domain fit: AI-core

Verified repos: 1

Top repo stars: 155

Core AI workload signals detected from paper context and implementation/artifact evidence.

Framework: none

Time to first repro: a few days

2 risk flags

arXiv PDF

Technical details

Canonical key: arxiv-2407.08739

Cache status: Stale (SWR served)

Generated at: May 23, 2026, 10:47 AM

Artifact coverage: direct

HF provider: ok (token)

PWC source used: Yes

HF policy: hf-relevance-v27

implementation starting point

Benchmarks: missing

Results & Benchmarks

Freshness tier: cold

Direct + Inferred Evidence

No concrete benchmark grounding is available yet. Treat the page as context or an implementation starting point only.

MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine focuses on mathematical visual instruction tuning for multi-modal large language models.

Use This Implementation Because…

Confidence: high

zrrskywalker/mavis is the strongest maintained implementation based on ranking signals. License is declared (MIT).

Reproduction Risks

No CI workflows detected
Dependency manifest is missing
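
The two flags above can be re-checked by hand against a local clone. The following is a minimal sketch, not the page's actual detection logic (which is not documented here): the function name `readiness_signals` is hypothetical, the manifest names are the ones this page lists in its reproduction-readiness notes, and `.github/workflows` is assumed as the CI location because that is where GitHub Actions workflows live.

```python
from pathlib import Path

# Manifest names as listed on this page; the page's real detection rules
# are unknown, so this is only an approximation.
MANIFESTS = ("requirements.txt", "environment.yml", "pyproject.toml")


def readiness_signals(repo_dir):
    """Return simple file-presence checks for a locally cloned repository."""
    root = Path(repo_dir)
    workflows = root / ".github" / "workflows"
    # Assumption: CI means at least one GitHub Actions workflow file.
    has_ci = workflows.is_dir() and any(
        p.suffix in (".yml", ".yaml") for p in workflows.iterdir()
    )
    return {
        "ci": has_ci,
        "deps": any((root / name).is_file() for name in MANIFESTS),
        "docker": (root / "Dockerfile").is_file(),
        "license": any(root.glob("LICENSE*")),
    }
```

Calling `readiness_signals("path/to/clone")` on a fresh clone gives a quick way to confirm whether the flags still apply, since the page data may be stale.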

Hardware Notes

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

Evidence disclosure

Evidence graph: 3 refs, 3 links.

Utility signals: depth 65/100, grounding 75/100, status medium.

Implementation Comparison

Top 3 paths

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

zrrskywalker/mavis

best maintained

Maintenance: Stale

Confidence: High

Reproducibility: Limited

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 155
Last push: Dec 5, 2024 (535d ago)

Risk flags

No push in 12+ months
No CI pipeline detected
No tagged releases

ziyuguo99/image-generation-cot

alternative

Maintenance: Recently updated

Confidence: Low

Reproducibility: Limited

Community adoption signal (866 stars)

Stars: 866
Last push: Mar 19, 2026 (66d ago)

Dependencies

Risk flags

No CI pipeline detected
No tagged releases
No Docker setup

caraj7/t2i-r1

alternative

Maintenance: Stale risk

Confidence: Low

Reproducibility: Limited

Community adoption signal (433 stars)

Stars: 433
Last push: Sep 18, 2025 (248d ago)

Risk flags

No CI pipeline detected
No tagged releases
No Docker setup

Best implementation now

zrrskywalker/mavis

Confidence: High

Reproducibility: Limited

[ICLR 2025] Mathematical Visual Instruction Tuning for Multi-modal Large Language Models

Stars: 155

Forks: 1

Last push: Dec 5, 2024

License: MIT

Official implementation from Papers with Code

Repository link is mentioned in the paper metadata

Strong overlap with paper title keywords

Community adoption signal (155 stars)

License ✓

CI –

Deps –

Docker –

Selected zrrskywalker/mavis as the strongest maintained implementation for new work.
Repository activity is within the last 24 months.

Reproduction readiness

Major Work

Time to first repro: a few days

Last checked: May 23, 2026

Hardware requirements

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

No dependency manifest — manual reconstruction required

· zrrskywalker/mavis has no requirements.txt, environment.yml, pyproject.toml, or Dockerfile.
· You will need to reverse-engineer dependencies from import statements in the source code.
· Last push was 535 days ago.
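
One way to start the dependency reconstruction described above is to scan the source tree for imports. This is a rough sketch under stated assumptions: `third_party_imports` is a hypothetical helper, it relies on `sys.stdlib_module_names` (available in Python 3.10 and later) to filter out the standard library, and it treats any name matching a file or directory in the repository as local code. It says nothing about which versions the authors used.

```python
import ast
import sys
from pathlib import Path


def third_party_imports(repo_dir):
    """Collect top-level module names imported by .py files under repo_dir."""
    root = Path(repo_dir)
    found = set()
    for path in root.rglob("*.py"):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8", errors="ignore"))
        except (SyntaxError, ValueError):
            continue  # skip files that do not parse under this interpreter
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                found.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
                # level == 0 excludes relative imports such as "from . import x"
                found.add(node.module.split(".")[0])
    # Heuristic filters: standard library, then names that look like local code.
    stdlib = set(getattr(sys, "stdlib_module_names", ()))
    local = {p.stem for p in root.rglob("*.py")}
    local |= {p.name for p in root.rglob("*") if p.is_dir()}
    return sorted(found - stdlib - local)
```

The result is a list of import names, not installable package names. The two can differ (for example, `cv2` is installed as `opencv-python` and `PIL` as `Pillow`), so each entry still needs to be mapped to a package and a compatible version by hand.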

No benchmark numbers could be verified. You will not be able to validate reproduction correctness against published numbers.

Additional implementations

No additional verified repositories beyond the primary recommendation.

Possible but unverified matches (3)

These repositories had low-confidence matching signals and are hidden by default.

ziyuguo99/image-generation-cot

Confidence: Low

Stars: 866

caraj7/t2i-r1

Confidence: Low

Stars: 433

autodriving-heart/ICLR2025-Papers-about-Autonomous-Driving-and-Embodied-AI

Confidence: Low

Stars: 8

Hugging Face artifacts

No direct paper-linked artifacts were found. The entries below are the strongest curated related artifacts, shown for exploration only.

Models

No trustworthy model matches right now.

Datasets

andycoco1128/Trendyol-Cybersecurity-Instruction-Tuning-Dataset

Curated Related

Downloads: 60

Likes: 1

Updated: May 17, 2026

Spaces

instruction-tuning-sd/instruction-tuned-sd

Curated Related

Likes: 5

Research context

Tasks

Instruction tuning

Methods

Transformer

Domains

Large Language Models
