What is the best open-source implementation of "MVBench: A Comprehensive Multi-modal Video Understanding Benchmark"?

The best maintained implementation is opengvlab/ask-anything with 3,341 stars on GitHub. Confidence: high. Reproducibility: Limited.

What framework is used to implement "MVBench: A Comprehensive Multi-modal Video Understanding Benchmark"?

The primary implementation uses PyTorch.

MVBench: A Comprehensive Multi-modal Video Understanding Benchmark

How reproducible is "MVBench: A Comprehensive Multi-modal Video Understanding Benchmark"?

Estimated time to first reproduction: a few days. Risk flags: No CI workflows detected, Dependency manifest is missing. Start with opengvlab/ask-anything and validate setup instructions in README.

Published: Nov 1, 2023

Best maintained implementation now

Evidence: Direct

Domain fit: AI-adjacent

Verified repos: 1

Top repo stars: 3,341

Paper appears method- or tooling-adjacent to AI workflows with partial ecosystem coverage.

Framework: PyTorch

Time to first repro: a few days

2 risk flags

Technical details

Canonical key: arxiv-2311.17005

Cache status: Fresh

Generated at: Jun 17, 2026, 6:32 PM

Artifact coverage: direct

HF provider: ok (token)

PWC source used: Yes

LLM status: not_generated

LLM model: n/a

LLM generated: Unknown

LLM content type: n/a

HF policy: hf-relevance-v27

implementation starting point

Benchmarks: thin evidence

Results & Benchmarks

Freshness tier: cold

Direct + Inferred Evidence

Video understanding / reasoning

Best option: (

Answer Prompt

100

Source: paper fulltext

Benchmark evidence drill-down

1 finding

Audit each benchmark finding before selecting an implementation path. Evidence refs map to the disclosure section below.

Task	Dataset	Metric	Value	Source	Evidence refs
Video understanding / reasoning	Best option: (	Answer Prompt	100	paper-derived	No explicit refs

MVBench: A Comprehensive Multi-modal Video Understanding Benchmark focuses on video understanding / reasoning.

Use This Implementation Because…

Confidence: high

opengvlab/ask-anything is the strongest maintained implementation based on ranking signals. License is declared (MIT).

Reproduction Risks

No CI workflows detected
Dependency manifest is missing

Hardware Notes

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

Evidence disclosure

Evidence graph: 3 refs, 3 links.

Utility signals: depth 100/100, grounding 85/100, status high.

Implementation Comparison

Top 3 paths

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

opengvlab/ask-anything

best maintained

Maintenance: Stale

Confidence: High

Reproducibility: Limited

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 3,341
Last push: Jan 18, 2025 (516d ago)

Risk flags

No push in 12+ months
No CI pipeline detected
No tagged releases
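
The age figure and the "No push in 12+ months" flag above can be approximated from the two dates shown on this page. The sketch below is illustrative only: the function names and the 365-day threshold are assumptions, not the page's actual rule.

```python
from datetime import date


def days_since(last_push: date, checked: date) -> int:
    """Whole calendar days between the last push and the check date."""
    return (checked - last_push).days


def staleness_flags(last_push: date, checked: date,
                    stale_after_days: int = 365) -> list[str]:
    """Return maintenance flags; the 365-day cutoff is an assumed threshold."""
    flags = []
    if days_since(last_push, checked) >= stale_after_days:
        flags.append("No push in 12+ months")
    return flags


checked = date(2026, 6, 17)    # "Last checked" date reported on this page
last_push = date(2025, 1, 18)  # "Last push" date reported for opengvlab/ask-anything
age = days_since(last_push, checked)
print(age, staleness_flags(last_push, checked))
```

A date-only difference can be off by a day from the figure shown on the page, which presumably works from full timestamps rather than calendar dates.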

bytedance/tarsier

alternative

Maintenance: Stale risk

Confidence: Low

Reproducibility: Moderate

Partial overlap with paper title keywords · Community adoption signal (549 stars)

Stars: 549
Last push: Aug 14, 2025 (308d ago)

Dependencies

Risk flags

No CI pipeline detected
No tagged releases
No Docker setup

magic-research/PLLaVA

alternative

Maintenance: Archived

Confidence: Low

Reproducibility: Limited

Community adoption signal (670 stars) · Repository is archived

Stars: 670
Last push: Jul 28, 2024 (690d ago)

Dependencies

Risk flags

Repository archived
No push in 12+ months
No CI pipeline detected

Best implementation now

opengvlab/ask-anything

Confidence: High

Reproducibility: Limited

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

Stars: 3,341

Forks: 270

Last push: Jan 18, 2025

License: MIT

Official implementation from Papers with Code

Repository link is mentioned in the paper metadata

Partial overlap with paper title keywords

Community adoption signal (3,341 stars)

License: declared (MIT) · CI: not detected · Dependency manifest: not detected · Docker: not detected
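
These four readiness signals can be re-checked against a local clone before committing to a reproduction. The following is a minimal sketch, assuming the signals are simple file-presence checks at the repository root; the function name `audit_checkout` and the file lists are hypothetical, and the page's own detection logic is not documented here.

```python
from pathlib import Path

# Manifest names follow the ones this page lists as missing; the license
# file names are common conventions and an assumption on our part.
DEP_MANIFESTS = ("requirements.txt", "environment.yml", "pyproject.toml")
LICENSE_NAMES = ("LICENSE", "LICENSE.md", "LICENSE.txt", "COPYING")


def audit_checkout(root: str) -> dict[str, bool]:
    """Report which readiness files exist at the top level of a checkout."""
    repo = Path(root)
    workflows = repo / ".github" / "workflows"
    return {
        "license": any((repo / name).is_file() for name in LICENSE_NAMES),
        # Only GitHub Actions workflows are considered here.
        "ci": workflows.is_dir() and any(workflows.glob("*.y*ml")),
        "deps": any((repo / name).is_file() for name in DEP_MANIFESTS),
        "docker": (repo / "Dockerfile").is_file(),
    }


if __name__ == "__main__":
    # Run from inside a clone, e.g. of opengvlab/ask-anything.
    for signal, present in audit_checkout(".").items():
        print(f"{signal}: {'found' if present else 'not detected'}")
```

Because the check only looks at the repository root, manifests kept in subdirectories would be reported as not detected.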

Selected opengvlab/ask-anything as the strongest maintained implementation for new work.
Repository activity is within the last 24 months.

Reproduction readiness

Major Work

Time to first repro: a few days

Last checked: Jun 17, 2026

No dependency manifest — manual reconstruction required

· opengvlab/ask-anything has no requirements.txt, environment.yml, pyproject.toml, or Dockerfile.
· You will need to reverse-engineer dependencies from import statements in the source code.
· Last push was 516 days ago.
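
One way to start the manual reconstruction described above is to collect the top-level module names imported across the source tree. This is a rough sketch using only the standard library; the function names are hypothetical, and the result is a list of import names, not pinned PyPI packages.

```python
import ast
import sys
from pathlib import Path


def top_level_imports(source: str) -> set[str]:
    """Collect root module names from absolute imports in one source file."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports such as "from . import x".
            modules.add(node.module.split(".")[0])
    return modules


def third_party_imports(root: str) -> set[str]:
    """Union of imports under root, minus standard-library modules."""
    found = set()
    for path in Path(root).rglob("*.py"):
        try:
            found |= top_level_imports(path.read_text(encoding="utf-8", errors="ignore"))
        except (SyntaxError, ValueError):
            continue  # skip files that do not parse
    # sys.stdlib_module_names exists on Python 3.10+; older versions keep everything.
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    return found - set(stdlib)


if __name__ == "__main__":
    print(sorted(third_party_imports(".")))
```

The output still needs manual review: import names do not always match package names (for example, `cv2` is provided by `opencv-python`), the repository's own local packages will appear in the list, and no version information is recovered.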

Additional implementations

No additional verified repositories beyond the primary recommendation.

Possible but unverified matches (7)

These repositories had low-confidence matching signals. The top 6 by score are shown; 1 additional low-confidence match is hidden.

Repository	Confidence	Stars
bytedance/tarsier	Low	549
magic-research/PLLaVA	Low	670
XiaomingX/CVPR2024-Papers-with-Code	Low	9
bluuueheart/ReTool-Video	Low	0
thucdangvan020999/videochat2	Low	0
Jayaprakash2k3/ClipQuery	Low	0

Hugging Face artifacts

No direct paper-linked artifacts were found. Showing strongest curated related artifacts for faster exploration.

Models

No trustworthy model matches right now.

Datasets

Dataset	Match type	Downloads	Likes	Updated
OpenGVLab/MVBench	Curated Related	40,240	44	Oct 18, 2024
VLM2Vec/MVBench	Curated Related	3,255	0	Aug 15, 2025
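
For exploration, a dataset repository listed above can be fetched with the `huggingface_hub` client. This is a sketch under assumptions: it requires `huggingface_hub` to be installed, the helper name `download_mvbench` and the default target directory are hypothetical, and this page does not state the download size or access terms, so check the dataset card first.

```python
MVBENCH_REPO_ID = "OpenGVLab/MVBench"  # dataset id as listed on this page


def download_mvbench(local_dir: str = "data/MVBench") -> str:
    """Download a snapshot of the dataset repo and return its local path."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=MVBENCH_REPO_ID,
        repo_type="dataset",
        local_dir=local_dir,
    )


if __name__ == "__main__":
    print(download_mvbench())
```

Passing `allow_patterns` to `snapshot_download` restricts the download to matching files, which may help if only the annotation files are needed at first.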