Q: Are there pretrained models available for "Switch-KD: Visual-Switch Knowledge Distillation for Vision-Language Models"?

Yes. One Hugging Face model was found: HaoyiSun/Switch-KD-Qwen2.5-CLIP-1.8B, with 52 downloads.

Switch-KD: Visual-Switch Knowledge Distillation for Vision-Language Models

Q: How reproducible is "Switch-KD: Visual-Switch Knowledge Distillation for Vision-Language Models"?

Estimated time to first reproduction: a few days. Risk flags: (1) no repository-level reproducibility signals are currently available; (2) the estimate is based on a paper-only reproduction flow. No direct maintained implementation was found. Use the paper PDF and citation graph to design a baseline reproduction.

Haoyi Sun, Xiaoxiao Wang, Ning Mao, Qian Wang, Lifu Mu, Wen Zheng, Tao Wei, Wei Chen

Published: Apr 16, 2026

No direct paper-linked artifacts found; showing strongest related artifacts

Evidence: Curated Related

Domain fit: AI-adjacent

Verified repos: 0

The paper appears to be method- or tooling-adjacent to AI workflows, with partial ecosystem coverage.

Time to first repro: a few days

2 risk flags

Vision-Language Models (VLMs) have shown remarkable capabilities in joint vision-language understanding, but their large scale poses significant challenges for deployment in resource-constrained scenarios. Knowledge Distillation (KD) offers a viable way to improve model capabilities without increasing model size or data requirements, making deployment more efficient. However, applying KD to VLMs is challenged by modality-specific supervision: although multimodal knowledge in VLMs is fused within the language space, current methods supervise each modality separately without explicitly addressing multimodal alignment, leading to inconsistent multimodal knowledge transfer. To address this, we propose Switch-KD, a visual-switch distillation framework that unifies vision-language knowledge transfer within a shared text-probability space. Switch-KD comprises two key components: (1) Visual-Switch Distillation, which switches the student's visual outputs into the teacher's language pathway to construct cross-modal probabilistic references for implicit visual knowledge transfer; and (2) Dynamic Bi-directional Logits Difference (DBiLD) loss, which adaptively aligns informative probability regions while preserving the distributional structures of teacher and student through bidirectional supervision. Guided by Switch-KD, a 0.5B TinyLLaVA effectively distills rich multimodal knowledge from its 3B teacher, yielding an average improvement of 3.6 points across 10 multimodal benchmarks without any architectural modification.
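The abstract describes a loss that compares teacher and student outputs in a shared text-probability space, with supervision running in both directions. The following is a minimal sketch of one way a bidirectional logit-difference loss could be built. It is not the paper's DBiLD: it assumes a fixed top-k (the abstract describes the selection as dynamic), plain KL divergence, and a single token position, and every function name is hypothetical.

```python
import math
from itertools import combinations


def softmax(values, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def top_k_indices(logits, k):
    """Indices of the k largest logits, largest first."""
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return order[:k]


def pairwise_differences(logits, indices):
    """Differences logits[i] - logits[j] for every pair drawn from indices."""
    return [logits[i] - logits[j] for i, j in combinations(indices, 2)]


def bidirectional_logit_difference_loss(teacher_logits, student_logits,
                                        k=3, temperature=1.0):
    """Sum of two KL terms over softmaxed pairwise logit differences.

    Assumption: one term picks positions from the teacher's top-k and the
    other from the student's top-k; both measure KL(teacher || student).
    """
    if k < 2:
        raise ValueError("k must be at least 2 to form pairwise differences")
    total = 0.0
    for leader in (teacher_logits, student_logits):
        indices = top_k_indices(leader, k)
        p = softmax(pairwise_differences(teacher_logits, indices), temperature)
        q = softmax(pairwise_differences(student_logits, indices), temperature)
        total += kl_divergence(p, q)
    return total


# Toy logits over a 4-token vocabulary (illustrative values only).
teacher = [4.0, 2.0, 1.0, 0.0]
student = [1.0, 2.0, 4.0, 0.0]
print(bidirectional_logit_difference_loss(teacher, student, k=3))
```

Comparing pairwise differences rather than raw probabilities makes the loss depend on the relative ordering and spacing of the top logits, which is one plausible reading of "preserving the distributional structures"; the paper PDF remains the reference for the actual formulation.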

Technical details

Canonical key: arxiv-2604.14629

Cache status: Stale (SWR served)

Generated at: Jun 16, 2026, 10:12 PM

Artifact coverage: curated_related

Benchmarks: missing

Results & Benchmarks

Freshness tier: warm

Direct + Inferred Evidence

No concrete benchmark grounding is available yet. Treat the page as context or an implementation starting point only.

Implementation Evidence Summary

Confidence: low

The evidence is currently too limited to recommend a maintained repository. Use the Implementation Status and Reproduction Path sections for a practical baseline plan.

Reproduction Risks

No repository-level reproducibility signals are currently available
Estimate is based on paper-only reproduction flow

Hardware Notes

Based on current guidance, expect multi-day setup and compute for a meaningful reproduction.

Evidence disclosure

Evidence graph: 3 refs, 2 links.

Utility signals: depth 60/100, grounding 68/100, status medium.

Implementation Comparison

Top 2 paths

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

haoyi199815/Switch-KD

alternative

Maintenance: Recently updated

Confidence: Low

Reproducibility: Limited

Matched via arXiv identifier search

Stars: 12
Last push: Apr 23, 2026 (56d ago)

Risk flags

No CI pipeline detected
No tagged releases
No Docker setup

nktkt/switch-kd-vlm

alternative

Maintenance: Recently updated

Confidence: Medium

Reproducibility: Limited

Matched via arXiv identifier search · Strong overlap with paper title keywords

Stars: 0
Last push: Apr 17, 2026 (62d ago)

Risk flags

No CI pipeline detected
No tagged releases
No Docker setup
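The two candidates listed above can be compared programmatically. The sketch below records the values shown for each repository and ranks them under two different priorities. The `Candidate` class, the numeric confidence ranking, and both sort keys are illustrative assumptions, not part of any scoring used to produce this page.

```python
from dataclasses import dataclass

# Assumed ordering of the confidence labels, lowest to highest.
CONFIDENCE_RANK = {"Low": 0, "Medium": 1, "High": 2}


@dataclass(frozen=True)
class Candidate:
    name: str
    stars: int
    days_since_push: int
    confidence: str
    risk_flags: int


# Values copied from the two entries in the comparison.
CANDIDATES = [
    Candidate("haoyi199815/Switch-KD", 12, 56, "Low", 3),
    Candidate("nktkt/switch-kd-vlm", 0, 62, "Medium", 3),
]


def rank_by_confidence(candidates):
    """Prefer higher evidence confidence, then stars, then recency."""
    return sorted(
        candidates,
        key=lambda c: (CONFIDENCE_RANK[c.confidence], c.stars, -c.days_since_push),
        reverse=True,
    )


def rank_by_activity(candidates):
    """Prefer more stars, then a more recent push."""
    return sorted(
        candidates,
        key=lambda c: (c.stars, -c.days_since_push),
        reverse=True,
    )


for label, ranking in (("confidence", rank_by_confidence(CANDIDATES)),
                       ("activity", rank_by_activity(CANDIDATES))):
    print(label, [c.name for c in ranking])
```

The two rankings disagree on which repository comes first, so the choice of baseline depends on which signal you trust more. Both entries carry the same three risk flags, so that field does not separate them.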

Implementation Status

No verified maintained repo

There is no verified maintained implementation yet. Use this baseline plan to decide whether to prototype now or defer.

No direct maintained implementation was found. Use the paper PDF and citation graph to design a baseline reproduction.
Track assumptions and missing details in an experiment log before coding.
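One lightweight way to keep such a log is an append-only JSON Lines file. The sketch below is a suggestion only: the `LogEntry` fields, the function names, and the file layout are all hypothetical, and any tracker that records what was assumed and what is still open would serve equally well.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class LogEntry:
    kind: str            # "assumption" or "missing_detail"
    topic: str           # short label, e.g. "DBiLD hyperparameters"
    note: str            # what was assumed or what still needs confirming
    resolved: bool = False


def append_entry(log_path, entry):
    """Append one entry to the log as a single JSON line with a UTC timestamp."""
    record = asdict(entry)
    record["logged_at"] = datetime.now(timezone.utc).isoformat()
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def open_items(log_path):
    """Return all unresolved entries, or an empty list if the log does not exist."""
    path = Path(log_path)
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as handle:
        records = [json.loads(line) for line in handle if line.strip()]
    return [record for record in records if not record["resolved"]]
```

For example, `append_entry("repro_log.jsonl", LogEntry("missing_detail", "DBiLD hyperparameters", "confirm against the paper PDF"))` records an open question, and `open_items("repro_log.jsonl")` lists everything still unresolved before a training run starts.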

Best available artifact: HaoyiSun/Switch-KD-Qwen2.5-CLIP-1.8B
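The artifact named above can be fetched with the `huggingface_hub` package (installed separately, for example with `pip install huggingface_hub`). This is a minimal sketch: the `fetch_checkpoint` helper and the default `local_dir` are hypothetical, and this page does not say which model class or loading code the checkpoint expects, so check the model card before trying to run it.

```python
REPO_ID = "HaoyiSun/Switch-KD-Qwen2.5-CLIP-1.8B"


def fetch_checkpoint(repo_id=REPO_ID, local_dir="checkpoints/switch-kd"):
    """Download every file in the model repository and return the local path.

    The import is deferred so this module can be loaded even when
    huggingface_hub is not installed; the call itself needs network access.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Calling `fetch_checkpoint()` downloads the repository contents into `checkpoints/switch-kd` and returns that path, which gives a starting point for inspecting the configuration and weights.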