OpenTrain AI
Maintained implementation available · PyTorch

X-InstructBLIP: A Framework for aligning X-Modal instruction-aware representations to LLMs and Emergent Cross-modal Reasoning

November 1, 2023 · arXiv: 2311.18799
2 repos · 11,192 stars · ~a few hours to reproduce

Abstract

Best Implementation

LAVIS - A One-stop Library for Language-Vision Intelligence

11.2k stars · 1.1k forks · updated Nov 2024 · BSD-3-Clause
License · CI · Deps · Docker
  • Selected salesforce/lavis as the strongest maintained implementation for new work.
  • Includes CI workflow signals.
  • Includes dependency/environment manifest signals.
  • Repository activity is within the last 24 months.

Reproduction Path

  1. Start with salesforce/lavis and validate the setup instructions in its README.

  2. Reproduce the baseline result with the provided defaults before modifying hyperparameters.

  3. Log exact dependency versions and runtime environment for reproducibility.

Time to first repro: a few hours

No repository-level red flags were detected, but paper-specific preprocessing and hyperparameter details may still be under-specified.
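
Step 3 of the reproduction path (logging dependency versions and the runtime environment) can be sketched with the Python standard library alone. This is a minimal illustration, not part of salesforce/lavis: the helper name `snapshot_environment` and the package list are assumptions, and the list should be adjusted to whatever the repository's README actually installs.

```python
import json
import platform
import sys
from importlib import metadata


def snapshot_environment(packages):
    """Collect interpreter, OS, and installed versions of the named packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this environment.
            versions[name] = None
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
        "packages": versions,
    }


if __name__ == "__main__":
    # Hypothetical package list, chosen for illustration only.
    snap = snapshot_environment(["torch", "salesforce-lavis", "transformers"])
    print(json.dumps(snap, indent=2, sort_keys=True))
```

Saving this JSON next to each run's results makes it easier to tell whether a later mismatch comes from the environment or from a changed hyperparameter.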

Additional Implementations

No additional verified repositories beyond the primary recommendation.

Hugging Face Artifacts

No trustworthy Hugging Face artifacts, either direct or curated, have been found yet.

Research Context