OpenTrain AI
Maintained implementation available · PyTorch · Pretrained models available

LAION-5B: An open large-scale dataset for training next generation image-text models

October 1, 2022 · arXiv: 2210.08402
2 repos · 13,658 stars · ~a few hours to reproduce

Abstract

Best Implementation

An open source implementation of CLIP.

Stars: 13.7k · Forks: 1.3k · Last push: Apr 2026 · License: NOASSERTION
License
CI
Deps
Docker
  • Selected mlfoundations/open_clip as the strongest maintained implementation for new work.
  • Includes CI workflow signals.
  • Includes dependency/environment manifest signals.
  • Repository activity is within the last 24 months.

Reproduction Path

  1. Start with mlfoundations/open_clip and validate setup instructions in README.

  2. Reproduce the baseline result with the provided defaults before modifying hyperparameters.

  3. Log exact dependency versions and runtime environment for reproducibility.

Time to first repro: a few hours.
No repository-level red flags were detected, but paper-specific preprocessing and hyperparameter details may still be under-specified.
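
Step 3 of the reproduction path (logging dependency versions and the runtime environment) can be sketched with the Python standard library alone. This is a minimal illustration, not something the repository provides: the helper name `capture_environment` is hypothetical, and the package names in the usage block (`torch`, `torchvision`, `open_clip_torch`) are assumptions that should be replaced with whatever the repository's own manifest lists.

```python
import json
import platform
import sys
from importlib import metadata


def capture_environment(packages):
    """Return a dict describing the interpreter, OS, and package versions.

    `packages` is a list of distribution names (as used by pip). Packages
    that are not installed are recorded as None instead of being dropped,
    so a missing dependency stays visible in the log.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "machine": platform.machine(),
        "executable": sys.executable,
        "packages": versions,
    }


if __name__ == "__main__":
    # Assumed package names; adjust to the repository's dependency manifest.
    snapshot = capture_environment(["torch", "torchvision", "open_clip_torch"])
    print(json.dumps(snapshot, indent=2, sort_keys=True))
```

Saving this JSON next to each run's results makes it possible to tell later whether a difference in metrics comes from a changed hyperparameter or from a changed library version.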

Additional Implementations

Official

No additional official repositories detected.

Community

  • 🏭 Mega Scale Multimodal DataPipeline for SOTA Foundation Models

    Stars: 359 · Forks: 45 · Last push: Mar 2026 · License: MIT

Hugging Face Artifacts

No direct paper-linked artifacts were found. Showing strongest curated related artifacts.

Research Context