OpenTrain AI
Maintained implementation available · PyTorch · Pretrained models available

LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training

June 1, 2024 · arXiv: 2406.16554
1 repo · 1,000 stars · ~a few hours to reproduce

Abstract

Best Implementation

⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)

1.0k 60 Dec 2024 Apache-2.0
  • Selected pjlab-sys4nlp/llama-moe as the strongest maintained implementation for new work.
  • Includes dependency/environment manifest signals.
  • Repository activity is within the last 24 months.

Reproduction Path

  1. Start with pjlab-sys4nlp/llama-moe and validate the setup instructions in the README.

  2. Reproduce the baseline result with the provided defaults before modifying hyperparameters.

  3. Log exact dependency versions and the runtime environment for reproducibility.
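
The logging step can be partly automated. Below is a minimal sketch that uses only the Python standard library to record the interpreter, operating system, and installed package versions in a JSON file. The function names `capture_environment` and `write_snapshot` and the file name `repro_env.json` are illustrative assumptions, not part of the pjlab-sys4nlp/llama-moe repository. The sketch does not capture GPU or driver details, which would need to be recorded separately.

```python
import json
import platform
import sys
from importlib import metadata


def capture_environment():
    """Collect interpreter, OS, and installed-package versions (hypothetical helper)."""
    packages = {}
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name")
            version = dist.version
        except Exception:
            # Skip distributions whose metadata cannot be read.
            continue
        if name:
            packages[name] = version
    return {
        "python": sys.version.split()[0],
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "machine": platform.machine(),
        # Sort case-insensitively so successive snapshots diff cleanly.
        "packages": dict(sorted(packages.items(), key=lambda kv: kv[0].lower())),
    }


def write_snapshot(path="repro_env.json"):
    """Write the environment snapshot to `path` and return it (file name is an assumption)."""
    snapshot = capture_environment()
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(snapshot, fh, indent=2)
    return snapshot


if __name__ == "__main__":
    snap = write_snapshot()
    print(f"Recorded {len(snap['packages'])} packages for Python {snap['python']}")
```

Running the script once before the baseline run and again after any dependency change gives two files that can be compared to spot version drift.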

Time to first repro: a few hours
No CI workflows detected

Additional Implementations

No additional verified repositories beyond the primary recommendation.

Hugging Face Artifacts

No direct paper-linked artifacts were found. Showing strongest curated related artifacts.

Curated Related