Arcee Trinity Large Technical Report
Varun Singh, Lucas Krauss, Sami Jaghouar, Matej Sirovatka, Charles Goddard +21 more
Abstract
We present the technical report for Arcee Trinity Large, a sparse Mixture-of-Experts model with 400B total parameters and 13B activated per token. We also report on Trinity Nano (6B total parameters, 1B activated per token) and Trinity Mini (26B total parameters, 3B activated per token). The models' modern architecture includes interleaved local and global a...
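The total and activated parameter counts quoted in the abstract imply how sparse each model is per token. The arithmetic can be sketched as follows; the dictionary and the helper name `active_fraction` are illustrative and do not come from the paper.

```python
# Parameter counts in billions, as quoted in the abstract above.
MODELS = {
    "Trinity Large": (400, 13),
    "Trinity Mini": (26, 3),
    "Trinity Nano": (6, 1),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of a model's parameters that are activated for each token."""
    return active_b / total_b


for name, (total_b, active_b) in MODELS.items():
    share = active_fraction(total_b, active_b)
    print(f"{name}: {share:.2%} of parameters active per token")
```

By these figures, Trinity Large activates roughly 3% of its parameters per token, against roughly 12% for Trinity Mini and 17% for Trinity Nano.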
Results & Benchmarks
| Task | Dataset | Metric | Value |
|---|---|---|---|
| Transformer | C4 (zh) C/T | DeepSeek R1 | 1.54 |
Hardware Requirements
- Expect multi-day setup/compute for meaningful reproduction based on current guidance.
Best Implementation
Maintained implementation evidence is not confirmed for this paper yet.
Use the Implementation Status and Reproduction Path sections below for the current action plan.
Reproduction Path
Follow this baseline workflow to decide if this paper is worth immediate prototyping.
1. No maintained paper-verified implementation was found; start with the closest related repositories below.
2. Compare repo methods against the paper equations/algorithm before trusting metrics.
3. Create a minimal baseline implementation from the paper and use adjacent repos as references.
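As a possible starting point for the minimal baseline in the last step, the sketch below shows generic top-k expert routing, the mechanism behind "activated per token" in a sparse Mixture-of-Experts layer. It is a toy in plain Python, not the paper's router: the softmax gating, the renormalization over selected experts, and every name here (`top_k_route`, `moe_forward`, the scalar "experts") are assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are softmax probabilities renormalized over the selected experts,
    so they sum to 1. This is one common convention, assumed here.
    """
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_forward(x, experts, router_logits, k):
    """Weighted sum of the outputs of the selected experts only."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


# Hypothetical experts: each one just scales its input by a fixed factor.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
routes = top_k_route([0.1, 2.0, -1.0, 2.0], k=2)
output = moe_forward(1.0, experts, [0.1, 2.0, -1.0, 2.0], k=2)
```

Only the two selected experts are evaluated for the token; the other two contribute no compute, which is where the gap between total and activated parameters comes from.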
Framework baselines
- PyTorch Adam optimizer docs: reference implementation of Adam in PyTorch.
- Optax Adam optimizer docs: JAX/Flax baseline for Adam variants.
- Keras Adam optimizer docs: TensorFlow/Keras baseline for Adam.
- Hugging Face Transformers training guide: modern transformer training baseline.
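The optimizer baselines listed above all implement Adam. This page does not say which optimizer the paper trains with, so the following is only a dependency-free sketch of the standard Adam update for a single scalar parameter, useful for sanity-checking a framework implementation. The function name `adam_step` is illustrative; the default hyperparameters mirror the common defaults (learning rate 1e-3, betas 0.9 and 0.999, epsilon 1e-8).

```python
import math


def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Usage: minimize f(p) = p^2, whose gradient is 2p, starting from p = 1.
p, m, v = 1.0, 0.0, 0.0
for t in range(1, 201):
    p, m, v = adam_step(p, 2 * p, m, v, t, lr=0.1)
print(p)
```

On the first step the bias-corrected ratio reduces to roughly the sign of the gradient, so the parameter moves by about one learning rate regardless of gradient scale.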
Related Implementations
These are not paper-verified. Use them as reference points when no direct implementation is available.
Additional Implementations
No additional verified repositories beyond the primary recommendation.
Hugging Face Artifacts
No direct paper-linked artifacts were found. Showing strongest curated related artifacts.
- arcee-ai/Trinity-Large-Thinking (12.7k, 141)
- arcee-ai/Trinity-Large-Preview (1.3k, 171)
- arcee-ai/Trinity-Large-Preview-W4A16 (17.2k, 6)