Results & Benchmarks
| Task | Model / Setting | Metric | Value |
|---|---|---|---|
| Representation learning | LLaMA-Omni2-7B | ASR-WER | 3.26 |
| Representation learning | w/o Gate Fusion | ASR-WER | 4.89 |
| Representation learning | w/o Text Embedding | ASR-WER | 6.83 |
| Language modeling | Streaming TTS | ASR-WER | 3.26 |
| Language modeling | Offline TTS | ASR-WER | 3.51 |
| Language modeling | Text Pretrained | ASR-WER | 10.34 |
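For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate (WER) is typically computed: the word-level edit distance between a reference text and an ASR transcript, divided by the number of reference words. The function name `word_error_rate`, the lowercase-and-whitespace normalization, and the percentage scaling are assumptions for illustration; the page does not state which ASR model, text normalization, or unit the reported values use.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage (hypothetical helper, not from the repository)."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return 100.0 * prev[-1] / len(ref)
```

For example, `word_error_rate("a b c d", "a x c d")` counts one substitution over four reference words. Production evaluations usually apply a more careful text normalizer before scoring, so numbers from this sketch are not directly comparable to the table.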
Best Implementation
- Selected ictnlp/llama-omni2 as the strongest maintained implementation for new work.
- The repository includes a dependency or environment manifest.
- Repository activity is within the last 24 months.
Reproduction Path
1. Start with ictnlp/llama-omni2 and validate setup instructions in README.
2. Reproduce the baseline result with the provided defaults before modifying hyperparameters.
3. Log exact dependency versions and runtime environment for reproducibility.
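The logging step above can be sketched with the Python standard library alone. This is an illustrative helper, not part of ictnlp/llama-omni2: the function name `capture_environment` and the package names passed to it (`torch`, `transformers`) are assumptions, so substitute whatever the repository's manifest actually lists.

```python
import json
import platform
import sys
from importlib import metadata


def capture_environment(packages):
    """Collect interpreter, OS, and installed package versions (hypothetical helper)."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages explicitly instead of failing.
            versions[name] = None
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
        "packages": versions,
    }


if __name__ == "__main__":
    # Package names here are assumed examples, not taken from the repository.
    snapshot = capture_environment(["torch", "transformers"])
    print(json.dumps(snapshot, indent=2, sort_keys=True))
```

Saving this JSON next to each run's results makes it easier to tell later whether a changed number comes from the model or from the environment.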
- Time to first repro: a few hours
- License metadata missing
- No CI workflows detected
Additional Implementations
No additional verified repositories beyond the primary recommendation.
Hugging Face Artifacts
No direct paper-linked artifacts were found. Showing strongest curated related artifacts.
Curated Related