Results & Benchmarks
| Task | Dataset | Metric | Value |
|---|---|---|---|
| Deep Automatic Speech Recognition 1d Time-channel | LibriSpeech | WER | 18.9 |
| Deep Automatic Speech Recognition 1d Time-channel | WSJ | WER | 26.5 |
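
For reference when comparing a reproduction against the WER values above, here is a minimal sketch of how word error rate is conventionally computed: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` is hypothetical, and the whitespace tokenization is an assumption. The page does not state which text normalization or scoring tool produced the reported numbers, so results from this sketch may differ from them.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly;
    no casing or punctuation normalization is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
score = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER: {100 * score:.1f}%")
```

Multiplying by 100 gives a percentage, which is how WER is usually reported in tables like the one above.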
Best Implementation
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Stars: 17.1k · Forks: 3.4k · Last updated: Apr 2026 · License: Apache-2.0
License ✓
CI ✓
Deps ✓
Docker –
- Selected NVIDIA/NeMo as the strongest maintained implementation for new work.
- Includes CI workflow signals.
- Includes dependency/environment manifest signals.
- Repository activity is within the last 24 months.
Reproduction Path
1. Start with NVIDIA/NeMo and validate setup instructions in README.
2. Reproduce the baseline result with the provided defaults before modifying hyperparameters.
3. Log exact dependency versions and runtime environment for reproducibility.
Time to first repro: a few hours
No repository-level red flags were detected, but paper-specific preprocessing and hyperparameter details may still be under-specified.
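
The step about logging dependency versions and the runtime environment can be sketched as below. This is a minimal illustration using only the Python standard library, not a procedure taken from the NeMo repository. The helper name `snapshot_environment` is hypothetical, and the package names `torch` and `nemo_toolkit` are assumptions about what a NeMo setup would have installed; substitute whatever your environment actually uses.

```python
import json
import platform
import sys
from importlib import metadata


def snapshot_environment(packages):
    """Collect interpreter, OS, and installed-package versions as a dict.

    Packages that are not installed are recorded as None instead of
    raising, so the snapshot still succeeds on a partial setup.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None

    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
        "packages": versions,
    }


# Assumed package names; adjust to match the environment being reproduced.
snapshot = snapshot_environment(["torch", "nemo_toolkit"])
print(json.dumps(snapshot, indent=2))
```

Saving this JSON next to each run's results makes it easier to tell later whether a difference in WER comes from the model or from a changed dependency.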
Additional Implementations
No additional verified repositories beyond the primary recommendation.
Hugging Face Artifacts
No direct paper-linked artifacts were found. Showing strongest curated related artifacts.
Curated Related