OpenTrain AI
Maintained implementation available | PyTorch | Pretrained models available

QuartzNet: Deep Automatic Speech Recognition with 1D Time-Channel Separable Convolutions

October 1, 2019 | arXiv: 1910.10261
1 repo | 17,068 stars | ~a few hours to reproduce
arXiv PDF

Abstract

Results & Benchmarks

Task | Dataset | Metric | Value
Deep Automatic Speech Recognition 1d Time-channel | LibriSpeech | WER | 18.9
Deep Automatic Speech Recognition 1d Time-channel | WSJ | WER | 26.5
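
The table reports word error rate (WER) but does not say how it was computed, which evaluation split was used, or whether the values are percentages. As a minimal sketch, assuming the standard definition (word-level edit distance divided by the number of reference words) and plain whitespace tokenization, WER can be computed as follows. The function name `word_error_rate` is illustrative and is not taken from the paper or from NVIDIA/NeMo.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly;
    any text normalization (casing, punctuation) is left to the caller.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

Multiplying the result by 100 gives a percentage. Because the count of errors is divided by the reference length only, WER can exceed 1.0 when the hypothesis contains many inserted words.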

Best Implementation

NVIDIA/NeMo: A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech).

17.1k stars | 3.4k forks | Apr 2026 | Apache-2.0
License | CI | Deps | Docker
  • Selected NVIDIA/NeMo as the strongest maintained implementation for new work.
  • Includes CI workflow signals.
  • Includes dependency/environment manifest signals.
  • Repository activity is within the last 24 months.

Reproduction Path

  1. Start with NVIDIA/NeMo and validate setup instructions in README.
  2. Reproduce the baseline result with the provided defaults before modifying hyperparameters.
  3. Log exact dependency versions and runtime environment for reproducibility.
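
The logging step above can be sketched with the Python standard library alone. This is a minimal example and not part of NVIDIA/NeMo: the function name `snapshot_environment` is hypothetical, and the default package names (`torch`, `nemo_toolkit`) are assumptions about what is installed, so adjust the list to match the actual environment.

```python
import json
import platform
import sys
from importlib import metadata


def snapshot_environment(packages=("torch", "nemo_toolkit")):
    """Collect interpreter, OS, and package versions as a plain dict.

    Assumption: `packages` lists installed distribution names; any name
    that cannot be found is recorded as None instead of raising.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None

    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
        "packages": versions,
    }


if __name__ == "__main__":
    # Print a JSON record that can be stored next to the experiment output.
    print(json.dumps(snapshot_environment(), indent=2, sort_keys=True))
```

Saving this record alongside each run makes it possible to tell later whether a change in results came from the code or from the environment. It does not capture GPU driver or CUDA versions, which would need to be recorded separately.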

Time to first repro: a few hours

No repository-level red flags were detected, but paper-specific preprocessing and hyperparameter details may still be under-specified.

Additional Implementations

No additional verified repositories beyond the primary recommendation.

Hugging Face Artifacts

No direct paper-linked artifacts were found. Showing strongest curated related artifacts.