OpenTrain AI
Maintained implementation available | tf | Pretrained Models Available

Efficient Post-training Quantization with FP8 Formats

September 1, 2023 | arXiv: 2309.14592
2 repos | 2,609 stars | ~a few hours to reproduce

Abstract

Results & Benchmarks

Task          Dataset       Metric  Value
Quantization  Bert-Base     E5M2    0.9040
Quantization  Bert-Large    E5M2    0.6968
Quantization  ResNet-50     E5M2    0.7544
Quantization  DenseNet-121  E5M2    0.7435
Quantization  Wav2Vec2      E5M2    0.9632
Quantization  Funnel        E5M2    0.9215
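
For readers unfamiliar with the E5M2 label in the table: it is an 8-bit floating-point layout with 1 sign bit, 5 exponent bits, and 2 mantissa bits. The sketch below simulates rounding a Python float to the nearest E5M2 value. It is an illustration only, not the paper's or the repository's implementation. The function name `round_to_e5m2` is hypothetical, and clamping out-of-range values to the largest finite E5M2 value (57344) is an assumption; an implementation could instead overflow to infinity.

```python
import math

E5M2_MAX = 57344.0        # largest finite E5M2 value: 1.75 * 2**15
E5M2_MIN_EXP = -14        # smallest normal exponent (bias 15)
E5M2_MANTISSA_BITS = 2


def round_to_e5m2(x: float) -> float:
    """Round x to the nearest E5M2-representable value (simulation).

    Assumption: finite values beyond the E5M2 range are clamped
    (saturated) to +/-57344 rather than mapped to infinity.
    """
    if math.isnan(x) or math.isinf(x):
        return x
    magnitude = abs(x)
    # frexp gives magnitude = m * 2**e with m in [0.5, 1), so the
    # power-of-two exponent of the leading bit is e - 1.
    _, e = math.frexp(magnitude)
    exponent = max(e - 1, E5M2_MIN_EXP)   # subnormals share the minimum exponent
    # Spacing between neighbouring representable values at this exponent.
    quantum = 2.0 ** (exponent - E5M2_MANTISSA_BITS)
    # Python's round() uses round-half-to-even, matching IEEE-style rounding.
    rounded = round(magnitude / quantum) * quantum
    return math.copysign(min(rounded, E5M2_MAX), x)


if __name__ == "__main__":
    for value in (1.0, 1.2, 1.125, -1.375, 1e6, 1e-6):
        print(value, "->", round_to_e5m2(value))
```

With only two mantissa bits, values between 1 and 2 snap to 1.0, 1.25, 1.5, 1.75, or 2.0, which is why the quantized accuracy depends on how well a model tolerates coarse rounding.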

Best Implementation

SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime

2.6k 302 Apr 2026 Apache-2.0
License | CI | Deps | Docker
  • Selected intel/neural-compressor as the strongest maintained implementation for new work.
  • Includes CI workflow signals.
  • Includes dependency/environment manifest signals.
  • Repository activity is within the last 24 months.

Reproduction Path

  1. Start with intel/neural-compressor and validate setup instructions in README.

  2. Reproduce the baseline result with the provided defaults before modifying hyperparameters.

  3. Log exact dependency versions and runtime environment for reproducibility.
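
One possible way to carry out the logging step is sketched below using only the Python standard library. The helper name `snapshot_environment` and the package names in the usage example are illustrative assumptions; substitute whatever packages the README actually asks you to install.

```python
import json
import platform
import sys
from importlib import metadata


def snapshot_environment(packages):
    """Collect interpreter, OS, and installed package versions as a dict.

    Packages that are not installed are recorded as None instead of
    raising, so the snapshot is always written.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": versions,
    }


if __name__ == "__main__":
    # Hypothetical package list; adjust to the environment you actually use.
    snapshot = snapshot_environment(["neural-compressor", "torch", "onnxruntime"])
    print(json.dumps(snapshot, indent=2, sort_keys=True))
```

Saving this JSON output next to each run's results makes it easier to tell later whether a difference from the reported numbers comes from the environment or from a changed hyperparameter.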

Time to first repro: a few hours

No repository-level red flags were detected, but paper-specific preprocessing and hyperparameter details may still be under-specified.

Additional Implementations

No additional verified repositories beyond the primary recommendation.

Hugging Face Artifacts

No direct paper-linked artifacts were found. Showing strongest curated related artifacts.

Research Context