OpenTrain AI
Maintained implementation available · PyTorch · Pretrained models available

Dilated Neighborhood Attention Transformer

September 1, 2022 · arXiv: 2209.15001
3 repos · 733 stars · ~a few hours to reproduce

Abstract

Results & Benchmarks

Task | Dataset | Metric | Value
Transformer | ImageNet | Top-1 Accuracy | 83.2

Best Implementation

Fast Multi-dimensional Sparse Attention

Stars: 733 · Forks: 58 · Last push: Apr 2026 · License: MIT
License ✓
CI –
Deps ✓
Docker ✓
  • Selected shi-labs/natten as the strongest maintained implementation for new work.
  • Includes dependency/environment manifest signals.
  • Repository activity is within the last 24 months.
  • Official repository is preserved separately as historical context.

Reproduction Path

  1. Start with shi-labs/natten and validate setup instructions in README.

  2. Reproduce the baseline result with the provided defaults before modifying hyperparameters.

  3. Log exact dependency versions and runtime environment for reproducibility.
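
The logging step can be sketched with the Python standard library alone. This is a minimal illustration, not part of the repository: the helper name `collect_environment` is hypothetical, and the package names `torch` and `natten` are assumed to be the installed distribution names. Packages that are not installed are recorded as `None` rather than raising an error.

```python
import json
import platform
import sys
from importlib import metadata


def collect_environment(packages=("torch", "natten")):
    """Return a dict describing the interpreter, OS, and package versions.

    `packages` lists distribution names to look up; the defaults are
    assumptions and should be adjusted to match the actual environment.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
        "packages": versions,
    }


if __name__ == "__main__":
    # Print a stable, diff-friendly snapshot to store next to run results.
    print(json.dumps(collect_environment(), indent=2, sort_keys=True))
```

Saving this output alongside each run makes it easier to tell whether a later mismatch comes from the code or from a changed environment. GPU and driver details are not captured here and would need to be recorded separately.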

Time to first repro: a few hours · No CI workflows detected

Additional Implementations

Official

No additional official repositories detected.

Community

  • SHI-Labs/NATTEN · Confidence: low

    Fast Multi-dimensional Sparse Attention

    Stars: 733 · Forks: 58 · Last push: Apr 2026 · License: MIT

Hugging Face Artifacts

No direct paper-linked artifacts were found. Showing strongest curated related artifacts.

Research Context