OpenTrain AI
Maintained implementation available
Pretrained models available

NeuCo-Bench: A Novel Benchmark Framework for Neural Embeddings in Earth Observation

Rikard Vinge, Isabelle Wittmann, Jannik Schneider, Michael Marszalek, Luis Gilch +2 more

October 19, 2025, arXiv: 2510.17914
1 repo, 25 stars, ~a few hours to reproduce

Abstract

We introduce NeuCo-Bench, a novel benchmark framework for evaluating (lossy) neural compression and representation learning in the context of Earth Observation (EO). Our approach builds on fixed-size embeddings that act as compact, task-agnostic representations applicable to a broad range of downstream tasks. NeuCo-Bench comprises three components: (i) an evaluation pipeline built around embeddings, (ii) a challenge...

Summary

NeuCo-Bench is a benchmark framework with three components: an embedding-centric evaluation pipeline, a hidden-task challenge leaderboard that mitigates pretraining bias, and a scoring system that balances accuracy and stability. This page lists benchmark entries for the Crops and Landcover datasets. The reproduction guidance focuses on implementation viability and concrete risk controls.

Key Contributions

  • NeuCo-Bench is introduced as a benchmark framework with three components: an embedding-centric evaluation pipeline, a hidden-task challenge leaderboard that mitigates pretraining bias, and a scoring system that balances accuracy and stability.
  • The framework relies on fixed-size neural embeddings that serve as compact, lossy, task-agnostic representations for a broad range of Earth Observation downstream tasks.
  • To support reproducible evaluation, the authors release SSL4EO-S12-downstream, a curated multispectral and multitemporal Earth Observation dataset tailored for NeuCo-Bench downstream tasks.
  • NeuCo-Bench provides an initial suite of downstream tasks built on data cubes such as Crops and Landcover, each with defined spatial coverage, temporal coverage, label years, and task counts.
  • The benchmark includes a comparison of temporal aggregation strategies, evaluating pre-encoding versus post-encoding aggregation using average R²-based metrics across all downstream tasks.
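
The difference between pre-encoding and post-encoding temporal aggregation in the last bullet can be illustrated with a small sketch. This is not the NeuCo-Bench pipeline: the `encode` function, the random weights, the synthetic data cube, and the least-squares probe are all assumptions made for illustration, and the only point is that the two orderings generally yield different embeddings once the encoder is non-linear.

```python
import numpy as np

def encode(x, weights):
    """Toy stand-in encoder (assumption): fixed projection followed by tanh."""
    return np.tanh(x @ weights)

def pre_encoding_aggregation(cube, weights):
    """Average over time first, then encode once. cube has shape (N, T, C)."""
    return encode(cube.mean(axis=1), weights)

def post_encoding_aggregation(cube, weights):
    """Encode every timestep, then average the embeddings over time."""
    return encode(cube, weights).mean(axis=1)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def linear_probe_r2(train_x, train_y, test_x, test_y):
    """Fit a least-squares linear probe on embeddings, score on held-out data."""
    def add_bias(x):
        return np.hstack([x, np.ones((x.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(add_bias(train_x), train_y, rcond=None)
    return r2_score(test_y, add_bias(test_x) @ coef)

rng = np.random.default_rng(0)
cube = rng.normal(size=(200, 4, 12))   # synthetic: 200 samples, 4 timesteps, 12 channels
weights = rng.normal(size=(12, 8))     # hypothetical encoder weights, 8-dim embedding
target = cube.mean(axis=(1, 2)) + 0.1 * rng.normal(size=200)  # synthetic regression label

results = {}
for name, aggregate in [("pre", pre_encoding_aggregation),
                        ("post", post_encoding_aggregation)]:
    emb = aggregate(cube, weights)
    # Train the probe on the first 150 samples, evaluate on the remaining 50.
    results[name] = linear_probe_r2(emb[:150], target[:150], emb[150:], target[150:])
    print(name, emb.shape, round(results[name], 3))
```

With a real encoder, `encode` would be replaced by the model's forward pass, and the R² values would be averaged over all downstream tasks rather than computed on a single synthetic label.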

Implementation Guidance

No implementation has been verified as fully reliable. Follow the setup and evaluation blueprint extracted from the paper, then validate its assumptions against the benchmark evidence before claiming parity.

Reproducibility Notes

  • No repository-level reproducibility signals are currently available.
  • Estimate is based on paper-only reproduction flow.
  • No direct maintained implementation is currently verified.

Results & Benchmarks

Task | Dataset | Metric | Value
Novel Benchmark Framework Neural Embeddings Earth | Crops | Temporal Coverage | 2022
Novel Benchmark Framework Neural Embeddings Earth | Landcover | Temporal Coverage | 2018

Best Implementation

Welcome to NeuCo-Bench, a benchmarking framework for evaluating compressed embeddings on downstream tasks.

25 2 Feb 2026 Apache-2.0
License, CI, Deps, Docker
  • Selected embed2scale/NeuCo-Bench as the strongest maintained implementation for new work.
  • Includes CI workflow signals.
  • Includes dependency/environment manifest signals.
  • Repository activity is within the last 24 months.

Reproduction Path

  1. Start with embed2scale/NeuCo-Bench and validate the setup instructions in the README.

  2. Reproduce the baseline result with the provided defaults before modifying hyperparameters.

  3. Log exact dependency versions and the runtime environment for reproducibility.
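
For the last step, one lightweight option is to snapshot the interpreter, platform, and installed package versions next to each run. The sketch below uses only the Python standard library; the function names and the output file name `environment_snapshot.json` are hypothetical choices and are not part of NeuCo-Bench.

```python
import json
import platform
import sys
from importlib import metadata

def snapshot_environment():
    """Collect interpreter, OS, and installed-package versions into one dict."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with missing metadata
            packages[name] = dist.version
    return {
        "python": sys.version,
        "platform": platform.platform(),
        "packages": dict(sorted(packages.items())),
    }

def write_snapshot(path="environment_snapshot.json"):
    """Write the snapshot as JSON (the file name is a hypothetical default)."""
    snapshot = snapshot_environment()
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(snapshot, fh, indent=2)
    return snapshot

if __name__ == "__main__":
    write_snapshot()
```

Storing this file alongside the evaluation outputs makes it easier to tell later whether a difference in results comes from the method or from a changed dependency.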

Time to first repro: a few hours

No repository-level red flags were detected, but paper-specific preprocessing and hyperparameter details may still be under-specified.

Additional Implementations

No additional verified repositories beyond the primary recommendation.

Hugging Face Artifacts

No direct paper-linked artifacts were found. Showing strongest curated related artifacts.

Curated Related