Generalized test utilities for long-tail performance in extreme multi-label classification

Q: What is the best open-source implementation of "Generalized test utilities for long-tail performance in extreme multi-label classification"?

The best maintained implementation is mwydmuch/xcolumns with 2 stars on GitHub. Confidence: high. Reproducibility: Strong.

Q: How reproducible is "Generalized test utilities for long-tail performance in extreme multi-label classification"?

Estimated time to first reproduction: a few hours. Risk flag: the top repository has low community adoption. Start with mwydmuch/xcolumns and validate the setup instructions in its README.

Q: What framework is used to implement "Generalized test utilities for long-tail performance in extreme multi-label classification"?

The primary implementation uses PyTorch.

Published: Nov 1, 2023

Best maintained implementation now

Evidence: Direct

Domain fit: AI-adjacent

Verified repos: 1

Top repo stars: 2

Paper appears method- or tooling-adjacent to AI workflows with partial ecosystem coverage.

Framework: PyTorch

Time to first repro: a few hours

1 risk flag

Technical details

Canonical key: arxiv-2311.05081

Generated at: Jun 19, 2026, 4:07 PM

Results & Benchmarks

Direct + Inferred Evidence

No concrete benchmark grounding is available yet. Treat the page as context or an implementation starting point only.

Generalized test utilities for long-tail performance in extreme multi-label classification is the primary contribution described in this paper.

Use This Implementation Because…

Confidence: high

mwydmuch/xcolumns is the strongest maintained implementation based on ranking signals. CI workflows are present. License is declared (MIT).

Reproduction Risks

Top repository has low community adoption

Evidence disclosure

Evidence graph: 3 refs, 3 links.

Utility signals: depth 55/100, grounding 75/100, status medium.

Implementation Comparison

Top 1 paths

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

mwydmuch/xcolumns

best maintained

Maintenance: Recently updated

Confidence: High

Reproducibility: Strong

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 2
Last push: Jan 22, 2026 (149d ago)

CI · Releases · Dependencies

Risk flags

No Docker setup

Best implementation now

mwydmuch/xcolumns

Confidence: High

Reproducibility: Strong

Consistent Optimization of Label-wise Utilities in Multi-label classificatioN

Stars: 2

Forks: 2

Last push: Jan 22, 2026

License: MIT

Official implementation from Papers with Code

Repository link is mentioned in the paper metadata

Strong overlap with paper title keywords

License ✓

CI ✓

Deps ✓

Docker –

Selected mwydmuch/xcolumns as the strongest maintained implementation for new work.
Includes CI workflow signals.
Includes dependency/environment manifest signals.
Repository activity is within the last 24 months.

Reproduction readiness

Ready to Run

Time to first repro: a few hours

Last checked: Jun 19, 2026

Ready to reproduce

· Clone mwydmuch/xcolumns and install dependencies from pyproject.toml.
· CI pipeline detected — automated tests are in place.
· Last updated 149 days ago.

Quick start

git clone https://github.com/mwydmuch/xcolumns.git
cd xcolumns
pip install -e .

No benchmark numbers could be verified. You will not be able to validate reproduction correctness against published numbers.
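
Without published numbers to compare against, one option is to check a metric implementation against a toy case small enough to verify by hand. The sketch below is an illustration only and does not use the xcolumns API. It assumes that instance-wise precision@k and a macro-averaged (per-label) precision@k are the quantities of interest. The function names and the toy data are hypothetical.

```python
from typing import List, Sequence


def top_k(scores: Sequence[float], k: int) -> List[int]:
    """Indices of the k highest scores (ties broken by lower index)."""
    return sorted(range(len(scores)), key=lambda j: (-scores[j], j))[:k]


def instance_precision_at_k(y_true, scores, k: int) -> float:
    """Average over instances of (relevant labels in the top-k) / k."""
    total = 0.0
    for truth, s in zip(y_true, scores):
        total += sum(truth[j] for j in top_k(s, k)) / k
    return total / len(y_true)


def macro_precision_at_k(y_true, scores, k: int) -> float:
    """Average over labels of per-label precision, where each instance
    predicts exactly its top-k labels. Labels that are never predicted
    are assigned precision 0 (an assumption of this sketch)."""
    n_labels = len(y_true[0])
    true_pos = [0] * n_labels
    predicted = [0] * n_labels
    for truth, s in zip(y_true, scores):
        for j in top_k(s, k):
            predicted[j] += 1
            true_pos[j] += truth[j]
    per_label = [
        true_pos[j] / predicted[j] if predicted[j] else 0.0
        for j in range(n_labels)
    ]
    return sum(per_label) / n_labels


# Hypothetical toy data: 4 instances, 3 labels; label 0 is the head label.
Y_TRUE = [[1, 0, 0], [1, 0, 0], [1, 0, 1], [0, 1, 0]]
SCORES = [
    [0.9, 0.1, 0.0],
    [0.8, 0.1, 0.1],
    [0.7, 0.1, 0.2],
    [0.6, 0.3, 0.1],
]

if __name__ == "__main__":
    print("instance P@1:", instance_precision_at_k(Y_TRUE, SCORES, 1))
    print("macro P@1:   ", macro_precision_at_k(Y_TRUE, SCORES, 1))
```

In this toy case the scores always rank label 0 first, so the instance-wise value is 3/4 while the macro-averaged value is (3/4 + 0 + 0) / 3 = 1/4. The gap shows how a predictor that ignores tail labels can look strong on an instance-wise metric and weak on a label-wise one.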

Hugging Face artifacts

No direct paper-linked artifacts were found. Showing strongest curated related artifacts for faster exploration.

Models

No trustworthy model matches right now.

Datasets

asrar7787/dataset_mistral_instruct_commerce_operations-performance_best_practices_v1

Curated Related

Downloads: 454

Likes: 1

Updated: Dec 5, 2023

mstz/student_performance

Curated Related

Downloads: 126

Likes: 3

Updated: Oct 8, 2025

Spaces

Penguni/Analyze-and-predict-student-performance

Curated Related

Likes: 3

bardsai/performance-llm-board

Curated Related

Likes: 3
