How reproducible is "Fast Sparse Decision Tree Optimization via Reference Ensembles"?

Estimated time to first reproduction: a few hours. Risk flags: No maintained paper-verified implementation is currently available. This is primarily a method paper. Reproduce it within a maintained framework baseline instead of chasing paper-specific repos.

Fast Sparse Decision Tree Optimization via Reference Ensembles

Hayden McTavish, Chudi Zhong, Reto Achermann, Ilias Karimalis, Jacques Chen, Cynthia Rudin, Margo Seltzer

Published: Jun 28, 2022

No direct implementation yet

Evidence: Inferred

Domain fit: Niche / domain-specific

Verified repos: 0

No strong AI-core implementation/artifact signals were detected from current providers.

Time to first repro: a few hours

1 risk flag

DOI Publisher

Sparse decision tree optimization has been one of the most fundamental problems in AI since its inception and is a challenge at the core of interpretable machine learning. Sparse decision tree optimization is computationally hard, and despite steady effort since the 1960's, breakthroughs have been made on the problem only within the past few years, primarily on the problem of finding optimal sparse decision trees. Ho ...

Read full abstract

wever, current state-of-the-art algorithms often require impractical amounts of computation time and memory to find optimal or near-optimal trees for some real-world datasets, particularly those having several continuous-valued features. Given that the search spaces of these decision tree optimization problems are massive, can we practically hope to find a sparse decision tree that competes in accuracy with a black box machine learning model? We address this problem via smart guessing strategies that can be applied to any optimal branch-and-bound-based decision tree algorithm. The guesses come from knowledge gleaned from black box models. We show that by using these guesses, we can reduce the run time by multiple orders of magnitude while providing bounds on how far the resulting trees can deviate from the black box's accuracy and expressive power. Our approach enables guesses about how to bin continuous features, the size of the tree, and lower bounds on the error for the optimal decision tree. Our experiments show that in many cases we can rapidly construct sparse decision trees that match the accuracy of black box models. To summarize: <i>when you are having trouble optimizing</i>, <i>just guess</i>.

Technical details

Canonical key: doi-10.1609_aaai.v36i9.21194

Cache status: Stale (SWR served)

Generated at: Jun 18, 2026, 10:03 PM

Artifact coverage: sparse

HF provider: ok (mixed)

PWC source used: No

LLM status: not_generated

LLM model: n/a

LLM generated: Unknown

LLM content type: n/a

HF policy: hf-relevance-v27

context only

Benchmarks: missing

Time to repro: a few hours

1 risk flag

Results & Benchmarks

Freshness tier: cold

Direct + Inferred Evidence

No concrete benchmark grounding is available yet. Treat the page as context or an implementation starting point only.

Sparse decision tree optimization has been one of the most fundamental problems in AI since its inception and is a challenge at the core of interpretable machine learning.

Implementation Evidence Summary

Confidence: low

This is primarily a method paper. Reproduce it within a maintained framework baseline instead of chasing paper-specific repos.

Reproduction Risks

No maintained paper-verified implementation is currently available

Evidence disclosure

Evidence graph: 2 refs, 1 links.

Utility signals: depth 60/100, grounding 58/100, status medium.

Implementation Status

No verified maintained repo

There is no verified maintained implementation yet. Use this baseline plan to decide whether to prototype now or defer.

This is primarily a method paper. Reproduce it within a maintained framework baseline instead of chasing paper-specific repos.
Start with framework-native implementations (e.g. PyTorch optimizer module, Optax, or Transformers training loops).
Replicate the paper ablation settings first, then compare against modern baselines.

Time to first repro: a few hours

Reproduction readiness

No Repo

Time to first repro: hours

Last checked: Jun 18, 2026

No verified implementation available

· No maintained repository has been identified for this paper. Check adjacent implementations or HF artifacts below.

No benchmark numbers could be verified. You will not be able to validate reproduction correctness against published numbers.

Hugging Face artifacts

No trustworthy direct or curated related Hugging Face artifacts were found yet.

Continue with targeted Hugging Face searches derived from the paper title and method context:

Models

Optimization problem Decision tree Optimization problem model

Datasets

Decision tree dataset Optimization problem benchmark Optimization problem dataset

Spaces

Decision tree demo Optimization problem gradio Optimization problem demo

Tip: start with models, then check datasets/spaces if you need evaluation data or demos.

Direct artifact matches are currently sparse. Use targeted Hugging Face searches to quickly locate candidate models, datasets, and demos.

Search models Search datasets Search spaces

Research context

Citations

References

Tasks

Computer science, Decision tree, Tree (set theory), Black box, Incremental decision tree, Decision tree learning, Physical Sciences

Methods

Optimization problem, Mathematical optimization, Algorithm

Domains

Machine learning, Artificial intelligence

Evaluation & Human Feedback Data

Open this paper in HFEPX to review benchmark signals, evaluation modes, and human-feedback protocol context.

Open in HFEPX

Explore Similar Papers

Jump to Paper2Code search queries derived from this paper's research context.

Computer science Decision tree Tree (set theory) Black box Incremental decision tree Decision tree learning