Fast Sparse Decision Tree Optimization via Reference Ensembles
Hayden McTavish, Chudi Zhong, Reto Achermann, Ilias Karimalis, Jacques Chen, Cynthia Rudin, Margo Seltzer
No strong AI-core implementation/artifact signals were detected from current providers.
Sparse decision tree optimization has been one of the most fundamental problems in AI since its inception and is a challenge at the core of interpretable machine learning. Sparse decision tree optimization is computationally hard, and despite steady effort since the 1960's, breakthroughs have been made on the problem only within the past few years, primarily on the problem of finding optimal sparse decision trees. Ho ...
wever, current state-of-the-art algorithms often require impractical amounts of computation time and memory to find optimal or near-optimal trees for some real-world datasets, particularly those having several continuous-valued features. Given that the search spaces of these decision tree optimization problems are massive, can we practically hope to find a sparse decision tree that competes in accuracy with a black box machine learning model? We address this problem via smart guessing strategies that can be applied to any optimal branch-and-bound-based decision tree algorithm. The guesses come from knowledge gleaned from black box models. We show that by using these guesses, we can reduce the run time by multiple orders of magnitude while providing bounds on how far the resulting trees can deviate from the black box's accuracy and expressive power. Our approach enables guesses about how to bin continuous features, the size of the tree, and lower bounds on the error for the optimal decision tree. Our experiments show that in many cases we can rapidly construct sparse decision trees that match the accuracy of black box models. To summarize: <i>when you are having trouble optimizing</i>, <i>just guess</i>.
Results & Benchmarks
No concrete benchmark grounding is available yet. Treat the page as context or an implementation starting point only.
Sparse decision tree optimization has been one of the most fundamental problems in AI since its inception and is a challenge at the core of interpretable machine learning.
Implementation Evidence Summary
This is primarily a method paper. Reproduce it within a maintained framework baseline instead of chasing paper-specific repos.
Reproduction Risks
- No maintained paper-verified implementation is currently available
Evidence disclosure
Evidence graph: 2 refs, 1 links.
Utility signals: depth 60/100, grounding 58/100, status medium.
Implementation Status
There is no verified maintained implementation yet. Use this baseline plan to decide whether to prototype now or defer.
- This is primarily a method paper. Reproduce it within a maintained framework baseline instead of chasing paper-specific repos.
- Start with framework-native implementations (e.g. PyTorch optimizer module, Optax, or Transformers training loops).
- Replicate the paper ablation settings first, then compare against modern baselines.
Reproduction readiness
No verified implementation available
- · No maintained repository has been identified for this paper. Check adjacent implementations or HF artifacts below.
No benchmark numbers could be verified. You will not be able to validate reproduction correctness against published numbers.
Hugging Face artifacts
No trustworthy direct or curated related Hugging Face artifacts were found yet.
Continue with targeted Hugging Face searches derived from the paper title and method context:
Tip: start with models, then check datasets/spaces if you need evaluation data or demos.
Direct artifact matches are currently sparse. Use targeted Hugging Face searches to quickly locate candidate models, datasets, and demos.
Research context
27
Citations
51
References
Tasks
Computer science, Decision tree, Tree (set theory), Black box, Incremental decision tree, Decision tree learning, Physical Sciences
Methods
Optimization problem, Mathematical optimization, Algorithm
Domains
Machine learning, Artificial intelligence
Evaluation & Human Feedback Data
Open this paper in HFEPX to review benchmark signals, evaluation modes, and human-feedback protocol context.
Open in HFEPXExplore Similar Papers
Jump to Paper2Code search queries derived from this paper's research context.
Related papers
-
Search on Paper2Code
The Analysis and Application of the C4.5 Algorithm in Decision Tree Technology (2012) Semantic similarity
-
Search on Paper2Code
Survey on Data Mining Decision Tree Algorithms in Classification Techniques (2017) Semantic similarity
-
Search on Paper2Code
Impacts of Evaluation Methods on Classification Algorithm s Accuracy (2021) Semantic similarity
-
Search on Paper2Code
Research on Decision Tree Algorithm Based on Information Entropy (2011) Semantic similarity
-
Search on Paper2Code
Data-driven decision tree learning algorithm based on rough set theory (2005) Semantic similarity
-
Search on Paper2Code
The Analysis and Application of the C4.5 Algorithm in Decision Tree Technology (2012) Semantic similarity
Need human evaluators for your AI research? Scale annotation with expert AI Trainers.