- Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning
Julian Minder, Clément Dumas, Caden Juang, Bilal Chugtai, Neel Nanda · Apr 3, 2025 · Citations: 0
Pairwise Preference
Using the BatchTopK crosscoder, we successfully identify a set of chat-specific latents that are both interpretable and causally effective, representing concepts such as false information and personal question, along with multiple…
- AnesSuite: A Comprehensive Benchmark and Dataset Suite for Anesthesiology Reasoning in LLMs
Xiang Feng, Wentao Jiang, Zengmao Wang, Yong Luo, Pingbo Xu · Apr 3, 2025 · Citations: 0
- Chain of Correction for Full-text Speech Recognition with Large Language Models
Zhiyuan Tang, Dong Wang, Zhikai Zhou, Yong Liu, Shen Huang · Apr 2, 2025 · Citations: 0
- Compositional-ARC: Assessing Systematic Generalization in Abstract Spatial Reasoning
Philipp Mondorf, Shijia Zhou, Monica Riedler, Barbara Plank · Apr 2, 2025 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning with Large Language Models
Xiaoke Huang, Juncheng Wu, Hui Liu, Xianfeng Tang, Yuyin Zhou · Apr 1, 2025 · Citations: 0
Our evaluation across diverse medical tasks demonstrates that test-time scaling consistently enhances medical reasoning, enabling lightweight fine-tuned models under 10B parameters to establish new state-of-the-art performance, while our…
- A Scalable Framework for Evaluating Health Language Models
Neil Mallinar, A. Ali Heydari, Xin Liu, Anthony Z. Faranesh, Brent Winslow · Mar 30, 2025 · Citations: 0
Rubric Rating · Expert Verification
As LLM-driven health applications are increasingly adopted, rigorous and efficient one-sided evaluation methodologies are crucial to ensure response quality across multiple dimensions, including accuracy, personalization and safety.
- More Bang for the Buck: Process Reward Modeling with Entropy-Driven Uncertainty
Lang Cao, Renhong Chen, Yingtian Zou, Chao Peng, Huacong Xu · Mar 28, 2025 · Citations: 0
Unlike previous Process Reward Models (PRMs) that rely on static partitioning and human labeling, EDU-PRM automatically anchors step boundaries at tokens with high predictive entropy, effectively capturing intrinsic logical transitions and…
- Lean Formalization of Generalization Error Bound by Rademacher Complexity and Dudley's Entropy Integral
Sho Sonoda, Kazumi Kasaura, Yuma Mizuno, Kei Tsukamoto, Naoto Onda · Mar 25, 2025 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- EconEvals: Benchmarks and Litmus Tests for Economic Decision-Making by LLM Agents
Sara Fish, Julia Shephard, Minkai Li, Ran I. Shorrer, Yannai A. Gonczarowski · Mar 24, 2025 · Citations: 0
We develop evaluation methods for measuring the economic decision-making capabilities and tendencies of LLMs.