- Personalized RewardBench: Evaluating Reward Models with Human Aligned Personalization
Qiyao Ma, Dechen Gao, Rui Cai, Boqi Zhao, Hanchu Zhou · Apr 8, 2026 · Citations: 0
Pairwise Preference Rubric Rating Human Eval Automatic Metrics
Pluralistic alignment has emerged as a critical frontier in the development of Large Language Models (LLMs), with reward models (RMs) serving as a central mechanism for capturing diverse human values.
- TraceSafe: A Systematic Assessment of LLM Guardrails on Multi-Step Tool-Calling Trajectories
Yen-Shan Chen, Sian-Yao Huang, Cheng-Lin Yang, Yun-Nung Chen · Apr 8, 2026 · Citations: 0
Red Team Automatic Metrics Long Horizon
As large language models (LLMs) evolve from static chatbots into autonomous agents, the primary vulnerability surface shifts from final outputs to intermediate execution traces.
- RuleForge: Automated Generation and Validation for Web Vulnerability Detection at Scale
Ayush Garg, Sophia Hager, Jacob Montiel, Aditya Tiwari, Michael Gentile · Apr 2, 2026 · Citations: 0
Expert Verification Llm As Judge Automatic Metrics
This paper focuses on RuleForge's architecture and operational deployment for CVE-related threat detection, with particular emphasis on our novel LLM-as-a-judge (Large Language Model as judge) confidence validation system and systematic…
- Paper Reconstruction Evaluation: Evaluating Presentation and Hallucination in AI-written Papers
Atsuyuki Miyai, Mashiro Toyooka, Zaiying Zhao, Kenta Watanabe, Toshihiko Yamasaki · Apr 1, 2026 · Citations: 0
Rubric Rating Automatic Metrics
We introduce Paper Reconstruction Evaluation (PaperRecon), an evaluation framework in which an overview (overview.md) is created from an existing paper, after which an agent generates a full paper based on the overview and minimal…
- ReDAct: Uncertainty-Aware Deferral for LLM Agents
Dzianis Piatrashyn, Nikita Kotelevskii, Kirill Grishchenkov, Nikita Glazkov, Ivan Nasonov · Apr 8, 2026 · Citations: 0
Simulation Env Long Horizon
Recently, LLM-based agents have become increasingly popular across many applications, including complex sequential decision-making problems.
- Do Phone-Use Agents Respect Your Privacy?
Zhengyang Tang, Ke Ji, Xidong Wang, Zihan Ye, Xinyuan Wang · Apr 1, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
We study whether phone-use agents respect privacy while completing benign mobile tasks.
- LUDOBENCH: Evaluating LLM Behavioural Decision-Making Through Spot-Based Board Game Scenarios in Ludo
Ojas Jain, Dhruv Kumar · Apr 7, 2026 · Citations: 0
Simulation Env Multi Agent
We introduce LudoBench, a benchmark for evaluating LLM strategic reasoning in Ludo, a stochastic multi-agent board game whose dice mechanics, piece capture, safe-square navigation, and home-path progression introduce meaningful planning…
- Kernel-Smith: A Unified Recipe for Evolutionary Kernel Optimization
He Du, Qiming Ge, Jiakai Hu, Aijun Yang, Zheng Cai · Mar 30, 2026 · Citations: 0
Critique Edit Long Horizon
We present Kernel-Smith, a framework for high-performance GPU kernel and operator generation that combines a stable evaluation-driven evolutionary agent with an evolution-oriented post-training recipe.
- Wiggle and Go! System Identification for Zero-Shot Dynamic Rope Manipulation
Arthur Jakobsson, Abhinav Mahajan, Karthik Pullalarevu, Krishna Suresh, Yunchao Yao · Apr 23, 2026 · Citations: 0
Automatic Metrics Simulation Env Long Horizon
To mitigate this, we present a novel approach that leverages learned simulation priors to inform goal-conditioned dynamic manipulation of ropes for efficient and accurate task execution.
- QED-Nano: Teaching a Tiny Model to Prove Hard Theorems
LM-Provers, Yuxiao Qu, Amrith Setlur, Jasper Dekoninck, Edward Beeching · Apr 6, 2026 · Citations: 0
Rubric Rating Automatic Metrics
To support further research on open mathematical reasoning, we release the full QED-Nano pipeline, including the QED-Nano and QED-Nano-SFT models, the FineProofs-SFT and FineProofs-RL datasets, and the training and evaluation code.
- MemRerank: Preference Memory for Personalized Product Reranking
Zhiyuan Peng, Xuyang Wu, Huaixiao Tou, Yi Fang, Yu Gong · Mar 31, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
LLM-based shopping agents increasingly rely on long purchase histories and multi-turn interactions for personalization, yet naively appending raw history to prompts is often ineffective due to noise, length, and relevance mismatch.
- Large Language Model Post-Training: A Unified View of Off-Policy and On-Policy Learning
Shiwan Zhao, Zhihu Wang, Xuyang Zhao, Jiaming Zhou, Caiyue Xu · Apr 9, 2026 · Citations: 0
Pairwise Preference Long Horizon
Recent progress spans supervised fine-tuning (SFT), preference optimization, reinforcement learning (RL), process supervision, verifier-guided methods, distillation, and multi-stage pipelines.
- FlowForge: A Staged Local Rollout Engine for Flow-Field Prediction
Xiaowen Zhang, Ziming Zhou, Fengnian Zhao, David L. S. Hung · Apr 21, 2026 · Citations: 0
Automatic Metrics Long Horizon
We introduce FlowForge, a staged local rollout engine that predicts future flow fields by compiling a locality-preserving update schedule and executing it with a shared lightweight local predictor.
- S0 Tuning: Zero-Overhead Adaptation of Hybrid Recurrent-Attention Models
Jack Young · Apr 1, 2026 · Citations: 0
Automatic Metrics Long Horizon
Using roughly 48 execution-verified HumanEval training solutions, tuning a single initial state matrix per recurrent layer, with zero inference overhead, outperforms LoRA by +10.8 pp (p < 0.001) on HumanEval.
- Training-Free Dynamic Upcycling of Expert Language Models
Eros Fanì, Oğuzhan Ersoy · Mar 31, 2026 · Citations: 0
Expert Verification
To address these issues, we introduce Dynamic Upcycling MoE (DUME), a novel approach that reuses dense experts trained on different domains to construct a unified MoE model.
- ActionParty: Multi-Subject Action Binding in Generative Video Games
Alexander Pondaven, Ziyi Wu, Igor Gilitschenski, Philip Torr, Sergey Tulyakov · Apr 2, 2026 · Citations: 0
Automatic Metrics Simulation Env Multi Agent
However, these models are largely restricted to single-agent settings, failing to control multiple agents simultaneously in a scene.
- Reliable Self-Harm Risk Screening via Adaptive Multi-Agent LLM Systems
Meghana Karnam, Ananya Joshi · Apr 24, 2026 · Citations: 0
Llm As Judge Long Horizon
Emerging AI systems in behavioral health and psychiatry use multi-step or multi-agent LLM pipelines for tasks like assessing self-harm risk and screening for depression.
- Weakly Supervised Distillation of Hallucination Signals into Transformer Representations
Shoaib Sadiq Salehmohamed, Jinal Prashant Thakkar, Hansika Aredla, Shaik Mohammed Omar, Shalmali Ayachit · Apr 7, 2026 · Citations: 0
Llm As Judge Automatic Metrics
We introduce a weak supervision framework that combines three complementary grounding signals: substring matching, sentence embedding similarity, and an LLM as a judge verdict to label generated responses as grounded or hallucinated without…
- SkillX: Automatically Constructing Skill Knowledge Bases for Agents
Chenxi Wang, Zhuoyun Yu, Xin Xie, Wuguannan Yao, Runnan Fang · Apr 6, 2026 · Citations: 0
Automatic Metrics Long Horizon
Learning from experience is critical for building capable large language model (LLM) agents, yet prevailing self-evolving paradigms remain inefficient: agents learn in isolation, repeatedly rediscover similar behaviors from limited…
- SHAPE: Stage-aware Hierarchical Advantage via Potential Estimation for LLM Reasoning
Zhengyang Ai, Zikang Shan, Xiaodong Ai, Jingxian Tang, Hangkai Hu · Apr 8, 2026 · Citations: 0
Automatic Metrics Long Horizon
Extensive experiments in math reasoning across three base models and five benchmarks demonstrate that SHAPE achieves an average accuracy gain of 3% with 30% reduced token consumption.
- Unifying Group-Relative and Self-Distillation Policy Optimization via Sample Routing
Gengsheng Li, Tianyu Yang, Junfeng Fang, Mingyang Song, Mao Zheng · Apr 2, 2026 · Citations: 0
Automatic Metrics Long Horizon
Evaluated across five benchmarks and two model scales, SRPO achieves both the rapid early improvement of SDPO and the long-horizon stability of GRPO.
- LEO: Graph Attention Network based Hybrid Multi Sensor Extended Object Fusion and Tracking for Autonomous Driving Applications
Mayank Mayank, Bharanidhar Duraisamy, Florian Geiss · Apr 2, 2026 · Citations: 0
Automatic Metrics Long Horizon
Evaluations on the Mercedes-Benz DRIVE PILOT SAE L3 dataset demonstrate real-time computational efficiency suitable for production systems; additional validation on public datasets such as View of Delft (VoD) further confirms cross-dataset…
- Selecting Decision-Relevant Concepts in Reinforcement Learning
Naveen Raman, Stephanie Milani, Fei Fang · Apr 6, 2026 · Citations: 0
Expert Verification
Training interpretable concept-based policies requires practitioners to manually select which human-understandable concepts an agent should reason with when making sequential decisions.
- FourierMoE: Fourier Mixture-of-Experts Adaptation of Large Language Models
Juyong Jiang, Fan Wang, Hong Qi, Sunghun Kim, Jing Tang · Apr 2, 2026 · Citations: 0
Expert Verification
Extensive evaluations across 28 benchmarks, multiple model architectures, and scales demonstrate that FourierMoE consistently outperforms competitive baselines in both single-task and multi-task settings while using significantly fewer…
- A Survey of On-Policy Distillation for Large Language Models
Mingyang Song, Mao Zheng · Apr 1, 2026 · Citations: 0
Expert Verification Demonstrations
We systematically analyze representative methods, examine industrial deployments, and identify open problems including distillation scaling laws, uncertainty-aware feedback, and agent-level distillation.
- From High-Dimensional Spaces to Verifiable ODD Coverage for Safety-Critical AI-based Systems
Thomas Stefani, Johann Maximilian Christensen, Elena Hoemann, Frank Köster, Sven Hallerbach · Apr 2, 2026 · Citations: 0
Simulation Env Long Horizon
While Artificial Intelligence (AI) offers transformative potential for operational performance, its deployment in safety-critical domains such as aviation requires strict adherence to rigorous certification standards.
- Learning to Play Blackjack: A Curriculum Learning Perspective
Amirreza Alasti, Efe Erdal, Yücel Celik, Theresa Eimer · Mar 31, 2026 · Citations: 0
Automatic Metrics Simulation Env
We propose a novel framework that uses a Large Language Model (LLM) to dynamically generate a curriculum over available actions, enabling the agent to incorporate each action individually.
- The Detection-Extraction Gap: Models Know the Answer Before They Can Say It
Hanyang Wang, Mingxuan Zhu · Apr 8, 2026 · Citations: 0
Automatic Metrics Tool Use
Across five model configurations, two families, and three benchmarks, we find that 52–88% of chain-of-thought tokens are produced after the answer is recoverable from a partial prefix.
- Learning-augmented robotic automation for real-world manufacturing
Yunho Kim, Quan Nguyen, Taewhan Kim, Youngjin Heo, Joonho Lee · Apr 24, 2026 · Citations: 0
Demonstrations
Here we present Learning-Augmented Robotic Automation, a hybrid system that integrates learned task controllers and a neural 3D safety monitor into conventional industrial workflows.
- Preserve Support, Not Correspondence: Dynamic Routing for Offline Reinforcement Learning
Zhancun Mu, Guangyu Zhao, Yiwu Zhong, Chi Zhang · Apr 24, 2026 · Citations: 0
Demonstrations
We propose DROL, a latent-conditioned one-step actor trained with top-1 dynamic routing.
- Removing Sandbagging in LLMs by Training with Weak Supervision
Emil Ryd, Henning Bartsch, Julian Stastny, Joe Benton, Vivek Hebbar · Apr 23, 2026 · Citations: 0
Demonstrations
As AI systems begin to automate complex tasks, supervision increasingly relies on weaker models or limited human oversight that cannot fully verify output quality.
- Shared Lexical Task Representations Explain Behavioral Variability In LLMs
Zhuonan Yang, Jacob Xiaochen Li, Francisco Piedrahita Velez, Eric Todd, David Bau · Apr 23, 2026 · Citations: 0
Demonstrations
One of the most common complaints about large language models (LLMs) is their prompt sensitivity, that is, the fact that their ability to perform a task or provide a correct answer to a question can depend unpredictably on the way the…
- When Prompts Override Vision: Prompt-Induced Hallucinations in LVLMs
Pegah Khayatan, Jayneel Parekh, Arnaud Dapogny, Mustafa Shukor, Alasdair Newson · Apr 23, 2026 · Citations: 0
- TingIS: Real-time Risk Event Discovery from Noisy Customer Incidents at Enterprise Scale
Jun Wang, Ziyin Zhang, Rui Wang, Hang Yu, Peng Di · Apr 23, 2026 · Citations: 0
- StructMem: Structured Memory for Long-Horizon Behavior in LLMs
Buqiang Xu, Yijun Chen, Jizhan Fang, Ruobin Zhong, Yunzhi Yao · Apr 23, 2026 · Citations: 0
- Fixation Sequences as Time Series: A Topological Approach to Dyslexia Detection
Marius Huber, David R. Reich, Lena A. Jäger · Apr 23, 2026 · Citations: 0
- Generalizing Numerical Reasoning in Table Data through Operation Sketches and Self-Supervised Learning
Hanjun Cho, Gahyun Yoo, Hanseong Kim, Jay-Yoon Lee · Apr 23, 2026 · Citations: 0
- Cross-Domain Data Selection and Augmentation for Automatic Compliance Detection
Fariz Ikhwantri, Dusica Marijan · Apr 23, 2026 · Citations: 0
- mcdok at SemEval-2026 Task 13: Finetuning LLMs for Detection of Machine-Generated Code
Adam Skurla, Dominik Macko, Jakub Simko · Apr 23, 2026 · Citations: 0
- Beyond Single Plots: A Benchmark for Question Answering on Multi-Charts
Azher Ahmed Efat, Seok Hwan Song, Wallapak Tavanapong · Apr 23, 2026 · Citations: 0
- Sub-Token Routing in LoRA for Adaptation and Query-Aware KV Compression
Wei Jiang, Wei Wang · Apr 23, 2026 · Citations: 0
- Ideological Bias in LLMs' Economic Causal Reasoning
Donggyu Lee, Hyeok Yun, Jungwon Kim, Junsik Min, Sungwon Park · Apr 23, 2026 · Citations: 0
- Understanding and Mitigating Spurious Signal Amplification in Test-Time Reinforcement Learning for Math Reasoning
Yongcan Yu, Lingxiao He, Jian Liang, Kuangpu Guo, Meng Wang · Apr 23, 2026 · Citations: 0
- Explainable Disentangled Representation Learning for Generalizable Authorship Attribution in the Era of Generative AI
Hieu Man, Van-Cuong Pham, Nghia Trung Ngo, Franck Dernoncourt, Thien Huu Nguyen · Apr 23, 2026 · Citations: 0
- Cross-Entropy Is Load-Bearing: A Pre-Registered Scope Test of the K-Way Energy Probe on Bidirectional Predictive Coding
Jon-Paul Cacioli · Apr 23, 2026 · Citations: 0
- Hyperloop Transformers
Abbas Zeitoun, Lucas Torroba-Hennigen, Yoon Kim · Apr 23, 2026 · Citations: 0
- Learning Dynamic Representations and Policies from Multimodal Clinical Time-Series with Informative Missingness
Zihan Liang, Ziwen Pan, Ruoxuan Xiong · Apr 23, 2026 · Citations: 0
- Adaptive Instruction Composition for Automated LLM Red-Teaming
Jesse Zymet, Andy Luo, Swapnil Shinde, Sahil Wadhwa, Emily Chen · Apr 22, 2026 · Citations: 0
- Slot Machines: How LLMs Keep Track of Multiple Entities
Paul C. Bogdan, Jack Lindsey · Apr 22, 2026 · Citations: 0
- Cross-Session Threats in AI Agents: Benchmark, Evaluation, and Algorithms
Ari Azarafrooz · Apr 22, 2026 · Citations: 0
- TabSHAP
Aryan Chaudhary, Prateek Agarwal, Tejasvi Alladi · Apr 22, 2026 · Citations: 0
- How Much Is One Recurrence Worth? Iso-Depth Scaling Laws for Looped Language Models
Kristian Schwethelm, Daniel Rueckert, Georgios Kaissis · Apr 22, 2026 · Citations: 0
- Weighting What Matters: Boosting Sample Efficiency in Medical Report Generation via Token Reweighting
Alexander Weers, Daniel Rueckert, Martin J. Menten · Apr 22, 2026 · Citations: 0
- DWTSumm: Discrete Wavelet Transform for Document Summarization
Rana Salama, Abdou Youssef, Mona Diab · Apr 22, 2026 · Citations: 0
- Convergent Evolution: How Different Language Models Learn Similar Number Representations
Deqing Fu, Tianyi Zhou, Mikhail Belkin, Vatsal Sharan, Robin Jia · Apr 22, 2026 · Citations: 0
- Working Memory Constraints Scaffold Learning in Transformers under Data Scarcity
Pranava Madhyastha, Dagmar Adamcova · Apr 22, 2026 · Citations: 0
- COMPASS: COntinual Multilingual PEFT with Adaptive Semantic Sampling
Noah Flynn · Apr 22, 2026 · Citations: 0
- Self-Aware Vector Embeddings for Retrieval-Augmented Generation: A Neuroscience-Inspired Framework for Temporal, Confidence-Weighted, and Relational Knowledge
Naizhong Xu · Apr 22, 2026 · Citations: 0
- CHASM: Unveiling Covert Advertisements on Chinese Social Media
Jingyi Zheng, Tianyi Hu, Yule Liu, Zhen Sun, Zongmin Zhang · Apr 22, 2026 · Citations: 0
- MOMO: A framework for seamless physical, verbal, and graphical robot skill learning and adaptation
Markus Knauer, Edoardo Fiorini, Maximilian Mühlbauer, Stefan Schneyer, Promwat Angsuratanawech · Apr 22, 2026 · Citations: 0