- Vega: Learning to Drive with Natural Language Instructions
Sicheng Zuo, Yuxuan Li, Wenzhao Zheng, Zheng Zhu, Jie Zhou · Mar 26, 2026 · Citations: 0
- Drive My Way: Preference Alignment of Vision-Language-Action Model for Personalized Driving
Zehao Wang, Huaide Jiang, Shuaiwu Dong, Yuping Wang, Hang Qiu · Mar 26, 2026 · Citations: 0
- Training the Knowledge Base through Evidence Distillation and Write-Back Enrichment
Yuxing Lu, Xukai Zhao, Wei Wu, Jinzhuo Wang · Mar 26, 2026 · Citations: 0
Across four RAG methods, six benchmarks, and two LLM backbones, WriteBack-RAG improves every evaluated setting, with gains averaging +2.14%.
- PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference
Xiaofeng Mao, Shaohao Rui, Kaining Ying, Bo Zheng, Chuanhao Li · Mar 26, 2026 · Citations: 0
- PixelSmile: Toward Fine-Grained Facial Expression Editing
Jiabin Hua, Hengyuan Xu, Aojie Li, Wei Cheng, Gang Yu · Mar 26, 2026 · Citations: 0
- Back to Basics: Revisiting ASR in the Age of Voice Agents
Geeyang Tay, Wentao Ma, Jaewon Lee, Yuzhi Tang, Daniel Lee · Mar 26, 2026 · Citations: 0
- Natural-Language Agent Harnesses
Linyue Pan, Lexiao Zou, Shuo Guo, Jingchen Ni, Hai-Tao Zheng · Mar 26, 2026 · Citations: 0
Agent performance increasingly depends on harness engineering, yet harness design is usually buried in controller code and runtime-specific conventions, making it hard to transfer, compare, and study as a scientific object.
- R-C2: Cycle-Consistent Reinforcement Learning Improves Multimodal Reasoning
Zirui Zhang, Haoyu Dong, Kexin Pei, Chengzhi Mao · Mar 26, 2026 · Citations: 0
- Agent Factories for High Level Synthesis: How Far Can General-Purpose Coding Agents Go in Hardware Optimization?
Abhishek Bhandwaldar, Mihir Choudhury, Ruchir Puri, Akash Srivastava · Mar 26, 2026 · Citations: 0
- Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models
Kaijin Chen, Dingkang Liang, Xin Zhou, Yikang Ding, Xiaoqiang Liu · Mar 26, 2026 · Citations: 0
- S2D2: Fast Decoding for Diffusion LLMs via Training-Free Self-Speculation
Ligong Han, Hao Wang, Han Gao, Kai Xu, Akash Srivastava · Mar 26, 2026 · Citations: 0
Long Horizon
We present S2D2, a training-free self-speculative decoding framework for block-diffusion language models.
- Neural Network Conversion of Machine Learning Pipelines
Man-Ling Sung, Jan Silovsky, Man-Hung Siu, Herbert Gish, Chinnu Pittapally · Mar 26, 2026 · Citations: 0
- The Kitchen Loop: User-Spec-Driven Development for a Self-Evolving Codebase
Yannick Roy · Mar 26, 2026 · Citations: 0
- A Unified Memory Perspective for Probabilistic Trustworthy AI
Xueji Zhao, Likai Pei, Jianbo Liu, Kai Ni, Ningyuan Cao · Mar 26, 2026 · Citations: 0
- Just Zoom In: Cross-View Geo-Localization via Autoregressive Zooming
Yunus Talha Erzurumlu, Jiyong Kwag, Alper Yilmaz · Mar 26, 2026 · Citations: 0
- Self-Improvement of Large Language Models: A Technical Overview and Future Outlook
Haoyan Yang, Mario Xerri, Solha Park, Huajian Zhang, Yiyang Feng · Mar 26, 2026 · Citations: 0
As large language models (LLMs) continue to advance, improving them solely through human supervision is becoming increasingly costly and limited in scalability.
- Measuring What Matters -- or What's Convenient?: Robustness of LLM-Based Scoring Systems to Construct-Irrelevant Factors
Cole Walsh, Rodica Ivan · Mar 26, 2026 · Citations: 0
These systems commonly achieve performance levels comparable or superior to those of trained human raters, but have frequently been demonstrated to be vulnerable to the influence of construct-irrelevant factors (i.e., features of responses that…
- A Mentalistic Interface for Probing Folk-Psychological Attribution to Non-Humanoid Robots
Giulio Pisaneschi, Pierpaolo Serio, Estelle Gerbier, Andrea Dan Ryals, Lorenzo Pollini · Mar 26, 2026 · Citations: 0
- RenoBench: A Citation Parsing Benchmark
Parth Sarin, Juan Pablo Alperin, Adam Buttrick, Dione Mentis · Mar 26, 2026 · Citations: 0
Despite sustained interest in citation parsing, existing evaluation techniques are often not generalizable, based on synthetic data, or not publicly available.
- Beyond Via: Analysis and Estimation of the Impact of Large Language Models in Academic Papers
Mingmeng Geng, Yuhang Dong, Thierry Poibeau · Mar 26, 2026 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- Is Mathematical Problem-Solving Expertise in Large Language Models Associated with Assessment Performance?
Liang Zhang, Yu Fu, Xinyi Jin · Mar 26, 2026 · Citations: 0
- Visual or Textual: Effects of Explanation Format and Personal Characteristics on the Perception of Explanations in an Educational Recommender System
Qurat Ul Ain, Mohamed Amine Chatti, Nasim Yazdian Varjani, Farah Kamal, Astrid Rosenthal-von der Pütten · Mar 26, 2026 · Citations: 0
- PICon: A Multi-Turn Interrogation Framework for Evaluating Persona Agent Consistency
Minseo Kim, Sujeong Im, Junseong Choi, Junhee Lee, Chaeeun Shim · Mar 26, 2026 · Citations: 0
Large language model (LLM)-based persona agents are rapidly being adopted as scalable proxies for human participants across diverse domains.
- Demographic Fairness in Multimodal LLMs: A Benchmark of Gender and Ethnicity Bias in Face Verification
Ünsal Öztürk, Hatef Otroshi Shahreza, Sébastien Marcel · Mar 26, 2026 · Citations: 0
- DeepFAN, a transformer-based deep learning model for human-artificial intelligence collaborative assessment of incidental pulmonary nodules in CT scans: a multi-reader, multi-case trial
Zhenchen Zhu, Ge Hu, Weixiong Tan, Kai Gao, Chao Sun · Mar 26, 2026 · Citations: 0
- TAAC: A gate into Trustable Audio Affective Computing
Xintao Hu, Feng-Qi Cui · Mar 26, 2026 · Citations: 0
- Are LLMs Overkill for Databases?: A Study on the Finiteness of SQL
Yue Li, David Mimno, Unso Eun Seo Jo · Mar 26, 2026 · Citations: 0
- Revisiting On-Policy Distillation: Empirical Failure Modes and Simple Fixes
Yuqian Fu, Haohuan Huang, Kaiwen Jiang, Yuanheng Zhu, Dongbin Zhao · Mar 26, 2026 · Citations: 0
Long Horizon
Across single-task math reasoning and multi-task agentic-plus-math training, this objective yields more stable optimization and better downstream performance than sampled-token OPD.
- Voxtral TTS
Mistral-AI, Alexander H. Liu, Alexis Tacnet, Andy Ehrenberg · Mar 26, 2026 · Citations: 0
In human evaluations conducted by native speakers, Voxtral TTS is preferred for multilingual voice cloning due to its naturalness and expressivity, achieving a 68.4% win rate over ElevenLabs Flash v2.5.
- Humans vs Vision-Language Models: A Unified Measure of Narrative Coherence
Nikolai Ilinykh, Hyewon Jang, Shalom Lappin, Asad Sayeed, Sharid Loáiciga · Mar 26, 2026 · Citations: 0
We study narrative coherence in visually grounded stories by comparing human-written narratives with those generated by vision-language models (VLMs) on the Visual Writing Prompts corpus.
- Synchronous Signal Temporal Logic for Decidable Verification of Cyber-Physical Systems
Partha Roop, Sobhan Chatterjee, Avinash Malik, Nathan Allen, Logan Kenwright · Mar 26, 2026 · Citations: 0
We propose Synchronous Signal Temporal Logic (SSTL), a decidable fragment of STL, which admits static safety and liveness property verification.
- CHIRP dataset: towards long-term, individual-level, behavioral monitoring of bird populations in the wild
Alex Hoi Hang Chan, Neha Singhal, Onur Kocahan, Andrea Meltzer, Saverio Lubrano · Mar 26, 2026 · Citations: 0
- NERO-Net: A Neuroevolutionary Approach for the Design of Adversarially Robust CNNs
Inês Valentim, Nuno Antunes, Nuno Lourenço · Mar 26, 2026 · Citations: 0
- Challenges in Hyperspectral Imaging for Autonomous Driving: The HSI-Drive Case
Koldo Basterretxea, Jon Gutiérrez-Zaballa, Javier Echanobe · Mar 26, 2026 · Citations: 0
- Lightweight GenAI for Network Traffic Synthesis: Fidelity, Augmentation, and Classification
Giampaolo Bovenzi, Domenico Ciuonzo, Jonatan Krolikowski, Antonio Montieri, Alfredo Nascita · Mar 26, 2026 · Citations: 0
- An Experimental Comparison of the Most Popular Approaches to Fake News Detection
Pietro Dell'Oglio, Alessandro Bondielli, Francesco Marcelloni, Lucia C. Passaro · Mar 26, 2026 · Citations: 0
We address text-only English fake news detection as a binary classification task by harmonizing labels into "Real" and "Fake" to ensure a consistent evaluation protocol.
- EcoThink: A Green Adaptive Inference Framework for Sustainable and Accessible Agents
Linxiao Li, Zhixiang Lu · Mar 26, 2026 · Citations: 0
- Interpretable PM2.5 Forecasting for Urban Air Quality: A Comparative Study of Operational Time-Series Models
Moazzam Umer Gondal, Hamad ul Qudous, Asma Ahmad Farhan, Sultan Alamri · Mar 26, 2026 · Citations: 0
- Translation Asymmetry in LLMs as a Data Augmentation Factor: A Case Study for 6 Romansh Language Varieties
Jannis Vamvas, Ignacio Pérez Prat, Angela Heldstab, Dominic P. Fischer, Sina Ahmadi · Mar 26, 2026 · Citations: 0
A human evaluation confirms that our experiments yield the first model that generates fluent translations in the individual Romansh varieties.
- Retraining as Approximate Bayesian Inference
Harrison Katz · Mar 26, 2026 · Citations: 0
- Maximum Entropy Behavior Exploration for Sim2Real Zero-Shot Reinforcement Learning
Jiajun Hu, Nuria Armengol Urpi, Jin Cheng, Stelian Coros · Mar 26, 2026 · Citations: 0
- Temporally Decoupled Diffusion Planning for Autonomous Driving
Xiang Li, Bikun Wang, John Zhang, Jianjun Wang · Mar 26, 2026 · Citations: 0
- Cross-Model Disagreement as a Label-Free Correctness Signal
Matt Gorbett, Suman Jana · Mar 26, 2026 · Citations: 0
- From Manipulation to Mistrust: Explaining Diverse Micro-Video Misinformation for Robust Debunking in the Wild
Zhi Zeng, Yifei Yang, Jiaying Wu, Xulang Zhang, Xiangzheng Kong · Mar 26, 2026 · Citations: 0
- Navigating the Prompt Space: Improving LLM Classification of Social Science Texts Through Prompt Engineering
Erkan Gunes, Christoffer Florczak, Tevfik Murat Yildirim · Mar 26, 2026 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- TAPO: Translation Augmented Policy Optimization for Multilingual Mathematical Reasoning
Xu Huang, Zhejian Lai, Zixian Huang, Jiajun Chen, Shujian Huang · Mar 26, 2026 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- Modernising Reinforcement Learning-Based Navigation for Embodied Semantic Scene Graph Generation
Roman Kueble, Marco Hueller, Mrunmai Phatak, Rainer Lienhart, Joerg Haehner · Mar 26, 2026 · Citations: 0
- Decidable By Construction: Design-Time Verification for Trustworthy AI
Houston Haynes · Mar 26, 2026 · Citations: 0
- Beyond Content Safety: Real-Time Monitoring for Reasoning Vulnerabilities in Large Language Models
Xunguang Wang, Yuguang Zhou, Qingyue Wang, Zongjie Li, Ruixuan Huang · Mar 26, 2026 · Citations: 0
- System Design for Maintaining Internal State Consistency in Long-Horizon Robotic Tabletop Games
Guangyu Zhao, Ceyao Zhang, Chengdong Ma, Tao Wu, Yiyang Song · Mar 26, 2026 · Citations: 0
- Shape and Substance: Dual-Layer Side-Channel Attacks on Local Vision-Language Models
Eyal Hadad, Mordechai Guri · Mar 26, 2026 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- A Causal Framework for Evaluating ICU Discharge Strategies
Sagar Nagaraj Simha, Juliette Ortholand, Dave Dongelmans, Jessica D. Workum, Olivier W. M. Thijssens · Mar 26, 2026 · Citations: 0
This can be conceived as an optimal stopping scenario with three added challenges: 1) the evaluation of a stopping strategy from observational data is itself a complex causal inference problem, 2) the composite objective is to minimize the…
- GlowQ: Group-Shared LOw-Rank Approximation for Quantized LLMs
Selim An, Il hong Suh, Yeseong Kim · Mar 26, 2026 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- Does Structured Intent Representation Generalize? A Cross-Language, Cross-Model Empirical Study of 5W3H Prompting
Peng Gang · Mar 26, 2026 · Citations: 0
We study PPS (Prompt Protocol Specification), a 5W3H-based framework for structured intent representation in human-AI interaction, and extend prior Chinese-only evidence along three dimensions: two additional languages (English and…
- Supercharging Federated Intelligence Retrieval
Dimitris Stripelis, Patrick Foley, Mohammad Naseri, William Lindskog-Münzing, Chong Shen Ng · Mar 26, 2026 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- Integrating Deep RL and Bayesian Inference for ObjectNav in Mobile Robotics
João Castelo-Branco, José Santos-Victor, Alexandre Bernardino · Mar 26, 2026 · Citations: 0
Web Browsing
Autonomous object search is challenging for mobile robots operating in indoor environments due to partial observability, perceptual uncertainty, and the need to trade off exploration and navigation efficiency.
- 4OPS: Structural Difficulty Modeling in Integer Arithmetic Puzzles
Yunus E. Zeytuncu · Mar 26, 2026 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- Image Rotation Angle Estimation: Comparing Circular-Aware Methods
Maximilian Woehrer · Mar 26, 2026 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- Large Language Model as Token Compressor and Decompressor
Wenbing Li, Zikai Song, Jielei Zhang, Tianhao Zhao, Junkai Lin · Mar 26, 2026 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- Agentic Trust Coordination for Federated Learning through Adaptive Thresholding and Autonomous Decision Making in Sustainable and Resilient Industrial Networks
Paul Shepherd, Tasos Dagiuklas, Bugra Alkan, Jonathan Rodriguez · Mar 26, 2026 · Citations: 0
This paper presents a lightweight agentic trust coordination approach for FL in sustainable and resilient industrial networks.