- Multi-Method Validation of Large Language Model Medical Translation Across High- and Low-Resource Languages
Chukwuebuka Anyaegbuna, Eduardo Juan Perez Guerrero, Jerry Liu, Timothy Keyes, April Liang · Mar 23, 2026 · Citations: 0
- LGSE: Lexically Grounded Subword Embedding Initialization for Low-Resource Language Adaptation
Hailay Teklehaymanot, Dren Fazlija, Wolfgang Nejdl · Mar 23, 2026 · Citations: 0
- Understanding LLM Performance Degradation in Multi-Instance Processing: The Roles of Instance Count and Context Length
Jingxuan Chen, Mohammad Taher Pilehvar, Jose Camacho-Collados · Mar 23, 2026 · Citations: 0
- Language Models Can Explain Visual Features via Steering
Javier Ferrando, Enrique Lopez-Cuena, Pablo Agustin Martin-Torres, Daniel Hinjos, Anna Arias-Duart · Mar 23, 2026 · Citations: 0
- Lie to Me: How Faithful Is Chain-of-Thought Reasoning in Reasoning Models?
Richard J. Young · Mar 23, 2026 · Citations: 0
- CAPITU: A Benchmark for Evaluating Instruction-Following in Brazilian Portuguese with Literary Context
Giovana Kerche Bonás, Roseval Malaquias Junior, Marcos Piau, Thiago Laitz, Thales Sales Almeida · Mar 23, 2026 · Citations: 0
- Reddit After Roe: A Computational Analysis of Abortion Narratives and Barriers in the Wake of Dobbs
Aria Pessianzadeh, Alex H. Poole, Rezvaneh Rezapour · Mar 23, 2026 · Citations: 0
Long Horizon
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- Ego2Web: A Web Agent Benchmark Grounded in Egocentric Videos
Shoubin Yu, Lei Shu, Antoine Yang, Yao Fu, Srinivas Sunkara · Mar 23, 2026 · Citations: 0
To address this gap, we introduce Ego2Web, the first benchmark designed to bridge egocentric video perception and web agent execution.
- Generating and Evaluating Sustainable Procurement Criteria for the Swiss Public Sector using In-Context Prompting with Large Language Models
Yingqiang Gao, Veton Matoshi, Luca Rolshoven, Tilia Ellendorff, Judith Binder · Mar 23, 2026 · Citations: 0
Expert Verification
Swiss law requires the integration of ecological, social, and economic sustainability requirements into tender evaluations in the format of criteria that have to be fulfilled by a bidder.
- Rashid: A Cipher-Based Framework for Exploring In-Context Language Learning
Niyati Bafna, Ryan Soh-Eun Shim, Barbara Plank, David Yarowsky, Hale Sirin · Mar 23, 2026 · Citations: 0
We use our framework to assess current methods in the field with SOTA evaluation tools and manual analysis, explore the utility of potentially expensive resources in improving ICLL, and test ICLL strategies on rich downstream tasks beyond…
- Tiny Inference-Time Scaling with Latent Verifiers
Davide Bucciarelli, Evelyn Turri, Lorenzo Baraldi, Marcella Cornia, Lorenzo Baraldi · Mar 23, 2026 · Citations: 0
- Functional Component Ablation Reveals Specialization Patterns in Hybrid Language Model Architectures
Hector Borobia, Elies Seguí-Mas, Guillermina Tormo-Carbó · Mar 23, 2026 · Citations: 0
Through group ablations, layer-wise sweeps, positional ablations, matched random controls, and perplexity analysis across five benchmarks, we establish four findings: (1) both component types are essential and neither is bypassed; (2) the…
- LLM-guided headline rewriting for clickability enhancement without clickbait
Yehudit Aperstein, Linoy Halifa, Sagiv Bar, Alexander Apartsin · Mar 23, 2026 · Citations: 0
- Towards Automated Community Notes Generation with Large Vision Language Models for Combating Contextual Deception
Jin Ma, Jingwen Yan, Mohammed Aldeen, Ethan Anderson, Taran Kavuru · Mar 23, 2026 · Citations: 0
Multi Agent
However, its reliance on human contributors limits both the timeliness and scalability.
- Sparse but Critical: A Token-Level Analysis of Distributional Shifts in RLVR Fine-Tuning of LLMs
Haoming Meng, Kexin Huang, Shaohang Wei, Chiyu Ma, Shuo Yang · Mar 23, 2026 · Citations: 0
- WorldCache: Content-Aware Caching for Accelerated Video World Models
Umair Nawaz, Ahmed Heakl, Ufaq Khan, Abdelrahman Shaker, Salman Khan · Mar 23, 2026 · Citations: 0
- ThinkJEPA: Empowering Latent World Models with Large Vision-Language Reasoning Model
Haichao Zhang, Yijiang Li, Shwai He, Tushar Nagarajan, Mingfei Chen · Mar 23, 2026 · Citations: 0
Long Horizon
- TiCo: Time-Controllable Training for Spoken Dialogue Models
Kai-Wei Chang, Wei-Chih Chen, En-Pei Hu, Hung-yi Lee, James Glass · Mar 23, 2026 · Citations: 0
This capability is valuable for real-world spoken language systems such as voice assistants and interactive agents, where controlling response duration can improve interaction quality.
- Greater accessibility can amplify discrimination in generative AI
Carolin Holtermann, Minh Duc Bui, Kaitlyn Zhou, Valentin Hofmann, Katharina von der Wense · Mar 23, 2026 · Citations: 0
- From Static Templates to Dynamic Runtime Graphs: A Survey of Workflow Optimization for LLM Agents
Ling Yue, Kushal Raj Bhandari, Ching-Yun Ko, Dhaval Patel, Shuxin Lin · Mar 23, 2026 · Citations: 0
- MemDLM: Memory-Enhanced DLM Training
Zehua Pei, Hui-Ling Zhen, Weizhe Lin, Sinno Jialin Pan, Yunhe Wang · Mar 23, 2026 · Citations: 0
- Dyadic: A Scalable Platform for Human-Human and Human-AI Conversation Research
David M. Markowitz · Mar 23, 2026 · Citations: 0
- Adapting Self-Supervised Speech Representations for Cross-lingual Dysarthria Detection in Parkinson's Disease
Abner Hernandez, Eunjung Yeo, Kwanghee Choi, Chin-Jou Li, Zhengjun Yue · Mar 23, 2026 · Citations: 0
- Gumbel Distillation for Parallel Text Generation
Chi Zhang, Xixi Hu, Bo Liu, Qiang Liu · Mar 23, 2026 · Citations: 0
- SPA: A Simple but Tough-to-Beat Baseline for Knowledge Injection
Kexian Tang, Jiani Wang, Shaowen Wang, Kaifeng Lyu · Mar 23, 2026 · Citations: 0
- Enhancing Document-Level Machine Translation via Filtered Synthetic Corpora and Two-Stage LLM Adaptation
Ireh Kim, Tesia Sker, Chanwoo Kim · Mar 23, 2026 · Citations: 0
- Learning When to Act: Interval-Aware Reinforcement Learning with Predictive Temporal Structure
Davide Di Gioia · Mar 23, 2026 · Citations: 0
Pairwise Preference · Long Horizon
Autonomous agents operating in continuous environments must decide not only what to do, but when to act.
- Beyond Matching to Tiles: Bridging Unaligned Aerial and Satellite Views for Vision-Only UAV Navigation
Kejia Liu, Haoyang Zhou, Ruoyu Xu, Peicheng Wang, Mingli Song · Mar 23, 2026 · Citations: 0
- The Semantic Ladder: A Framework for Progressive Formalization of Natural Language Content for Knowledge Graphs and AI Systems
Lars Vogt · Mar 23, 2026 · Citations: 0
- Multiperspectivity as a Resource for Narrative Similarity Prediction
Max Upravitelev, Veronika Solopova, Jing Yang, Charlott Jakob, Premtim Sahitaj · Mar 23, 2026 · Citations: 0
- Autoregressive vs. Masked Diffusion Language Models: A Controlled Comparison
Caio Vicentino · Mar 23, 2026 · Citations: 0
- Dual-Space Knowledge Distillation with Key-Query Matching for Large Language Models with Vocabulary Mismatch
Stella Eva Tsiapali, Cong-Thanh Do, Kate Knill · Mar 23, 2026 · Citations: 0
- Uncertainty-guided Compositional Alignment with Part-to-Whole Semantic Representativeness in Hyperbolic Vision-Language Models
Hayeon Kim, Ji Ha Jang, Junghun James Kim, Se Young Chun · Mar 23, 2026 · Citations: 0
- ROM: Real-time Overthinking Mitigation via Streaming Detection and Intervention
Xinyan Wang, Xiaogeng Liu, Chaowei Xiao · Mar 23, 2026 · Citations: 0
Across seven benchmarks, ROM achieves the highest accuracy (93.51%), the shortest responses (1,159 tokens), and the best response efficiency.
- Retrieving Climate Change Disinformation by Narrative
Max Upravitelev, Veronika Solopova, Charlott Jakob, Premtim Sahitaj, Sebastian Möller · Mar 23, 2026 · Citations: 0
We repurpose three climate disinformation datasets (CARDS, Climate Obstruction, climate change subset of PolyNarrative) for retrieval evaluation and propose SpecFi, a framework that generates hypothetical documents to bridge the gap between…
- On the Challenges and Opportunities of Learned Sparse Retrieval for Code
Simon Lupart, Maxime Louis, Thibault Formal, Hervé Déjean, Stéphane Clinchant · Mar 23, 2026 · Citations: 0
- SegMaFormer: A Hybrid State-Space and Transformer Model for Efficient Segmentation
Duy D. Nguyen, Phat T. Tran-Truong · Mar 23, 2026 · Citations: 0
Despite its compact structure, SegMaFormer achieves competitive performance on three public benchmarks (Synapse, BraTS, and ACDC), matching the Dice coefficient of significantly larger models.
- λ-GELU: Learning Gating Hardness for Controlled ReLU-ization in Deep Networks
Cristian Pérez-Corral, Alberto Fernández-Hernández, Jose I. Mestre, Manuel F. Dolz, Enrique S. Quintana-Ortí · Mar 23, 2026 · Citations: 0
- TREX: Trajectory Explanations for Multi-Objective Reinforcement Learning
Dilina Rajapakse, Juan C. Rosero, Ivana Dusparic · Mar 23, 2026 · Citations: 0
Pairwise Preference · Long Horizon
Multi-Objective Reinforcement Learning (MORL) addresses this limitation by enabling agents to optimize several objectives simultaneously, explicitly reasoning about trade-offs between them.
- LRC-WeatherNet: LiDAR, RADAR, and Camera Fusion Network for Real-time Weather-type Classification in Autonomous Driving
Nour Alhuda Albashir, Lars Pernickel, Danial Hamoud, Idriss Gouigah, Eren Erdal Aksoy · Mar 23, 2026 · Citations: 0
Web Browsing
Autonomous vehicles face major perception and navigation challenges in adverse weather such as rain, fog, and snow, which degrade the performance of LiDAR, RADAR, and RGB camera sensors.
- SecureBreak -- A dataset towards safe and secure models
Marco Arazzi, Vignesh Kumar Kembu, Antonino Nocera · Mar 23, 2026 · Citations: 0
Red Team
To provide a contribution in this scenario, this paper introduces SecureBreak, a safety-oriented dataset designed to support the development of AI-driven solutions for detecting harmful LLM outputs caused by residual weaknesses in security…
- Demystifying Reinforcement Learning for Long-Horizon Tool-Using Agents: A Comprehensive Recipe
Xixi Wu, Qianguo Sun, Ruiyang Zhang, Chao Song, Junlong Wu · Mar 23, 2026 · Citations: 0
Long Horizon
Reinforcement Learning (RL) is essential for evolving Large Language Models (LLMs) into autonomous agents capable of long-horizon planning, yet a practical recipe for scaling RL in complex, multi-turn environments remains elusive.
- Parameter-Efficient Fine-Tuning for Medical Text Summarization: A Comparative Study of LoRA, Prompt Tuning, and Full Fine-Tuning
Ulugbek Shernazarov, Rostislav Svitsov, Bin Shi · Mar 23, 2026 · Citations: 0
- BHDD: A Burmese Handwritten Digit Dataset
Swan Htet Aung, Hein Htet, Htoo Say Wah Khaing, Thuya Myo Nyunt · Mar 23, 2026 · Citations: 0
- Suiren-1.0 Technical Report: A Family of Molecular Foundation Models
Junyi An, Xinyu Lu, Yun-Fei Shi, Li-Cheng Xu, Nannan Zhang · Mar 23, 2026 · Citations: 0
Our extensive evaluations demonstrate that Suiren-1.0 establishes state-of-the-art results across a range of tasks.
- SLURP-TN: Resource for Tunisian Dialect Spoken Language Understanding
Haroun Elleuch, Salima Mdhaffar, Yannick Estève, Fethi Bougares · Mar 23, 2026 · Citations: 0
- Chronological Contrastive Learning: Few-Shot Progression Assessment in Irreversible Diseases
Clemens Watzenböck, Daniel Aletaha, Michaël Deman, Thomas Deimel, Jana Eder · Mar 23, 2026 · Citations: 0
- Camera-Agnostic Pruning of 3D Gaussian Splats via Descriptor-Based Beta Evidence
Peter Fasogbon, Ugurcan Budak, Patrice Rondao Alface, Hamed Rezazadegan Tavakoli · Mar 23, 2026 · Citations: 0
- Guideline-grounded retrieval-augmented generation for ophthalmic clinical decision support
Shuying Chen, Sen Cui, Zhong Cao · Mar 23, 2026 · Citations: 0
- Deep Reinforcement Learning and The Tale of Two Temporal Difference Errors
Juan Sebastian Rojas, Chi-Guhn Lee · Mar 23, 2026 · Citations: 0
- SHAPE: Structure-aware Hierarchical Unsupervised Domain Adaptation with Plausibility Evaluation for Medical Image Segmentation
Linkuan Zhou, Yinghao Xia, Yufei Shen, Xiangyu Li, Wenjie Du · Mar 23, 2026 · Citations: 0
To address these issues, we propose SHAPE (Structure-aware Hierarchical Unsupervised Domain Adaptation with Plausibility Evaluation), a framework that reframes adaptation towards global anatomical plausibility.
- Ara-Best-RQ: Multi Dialectal Arabic SSL
Haroun Elleuch, Ryan Whetten, Salima Mdhaffar, Yannick Estève, Fethi Bougares · Mar 23, 2026 · Citations: 0
- Not All Layers Are Created Equal: Adaptive LoRA Ranks for Personalized Image Generation
Donald Shenaj, Federico Errica, Antonio Carta · Mar 23, 2026 · Citations: 0
- SmaAT-QMix-UNet: A Parameter-Efficient Vector-Quantized UNet for Precipitation Nowcasting
Nikolas Stavrou, Siamak Mehrkanoon · Mar 23, 2026 · Citations: 0
Three configurations are benchmarked: using only VQ, only MixConv, and the full SmaAT-QMix-UNet.
- P^2O: Joint Policy and Prompt Optimization
Xinyu Lu, Kaiqi Zhang, Jinglin Yang, Boxi Cao, Yaojie Lu · Mar 23, 2026 · Citations: 0
Extensive experiments demonstrate that P^2O not only achieves superior performance on in-distribution datasets but also exhibits strong generalization, yielding substantial improvements on out-of-distribution benchmarks (+4.7% avg.).
- Disentangling Speaker Traits for Deepfake Source Verification via Chebyshev Polynomial and Riemannian Metric Learning
Xi Xuan, Wenxin Zhang, Zhiyu Li, Jennifer Williams, Ville Hautamäki · Mar 23, 2026 · Citations: 0
Experimental results on the MLAAD benchmark, evaluated under four newly proposed protocols designed for source-speaker disentanglement scenarios, demonstrate the effectiveness of the SDML framework.
- Manifold-Aware Exploration for Reinforcement Learning in Video Generation
Mingzhe Zheng, Weijie Kong, Yue Wu, Dengyang Jiang, Yue Ma · Mar 23, 2026 · Citations: 0
Long Horizon
- Adversarial Camouflage
Paweł Borsukiewicz, Daniele Lunghi, Melissa Tessa, Jacques Klein, Tegawendé F. Bissyandé · Mar 23, 2026 · Citations: 0
Optimized patterns, once found, are projected onto semantically valid facial regions for evaluation.
- Tacit Knowledge Management with Generative AI: Proposal of the GenAI SECI Model
Naoshi Uchihira · Mar 23, 2026 · Citations: 0
- Adaptive Video Distillation: Mitigating Oversaturation and Temporal Collapse in Few-Step Generation
Yuyang You, Yongzhi Li, Jiahui Li, Yadong Mu, Quan Chen · Mar 23, 2026 · Citations: 0
Extensive experiments and ablation studies on the VBench and VBench2 benchmarks demonstrate that our method achieves stable few-step video synthesis, significantly enhancing perceptual fidelity and motion realism.