- MoEMba: A Mamba-based Mixture of Experts for High-Density EMG-based Hand Gesture Recognition
Mehran Shabanpour, Kasra Rad, Sadaf Khademi, Arash Mohammadi · Feb 9, 2025 · Citations: 0
High-Density surface Electromyography (HDsEMG) has emerged as a pivotal resource for Human-Computer Interaction (HCI), offering direct insights into muscle activities and motion intentions.
- A Systematic Survey of Semantic Role Labeling in the Era of Pretrained Language Models
Huiyao Chen, Meishan Zhang, Jing Li, Lilja Øvrelid, Jan Hajič · Feb 9, 2025 · Citations: 0
We extend the scope of SRL surveys to cover multimodal settings including visual, video, and speech modalities, and analyze structural differences in evaluation across these modalities.
- Unbiased Sliced Wasserstein Kernels for High-Quality Audio Captioning
Manh Luong, Khai Nguyen, Dinh Phung, Gholamreza Haffari, Lizhen Qu · Feb 8, 2025 · Citations: 0
Our kernel also improves the reasoning accuracy of the MMAU-test-mini benchmarks by 4%.
- Dynamic Noise Preference Optimization: Self-Improvement of Large Language Models with Self-Synthetic Data
Haoyan Yang, Khiem Le, Ting Hua, Shangqian Gao, Binfeng Xu · Feb 8, 2025 · Citations: 0
Pairwise Preference
To overcome these challenges, we introduce Dynamic Noise Preference Optimization (DNPO), which combines dynamic sample labeling for constructing preference pairs with controlled, trainable noise injection during preference optimization.
- Oracular Programming: A Modular Foundation for Building LLM-Enabled Software
Jonathan Laurent, André Platzer · Feb 7, 2025 · Citations: 0
Demonstrations Web Browsing
We propose oracular programming: a foundational paradigm for integrating traditional, explicit computations with inductive oracles such as LLMs.
- From Restless to Contextual: A Thresholding Bandit Reformulation For Finite-horizon Improvement
Jiamin Xu, Ivan Nazarov, Aditya Rastogi, África Periáñez, Kyra Gan · Feb 7, 2025 · Citations: 0
This paper addresses the poor finite-horizon performance of existing online restless bandit (RB) algorithms, which stems from the prohibitive sample complexity of learning a full Markov decision process (MDP) for each agent.
- Mitigating Unintended Memorization with LoRA in Federated Learning for LLMs
Thierry Bossy, Julien Vignoud, Tahseen Rabbani, Juan R. Troncoso Pastoriza, Martin Jaggi · Feb 7, 2025 · Citations: 0
- HOG-Diff: Higher-Order Guided Diffusion for Graph Generation
Yiming Huang, Tolga Birdal · Feb 6, 2025 · Citations: 0
Pairwise Preference
Extensive experiments across eight graph generation benchmarks, spanning diverse domains and including large-scale settings, demonstrate the scalability of our method and its superior performance on both pairwise and higher-order…
- Physics-Informed Evolution: An Evolutionary Framework for Solving Quantum Control Problems Involving the Schrödinger Equation
Kaichen Ouyang, Mingyang Yu, Zong Ke, Jun Zhang, Yi Chen · Feb 6, 2025 · Citations: 0
We validate PIE on three representative quantum control benchmarks: state preparation in V-type three-level systems, entangled state generation in superconducting quantum circuits, and two-atom cavity QED systems.
- vCache: Verified Semantic Prompt Caching
Luis Gaspar Schroeder, Aditya Desai, Alejandro Cuadron, Kyle Chu, Shu Liu · Feb 6, 2025 · Citations: 0
We release the vCache implementation and four benchmarks to support future research.
- AStar: Boosting Multimodal Reasoning with Automated Structured Thinking
Jinyang Wu, Mingkuan Feng, Guocheng Zhai, Shuai Zhang, Zheng Lian · Feb 4, 2025 · Citations: 0
- Dual-IPO: Dual-Iterative Preference Optimization for Text-to-Video Generation
Xiaomeng Yang, Mengping Yang, Jia Gong, Luozheng Qin, Zhiyu Tan · Feb 4, 2025 · Citations: 0
Pairwise Preference
However, they usually fail to produce satisfactory outputs that are aligned to users' authentic demands and preferences.
- FinBloom: Knowledge Grounding Large Language Model with Real-time Financial Data
Ankur Sinha, Chaitanya Agarwal, Pekka Malo · Feb 4, 2025 · Citations: 0
- Token Cleaning: Fine-Grained Data Selection for LLM Supervised Fine-Tuning
Jinlong Pang, Na Di, Zhaowei Zhu, Jiaheng Wei, Hao Cheng · Feb 4, 2025 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- VolleyBots: A Testbed for Multi-Drone Volleyball Game Combining Motion Control and Strategic Play
Zelai Xu, Ruize Zhang, Chao Yu, Huining Yuan, Xiangmin Yi · Feb 4, 2025 · Citations: 0
Demonstrations Multi Agent
We provide a comprehensive suite of tasks ranging from single-drone drills to multi-drone cooperative and competitive tasks, accompanied by baseline evaluations of representative reinforcement learning (RL), multi-agent reinforcement…
- Evaluation of Large Language Models via Coupled Token Generation
Nina Corvelo Benz, Stratis Tsirtsis, Eleni Straitouri, Ivi Chatzi, Ander Artola Velasco · Feb 3, 2025 · Citations: 0
Pairwise Preference
In this work, we argue that the evaluation and ranking of large language models should control for the randomization underpinning their functioning.
- Preference Leakage: A Contamination Problem in LLM-as-a-judge
Dawei Li, Renliang Sun, Yue Huang, Ming Zhong, Bohan Jiang · Feb 3, 2025 · Citations: 0
Pairwise Preference
Large Language Models (LLMs) as judges and LLM-based data synthesis have emerged as two fundamental LLM-driven data annotation methods in model development.
- Intrinsic Entropy of Context Length Scaling in LLMs
Jingzhe Shi, Qinwei Ma, Hongyi Liu, Hang Zhao, Jeng-Neng Hwang · Feb 3, 2025 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- GRADIEND: Feature Learning within Neural Networks Exemplified through Biases
Jonathan Drechsel, Steffen Herbold · Feb 3, 2025 · Citations: 0
- Polynomial, trigonometric, and tropical activations
Ismail Khalfaoui-Hassani, Stefan Kesselheim · Feb 3, 2025 · Citations: 0
- A Single Model Ensemble Framework for Neural Machine Translation using Pivot Translation
Seokjin Oh, Keonwoong Noh, Woohwan Jung · Feb 3, 2025 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- FastKV: Decoupling of Context Reduction and KV Cache Compression for Prefill-Decoding Acceleration
Dongwon Jo, Jiwon Song, Yulhwa Kim, Jae-Joon Kim · Feb 3, 2025 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- Unlocking Full Efficiency of Token Filtering in Large Language Model Training
Di Chai, Pengbo Li, Feiyuan Zhang, Yilun Jin, Han Tian · Feb 1, 2025 · Citations: 0
Evaluations on models with various scales, from 1.1B to 40B, demonstrate that Centrifuge reduces backpropagation time by up to 49.9% and end-to-end training time by up to 34.7% when filtering 50% of tokens.
- Should You Use Your Large Language Model to Explore or Exploit?
Keegan Harris, Aleksandrs Slivkins · Jan 31, 2025 · Citations: 0
Tool Use
We evaluate the ability of the current generation of large language models (LLMs) to help a decision-making agent facing an exploration-exploitation tradeoff.
- Capturing Temporal Dynamics in Large-Scale Canopy Tree Height Estimation
Jan Pauls, Max Zimmer, Berkant Turan, Sassan Saatchi, Philippe Ciais · Jan 31, 2025 · Citations: 0
- Evaluating Spoken Language as a Biomarker for Automated Screening of Cognitive Impairment
Maria R. Lima, Alexander Capstick, Fatemeh Geranmayeh, Ramin Nilforooshan, Maja Matarić · Jan 30, 2025 · Citations: 0
We evaluate explainable ML for screening of Alzheimer's disease and related dementias (ADRD) and severity prediction using benchmark DementiaBank speech (N = 291, 64% female, 69.8 (SD = 8.6) years).
- Dialogue is Better Than Monologue: Instructing Medical LLMs via Strategical Conversations
Zijie Liu, Xinyu Zhao, Jie Peng, Zhuangdi Zhu, Qingyu Chen · Jan 29, 2025 · Citations: 0
These tuning methods and benchmarks overlook critical aspects like evidence-based reasoning and handling distracting information.
- Safe Reinforcement Learning for Real-World Engine Control
Julian Bedei, Lucas Koch, Kevin Badalian, Alexander Winkler, Patrick Schaber · Jan 28, 2025 · Citations: 0
This work introduces a toolchain for applying Reinforcement Learning (RL), specifically the Deep Deterministic Policy Gradient (DDPG) algorithm, in safety-critical real-world environments.
- CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation
Faria Huq, Zora Zhiruo Wang, Frank F. Xu, Tianyue Ou, Shuyan Zhou · Jan 28, 2025 · Citations: 0
Pairwise Preference Demonstrations Web Browsing
We propose CowPilot, a framework supporting autonomous as well as human-agent collaborative web navigation, and evaluation across task success and task efficiency.
- Object-Centric World Models from Few-Shot Annotations for Sample-Efficient Reinforcement Learning
Weipu Zhang, Adam Jelley, Trevor McInroe, Amos Storkey, Gang Wang · Jan 27, 2025 · Citations: 0
Empirical results demonstrate that OC-STORM significantly outperforms the STORM baseline on the Atari 100k benchmark and achieves state-of-the-art sample efficiency on challenging boss fights in the visually complex game Hollow Knight.
- Representing data in words: A context engineering approach
Amandine M. Caut, Amy Rouillard, Beimnet Zenebe, Matthias Green, Ágúst Pálmason Morthens · Jan 27, 2025 · Citations: 0
Due to the absence of standardized benchmarks for this specific task, we conduct LLM-as-a-judge and human-as-a-judge evaluations to assess accuracy across the three applications.