- TSPC: A Two-Stage Phoneme-Centric Architecture for code-switching Vietnamese-English Speech Recognition
Tran Nguyen Anh, Truong Dinh Dung, Vo Van Nam, Minh N. H. Nguyen · Sep 7, 2025 · Citations: 0
- New Insights into Optimal Alignment of Acoustic and Linguistic Representations for Knowledge Transfer in ASR
Xugang Lu, Peng Shen, Hisashi Kawai · Sep 6, 2025 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- BinaryShield: Cross-Service Threat Intelligence in LLM Services using Privacy-Preserving Fingerprints
Waris Gill, Natalie Isak, Matthew Dressman · Sep 6, 2025 · Citations: 0
- No Text Needed: Forecasting MT Quality and Inequity from Fertility and Metadata
Jessica M. Lundin, Ada Zhang, David Adelani, Cody Carroll · Sep 5, 2025 · Citations: 0
Using only a handful of features, token fertility ratios, token counts, and basic linguistic metadata (language family, script, and region), we can forecast ChrF scores for GPT-4o translations across 203 languages in the FLORES-200…
- Post-training Large Language Models for Diverse High-Quality Responses
Yilei Chen, Souradip Chakraborty, Lorenz Wolf, Yannis Paschalidis, Aldo Pacchiano · Sep 5, 2025 · Citations: 0
- Self-adaptive Dataset Construction for Real-World Multimodal Safety Scenarios
Jingen Qu, Lijun Li, Bo Zhang, Yichen Yan, Jing Shao · Sep 4, 2025 · Citations: 0
Multimodal large language models (MLLMs) are rapidly evolving, presenting increasingly complex safety challenges.
- MultiWikiQA: A Reading Comprehension Benchmark in 300+ Languages
Dan Saattrup Smart · Sep 4, 2025 · Citations: 0
- Mitigating Multimodal Hallucinations via Gradient-based Self-Reflection
Shan Wang, Maying Shen, Nadine Chang, Chuong Nguyen, Hongdong Li · Sep 3, 2025 · Citations: 0
Experiments across multiple benchmarks demonstrate that GACD effectively reduces hallucinations and improves the visual grounding of MLLM outputs.
- Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR
Jiaming Li, Longze Chen, Ze Gong, Yukun Chen, Lu Wang · Sep 2, 2025 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- Do LLMs Adhere to Label Definitions? Examining Their Receptivity to External Label Definitions
Seyedali Mohammadi, Bhaskara Hanuma Vedula, Hemank Lamba, Edward Raff, Ponnurangam Kumaraguru · Sep 2, 2025 · Citations: 0
To address these questions, we conduct controlled experiments across multiple explanation benchmark datasets (general and domain-specific) and label definition conditions, including expert-curated, LLM-generated, perturbed, and swapped…
- BioBlue: Systematic runaway-optimiser-like LLM failure modes on biologically and economically aligned AI safety benchmarks for LLMs with simplified observation format
Roland Pihlakas, Sruthi Susan Kuriakose · Sep 2, 2025 · Citations: 0
Long Horizon
Many AI alignment discussions of "runaway optimisation" focus on RL agents: unbounded utility maximisers that over-optimise a proxy objective (e.g., "paperclip maximiser", specification gaming) at the expense of everything else.
- Error Notebook-Guided, Training-Free Part Retrieval in 3D CAD Assemblies via Vision-Language Models
Yunqing Liu, Nan Zhang, Zhiming Tan · Sep 1, 2025 · Citations: 0
Pairwise Preference Long Horizon
We additionally contribute a CAD dataset with human preference annotations.