- SCOPE: Selective Conformal Optimized Pairwise LLM Judging
Sher Badshah, Ali Emami, Hassan Sajjad · Feb 13, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
Large language models (LLMs) are increasingly used as judges to replace costly human preference labels in pairwise evaluation.
- Personalized RewardBench: Evaluating Reward Models with Human Aligned Personalization
Qiyao Ma, Dechen Gao, Rui Cai, Boqi Zhao, Hanchu Zhou · Apr 8, 2026 · Citations: 0
Pairwise PreferenceRubric Rating Human EvalAutomatic Metrics
Pluralistic alignment has emerged as a critical frontier in the development of Large Language Models (LLMs), with reward models (RMs) serving as a central mechanism for capturing diverse human values.
- Elo-Evolve: A Co-evolutionary Framework for Language Model Alignment
Jing Zhao, Ting Zhen, Junwei Bao, Hongfei Jiang, Yang Song · Feb 14, 2026 · Citations: 0
Pairwise Preference Automatic Metrics Multi Agent
Current alignment methods for Large Language Models (LLMs) rely on compressing vast amounts of human preference data into static, absolute reward functions, leading to data scarcity, noise sensitivity, and training instability.
- PEARL: Self-Evolving Assistant for Time Management with Reinforcement Learning
Bingxuan Li, Jeonghwan Kim, Cheng Qian, Xiusi Chen, Eitan Anzenberg · Jan 17, 2026 · Citations: 0
Pairwise Preference Automatic Metrics Long Horizon
To enable a systematic study of this question, we introduce CalConflictBench, a benchmark for long-horizon calendar conflict resolution.
- MemoryArena: Benchmarking Agent Memory in Interdependent Multi-Session Agentic Tasks
Zexue He, Yu Wang, Churan Zhi, Yuanzhe Hu, Tzu-Ping Chen · Feb 18, 2026 · Citations: 0
Pairwise Preference Automatic Metrics Web Browsing
Existing evaluations of agents with memory typically assess memorization and action in isolation.
- From Consensus to Split Decisions: ABC-Stratified Sentiment in Holocaust Oral Histories
Daban Q. Jaff · Mar 30, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
After assembling model outputs, we introduce an agreement-based stability taxonomy (ABC) to stratify inter-model output stability.
- Measuring Faithfulness Depends on How You Measure: Classifier Sensitivity in LLM Chain-of-Thought Evaluation
Richard J. Young · Mar 20, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
Three classifiers (a regex-only detector, a regex-plus-LLM pipeline, and a Claude Sonnet 4 judge) are applied to 10,276 influenced reasoning traces from 12 open-weight models spanning 9 families and 7B to 1T parameters.
- Yor-Sarc: A gold-standard dataset for sarcasm detection in a low-resource African language
Toheeb Aduramomi Jimoh, Tabea De Wille, Nikola S. Nikolov · Feb 21, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
This protocol incorporates context-sensitive interpretation and community-informed guidelines and is accompanied by a comprehensive analysis of inter-annotator agreement to support replication in other African languages.
- Error Notebook-Guided, Training-Free Part Retrieval in 3D CAD Assemblies via Vision-Language Models
Yunqing Liu, Nan Zhang, Zhiming Tan · Sep 1, 2025 · Citations: 0
Pairwise Preference Automatic Metrics Long Horizon
We additionally contribute a CAD dataset with human preference annotations.
- Toward Safe and Human-Aligned Game Conversational Recommendation via Multi-Agent Decomposition
Zheng Hui, Xiaokai Wei, Yexi Jiang, Kevin Gao, Chen Wang · Apr 26, 2025 · Citations: 0
Pairwise Preference Automatic Metrics Multi Agent
These domains typically involve fixed content and passive consumption, where user preferences can be matched by genre or theme.
- HyperMem: Hypergraph Memory for Long-Term Conversations
Juwei Yue, Chuanrui Hu, Jiawei Sheng, Zuyi Zhou, Wenyuan Zhang · Apr 9, 2026 · Citations: 0
Pairwise Preference Llm As JudgeAutomatic Metrics
Long-term memory is essential for conversational agents to maintain coherence, track persistent tasks, and provide personalized interactions across extended dialogues.
- Signals: Trajectory Sampling and Triage for Agentic Interactions
Shuguang Chen, Adil Hafeez, Salman Paracha · Apr 1, 2026 · Citations: 0
Pairwise Preference Automatic Metrics Long Horizon
We propose a lightweight, signal-based framework for triaging agentic interaction trajectories.
- Learning When to Act: Interval-Aware Reinforcement Learning with Predictive Temporal Structure
Davide Di Gioia · Mar 23, 2026 · Citations: 0
Pairwise Preference Automatic Metrics Long Horizon
Autonomous agents operating in continuous environments must decide not only what to do, but when to act.
- The Geometry of Dialogue: Graphing Language Models to Reveal Synergistic Teams for Multi-Agent Collaboration
Kotaro Furuya, Yuichi Kitagawa · Oct 30, 2025 · Citations: 0
Pairwise Preference Automatic Metrics Multi Agent
While a multi-agent approach based on large language models (LLMs) represents a promising strategy to surpass the capabilities of single models, its success is critically dependent on synergistic team composition.
- How Reliable is Language Model Micro-Benchmarking?
Gregory Yauney, Shahzaib Saqib Warraich, Swabha Swayamdipta · Oct 9, 2025 · Citations: 0
Pairwise Preference Automatic Metrics
We introduce a meta-evaluation measure for micro-benchmarking which investigates how well a micro-benchmark can rank two models as a function of their performance difference on the full benchmark.
- Same Words, Different Judgments: Modality Effects on Preference Alignment
Aaron Broukhim, Nadir Weibel, Eshin Jolly · Feb 26, 2026 · Citations: 0
Pairwise PreferenceRlaif Or Synthetic Feedback Automatic Metrics
Preference-based reinforcement learning (PbRL) is the dominant framework for aligning AI systems to human preferences, but its application to speech remains underexplored.
- Aligning Multimodal Sequential Recommendations via Robust Direct Preference Optimization with Sparse MoE
Hejin Huang, Jusheng Zhang, Kaitong Cai, Jian Wang, Rong Pan · Mar 31, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
Preference-based alignment objectives have been widely adopted, from RLHF-style pairwise learning in large language models to emerging applications in recommender systems.
- PrefDisco: Benchmarking Proactive Personalized Reasoning
Shuyue Stella Li, Avinandan Bose, Faeze Brahman, Simon Shaolei Du, Pang Wei Koh · Sep 30, 2025 · Citations: 0
Pairwise PreferenceRubric Rating Automatic Metrics
We introduce PrefDisco, an evaluation methodology that transforms static benchmarks into interactive personalization tasks using psychologically-grounded personas with sparse, context-dependent preferences, and define PrefAlign as a…
- ClimateCheck 2026: Scientific Fact-Checking and Disinformation Narrative Classification of Climate-related Claims
Raia Abu Ahmad, Max Upravitelev, Aida Usmanova, Veronika Solopova, Georg Rehm · Mar 27, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
In addition to standard evaluation metrics (Recall@K and Binary Preference), we adapt an automated framework to assess retrieval quality under incomplete annotations, exposing systematic biases in how conventional metrics rank systems.
- DSPA: Dynamic SAE Steering for Data-Efficient Preference Alignment
James Wedgwood, Aashiq Muhamed, Mona T. Diab, Virginia Smith · Mar 23, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
Preference alignment is usually achieved by weight-updating training on preference data, which adds substantial alignment-stage compute and provides limited mechanistic visibility.
- AILS-NTUA at SemEval-2026 Task 12: Graph-Based Retrieval and Reflective Prompting for Abductive Event Reasoning
Nikolas Karafyllis, Maria Lymperaiou, Giorgos Filandrianos, Athanasios Voulodimos, Giorgos Stamou · Mar 4, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
We present a winning three-stage system for SemEval 2026 Task~12: Abductive Event Reasoning that combines graph-based retrieval, LLM-driven abductive reasoning with prompt design optimized through reflective prompt evolution, and post-hoc…
- Modeling Distinct Human Interaction in Web Agents
Faria Huq, Zora Zhiruo Wang, Zhanqiu Guo, Venu Arvind Arangarajan, Tianyue Ou · Feb 19, 2026 · Citations: 0
Pairwise Preference Automatic Metrics Web Browsing
In this work, we introduce the task of modeling human intervention to support collaborative web task execution.
- HUMORCHAIN: Theory-Guided Multi-Stage Reasoning for Interpretable Multimodal Humor Generation
Jiajun Zhang, Shijia Luo, Ruikang Zhang, Qi Su · Nov 21, 2025 · Citations: 0
Pairwise Preference Automatic Metrics Long Horizon
Humor, as both a creative human activity and a social binding mechanism, has long posed a major challenge for AI generation.
- BEAT: Visual Backdoor Attacks on VLM-based Embodied Agents via Contrastive Trigger Learning
Qiusi Zhan, Hyeonjeong Ha, Rui Yang, Sirui Xu, Hanyang Chen · Oct 31, 2025 · Citations: 0
Pairwise Preference Automatic Metrics Long Horizon
We introduce BEAT, the first framework to inject such visual backdoors into VLM-based embodied agents using objects in the environments as triggers.
- Just as Humans Need Vaccines, So Do Models: Model Immunization to Combat Falsehoods
Shaina Raza, Rizwan Qureshi, Azib Farooq, Marcelo Lotif, Aman Chadha · May 23, 2025 · Citations: 0
Pairwise Preference Automatic Metrics
Unlike post-hoc filtering or preference-based alignment, immunization introduces direct negative supervision on labeled falsehoods.
- VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models
Yuchen Yan, Jin Jiang, Zhenbang Ren, Yijun Li, Xudong Cai · May 21, 2025 · Citations: 0
Pairwise Preference Automatic Metrics
In this paper, we introduce VerifyBench and its challenging variant VerifyBench-Hard, two benchmarks specifically designed to assess reference-based reward systems.
- CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation
Faria Huq, Zora Zhiruo Wang, Frank F. Xu, Tianyue Ou, Shuyan Zhou · Jan 28, 2025 · Citations: 0
Pairwise PreferenceDemonstrations Automatic Metrics Web Browsing
We propose CowPilot, a framework supporting autonomous as well as human-agent collaborative web navigation, and evaluation across task success and task efficiency.
- MMEmb-R1: Reasoning-Enhanced Multimodal Embedding with Pair-Aware Selection and Adaptive Control
Yuchi Wang, Haiyang Yu, Weikang Bian, Jiefeng Long, Xiao Liang · Apr 7, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
Experiments on the MMEB-V2 benchmark demonstrate that our model achieves a score of 71.2 with only 4B parameters, establishing a new state-of-the-art while significantly reducing reasoning overhead and inference latency.
- Optimizing RAG Rerankers with LLM Feedback via Reinforcement Learning
Yuhang Wu, Xiangqing Shen, Fanfan Wang, Cangqi Zhou, Zhen Wu · Apr 2, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
However, current reranking models are typically optimized on static human annotated relevance labels in isolation, decoupled from the downstream generation process.
- Preference learning in shades of gray: Interpretable and bias-aware reward modeling for human preferences
Simona-Vasilica Oprea, Adela Bâra · Apr 1, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
Using the Anthropic HHRLHF dataset, we evaluate ten diverse large language models LLMs under a standard pairwise preference setting, where baseline performance remains below 0.74 ROC AUC, highlighting the difficulty of the task.
- MemRerank: Preference Memory for Personalized Product Reranking
Zhiyuan Peng, Xuyang Wu, Huaixiao Tou, Yi Fang, Yu Gong · Mar 31, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
LLM-based shopping agents increasingly rely on long purchase histories and multi-turn interactions for personalization, yet naively appending raw history to prompts is often ineffective due to noise, length, and relevance mismatch.
- Routing Sensitivity Without Controllability: A Diagnostic Study of Fairness in MoE Language Models
Junhyeok Lee, Kyu Sung Choi · Mar 28, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
FARE reveals that routing-level preference shifts are either unachievable (Mixtral, Qwen1.5, Qwen3), statistically non-robust (DeepSeekMoE), or accompanied by substantial utility cost (OLMoE, -4.4%p CrowS-Pairs at -6.3%p TQA).
- A Comparative Empirical Study of Catastrophic Forgetting Mitigation in Sequential Task Adaptation for Continual Natural Language Processing Systems
Aram Abrahamyan, Sachin Kumar · Mar 19, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
Performance is assessed with average accuracy, macro F1, and backward transfer, capturing the stability-plasticity trade-off across the task sequence.
- CAMEL: Confidence-Gated Reflection for Reward Modeling
Zirui Zhu, Hailun Xu, Yang Luo, Yong Liu, Kanchan Sarkar · Feb 24, 2026 · Citations: 0
Pairwise PreferenceCritique Edit Automatic Metrics
Building on this insight, we propose CAMEL, a confidence-gated reflection framework that performs a lightweight single-token preference decision first and selectively invokes reflection only for low-confidence instances.
- Learning Ordinal Probabilistic Reward from Preferences
Longze Chen, Lu Wang, Renke Shan, Ze Gong, Run Luo · Feb 13, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
Reward models are crucial for aligning large language models (LLMs) with human values and intentions.
- Should LLMs, like, Generate How Users Talk? Building Dialect-Accurate Dialog[ue]s Beyond the American Default with MDial
Jio Oh, Paul Vicinanza, Thomas Butler, Steven Euijong Whang, Dezhi Hong · Jan 30, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
Independent evaluations confirm data quality, with annotators preferring MDial outputs over prior methods in 98% of pairwise comparisons for dialect naturalness.
- Does Tone Change the Answer? Evaluating Prompt Politeness Effects on Modern LLMs: GPT, Gemini, and LLaMA
Hanyu Cai, Binqi Shen, Lier Jin, Lan Hu, Xiaojing Fan · Dec 14, 2025 · Citations: 0
Pairwise Preference Automatic Metrics
In this work, we propose a systematic evaluation framework to examine how interaction tone affects model accuracy and apply it to three recently released and widely available LLMs: GPT-4o mini (OpenAI), Gemini 2.0 Flash (Google DeepMind),…
- PoLi-RL: A Point-to-List Reinforcement Learning Framework for Conditional Semantic Textual Similarity
Zixin Song, Bowen Zhang, Qian-Wen Zhang, Di Yin, Xing Sun · Oct 5, 2025 · Citations: 0
Pairwise Preference Automatic Metrics
On the official C-STS benchmark, PoLi-RL achieves a Spearman correlation coefficient of 48.18, establishing a new SOTA for the cross-encoder architecture.
- PLOT: Enhancing Preference Learning via Optimal Transport
Liang Zhu, Yuelin Bai, Xiankun Ren, Jiaxi Yang, Lei Zhang · Apr 2, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
Preference learning in Large Language Models (LLMs) has advanced significantly, yet existing methods remain limited by modest performance gains, high computational costs, hyperparameter sensitivity, and insufficient modeling of global…
- ThinknCheck: Grounded Claim Verification with Compact, Reasoning-Driven, and Interpretable Models
Delip Rao, Feijiang Han, Chris Callison-Burch · Apr 2, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
By contrast, zero-shot chain-of-thought on the base Gemma3-1B harms accuracy relative to direct answers, and preference optimization with a simple format+accuracy reward underperforms supervised reasoning.
- OneSearch-V2: The Latent Reasoning Enhanced Self-distillation Generative Search Framework
Ben Chen, Siyuan Wang, Yufei Ma, Zihan Liang, Xuxin Zhang · Mar 25, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
However, its inadequate understanding of complex queries, inefficient exploitation of latent user intents, and overfitting to narrow historical preferences have limited its further performance improvement.
- BeliefShift: Benchmarking Temporal Belief Consistency and Opinion Drift in LLM Agents
Praveen Kumar Myakala, Manan Agrawal, Rahul Manche · Mar 25, 2026 · Citations: 0
Pairwise PreferenceCritique Edit Automatic Metrics
LLMs are increasingly used as long-running conversational agents, yet every major benchmark evaluating their memory treats user information as static facts to be stored and retrieved.
- LLMs Can Infer Political Alignment from Online Conversations
Byunghwee Lee, Sangyeon Kim, Filippo Menczer, Yong-Yeol Ahn, Haewoon Kwak · Mar 11, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
Due to the correlational structure in our traits such as identities, cultures, and political attitudes, seemingly innocuous preferences such as following a band or using a specific slang, can reveal private traits.
- Can Safety Emerge from Weak Supervision? A Systematic Analysis of Small Language Models
Punyajoy Saha, Sudipta Halder, Debjyoti Mondal, Subhadarshi Panda · Mar 7, 2026 · Citations: 0
Pairwise PreferenceRed Team Automatic Metrics
Safety alignment is critical for deploying large language models (LLMs) in real-world applications, yet most existing approaches rely on large human-annotated datasets and static red-teaming benchmarks that are costly, difficult to scale,…
- When Weak LLMs Speak with Confidence, Preference Alignment Gets Stronger
Amirabbas Afzali, Myeongho Jeon, Maria Brbic · Mar 5, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
Building on this insight, we propose Confidence-Weighted Preference Optimization (CW-PO), a general framework that re-weights training samples by a weak LLM's confidence and can be applied across different preference optimization…
- LocalSUG: Geography-Aware LLM for Query Suggestion in Local-Life Services
Jinwen Chen, Shuai Gong, Shiwen Zhang, Zheng Zhang, Yachao Zhao · Mar 5, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
While LLMs offer strong semantic generalization, deploying them in local-life services introduces three key challenges: lack of geographic grounding, exposure bias in preference optimization, and online inference latency.
- MoD-DPO: Towards Mitigating Cross-modal Hallucinations in Omni LLMs using Modality Decoupled Preference Optimization
Ashutosh Chaubey, Jiacheng Pang, Mohammad Soleymani · Mar 3, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
In this work, we propose Modality-Decoupled Direct Preference Optimization (MoD-DPO), a simple and effective framework for improving modality grounding in omni LLMs.
- RLHFless: Serverless Computing for Efficient RLHF
Rui Wei, Hanfei Yu, Shubham Jain, Yogarajan Sivakumar, Devesh Tiwari · Feb 26, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
Reinforcement Learning from Human Feedback (RLHF) has been widely applied to Large Language Model (LLM) post-training to align model outputs with human preferences.
- DynamicGTR: Leveraging Graph Topology Representation Preferences to Boost VLM Capabilities on Graph QAs
Yanbin Wei, Jiangyue Yan, Chun Kang, Yang Chen, Hua Liu · Feb 25, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
This ``one-size-fits-all'' strategy often neglects model-specific and task-specific preferences, resulting in inaccurate or over-lengthy responses to graph-related queries.
- PACIFIC: Can LLMs Discern the Traits Influencing Your Preferences? Evaluating Personality-Driven Preference Alignment in LLMs
Tianyu Zhao, Siqi Li, Yasser Shoukry, Salma Elmalaki · Feb 6, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
Based on these findings, we introduce PACIFIC (Preference Alignment Choices Inference for Five-factor Identity Characterization), a personality-labeled preference dataset containing 1200 preference statements spanning diverse domains (e.g.,…
- Robust Preference Alignment via Directional Neighborhood Consensus
Ruochen Mao, Yuling Shi, Xiaodong Gu, Jiaheng Wei · Oct 23, 2025 · Citations: 0
Pairwise Preference Automatic Metrics
To address this challenge, we introduce Robust Preference Selection (RPS), a post-hoc, training-free method by leveraging directional neighborhood consensus.
- Difficulty-Controllable Multiple-Choice Question Generation Using Large Language Models and Direct Preference Optimization
Yuto Tomikawa, Masaki Uto · Oct 22, 2025 · Citations: 0
Pairwise Preference Automatic Metrics
To address these limitations, this study proposes a novel difficulty-controllable multiple-choice question generation method for reading comprehension which leverages a large language model trained using a direct preference optimization…
- Spectrum Tuning: Post-Training for Distributional Coverage and In-Context Steerability
Taylor Sorensen, Benjamin Newman, Jared Moore, Chan Park, Jillian Fisher · Oct 7, 2025 · Citations: 0
Pairwise Preference Automatic Metrics
On many tasks such as creative writing, synthetic data generation, or steering to diverse preferences, models must cover an entire distribution of outputs, rather than a single correct answer.
- Distributional Vision-Language Alignment by Cauchy-Schwarz Divergence
Wenzhe Yin, Zehao Xiao, Pan Zhou, Shujian Yu, Jiayi Shen · Feb 24, 2025 · Citations: 0
Pairwise Preference Automatic Metrics
Vision-language alignment is crucial for various downstream tasks such as cross-modal generation and retrieval.
- CCCaption: Dual-Reward Reinforcement Learning for Complete and Correct Image Captioning
Zhijiang Tang, Linhua Wang, Jiaxin Qi, Weihao Jiang, Peng Hou · Feb 25, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
Image captioning remains a fundamental task for vision language understanding, yet ground-truth supervision still relies predominantly on human-annotated references.
- Investigation for Relative Voice Impression Estimation
Kenichi Fujita, Yusuke Ijima · Feb 15, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
The estimation target is a low-dimensional vector derived from subjective evaluations, quantifying the perceptual shift of the second utterance relative to the first along an antonymic axis (e.g., ``Dark--Bright'').
- The ASIR Courage Model: A Phase-Dynamic Framework for Truth Transitions in Human and AI Systems
Hyo Jin Kim · Feb 25, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
Although initially formulated for human truth-telling under asymmetric stakes, the same phase-dynamic architecture extends to AI systems operating under policy constraints and alignment filters.
- Probing Graph Neural Network Activation Patterns Through Graph Topology
Floriano Tori, Lorenzo Bini, Marco Sorbi, Stéphane Marchand-Maillet, Vincent Ginis · Feb 24, 2026 · Citations: 0
Pairwise Preference Automatic Metrics
However, it remains unclear how the topology of a graph interacts with the learned preferences of GNNs.
- Dual-IPO: Dual-Iterative Preference Optimization for Text-to-Video Generation
Xiaomeng Yang, Mengping Yang, Jia Gong, Luozheng Qin, Zhiyu Tan · Feb 4, 2025 · Citations: 0
Pairwise Preference Automatic Metrics
However, they usually fail to produce satisfactory outputs that are aligned to users' authentic demands and preferences.