Skip to content

Researcher Tools

Human Feedback and Eval Paper Explorer

A focused feed for RLHF, preference data, rater protocols, agent evaluation, and LLM-as-judge research. Every paper includes structured metadata for quick triage.

Total papers: 664 Search mode: keyword Shortlist (0) RSS

Featured Papers

Popular high-signal papers with direct links to full protocol pages.

Browse by Topic

Jump directly into tag and hub pages to crawl deeper content clusters.

Popular Tags

Top Protocol Hubs

Weekly Eval Paper Digest

The top RLHF, evaluation, and human feedback papers — curated and summarized every Friday.

No spam. Unsubscribe anytime.

Start Here By Objective

Pick your immediate research objective and jump directly to high-signal pages, not generic search.

Scale Your Evaluation Team

Need human evaluators for your benchmark or preference study? OpenTrain sources pre-vetted domain experts into your annotation pipeline.

Citations: 0

Match reason: Keyword overlap 2/2 across title and protocol fields.

Score: 73% Sparse protocol signal Freshness: Warm Status: Ready
General
  • This paper examines whether their framework accurately characterizes human intelligence and its implications for conceptualizing artificial general intelligence.
  • Drawing on empirical evidence from cognitive science, I demonstrate that human expertise operates primarily through domain-specific pattern accumulation rather than elegant compression.
Open paper
Sabiá-4 Technical Report

Thiago Laitz, Thales Sales Almeida, Hugo Abonizio, Roseval Malaquias Junior, Giovana Kerche Bonás, Marcos Piau · Mar 10, 2026

Citations: 0

Match reason: Keyword overlap 1/2 across title and protocol fields.

Score: 61% High protocol signal Freshness: Warm Status: Ready
Pairwise Preference Automatic Metrics Tool Use LawCoding
  • The models were developed through a four-stage training pipeline: continued pre-training on Portuguese and Brazilian legal corpora, long-context extension to 128K tokens, supervised fine-tuning on instruction data spanning chat, code, legal…
  • We evaluate the models on six benchmark categories: conversational capabilities in Brazilian Portuguese, knowledge of Brazilian legislation, long-context understanding, instruction following, standardized exams, and agentic capabilities…
Open paper
Adaptive Activation Cancellation for Hallucination Mitigation in Large Language Models

Eric Yocam, Varghese Vaidyan, Gurcan Comert, Paris Kalathas, Yong Wang, Judith L. Mwakalonge · Mar 10, 2026

Citations: 0

Match reason: Keyword overlap 1/2 across title and protocol fields.

Score: 61% Moderate protocol signal Freshness: Warm Status: Ready
Automatic Metrics General
  • Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
Open paper
Reading, Not Thinking: Understanding and Bridging the Modality Gap When Text Becomes Pixels in Multimodal LLMs

Kaiser Sun, Xiaochuang Yuan, Hongjun Liu, Chen Zhao, Cheng Zhang, Mark Dredze · Mar 10, 2026

Citations: 0

Match reason: Keyword overlap 1/2 across title and protocol fields.

Score: 61% Moderate protocol signal Freshness: Warm Status: Ready
Automatic Metrics Math
  • We systematically diagnose this "modality gap" by evaluating seven MLLMs across seven benchmarks in five input modes, spanning both synthetically rendered text and realistic document images from arXiv PDFs to Wikipedia pages.
  • Motivated by these findings, we propose a self-distillation method that trains the model on its own pure text reasoning traces paired with image inputs, raising image-mode accuracy on GSM8K from 30.71% to 92.72% and transferring to unseen…
Open paper
SlowBA: An efficiency backdoor attack towards VLM-based GUI agents

Junxian Li, Tu Lan, Haozhen Tan, Yan Meng, Haojin Zhu · Mar 9, 2026

Citations: 0

Match reason: Keyword overlap 1/2 across title and protocol fields.

Score: 57% Moderate protocol signal Freshness: Warm Status: Ready
Automatic Metrics Coding
  • Modern vision-language-model (VLM) based graphical user interface (GUI) agents are expected not only to execute actions accurately but also to respond to user instructions with low latency.
  • In this paper, we introduce SlowBA, a novel backdoor attack that targets the responsiveness of VLM-based GUI agents.
Open paper
Evaluating LLM-Based Grant Proposal Review via Structured Perturbations

William Thorne, Joseph James, Yang Wang, Chenghua Lin, Diana Maynard · Mar 9, 2026

Citations: 0

Match reason: Keyword overlap 1/2 across title and protocol fields.

Score: 54% Sparse protocol signal Freshness: Warm Status: Ready
Human Eval LawCoding
  • As AI-assisted grant proposals outpace manual review capacity in a kind of ``Malthusian trap'' for the research ecosystem, this paper investigates the capabilities and limitations of LLM-based grant reviewing for high-stakes evaluation.
  • Human evaluation shows LLM feedback is largely valid but skewed toward compliance checking over holistic assessment.
Open paper
Emotion is Not Just a Label: Latent Emotional Factors in LLM Processing

Benjamin Reichman, Adar Avsian, Samuel Webster, Larry Heck · Mar 10, 2026

Citations: 0

Match reason: Keyword overlap 1/2 across title and protocol fields.

Score: 51% Sparse protocol signal Freshness: Warm Status: Ready
General
  • To facilitate controlled study of these effects, we introduce Affect-Uniform ReAding QA (AURA-QA), a question-answering dataset with emotionally balanced, human-authored context passages.
  • Experiments across multiple QA benchmarks demonstrate that this approach improves reading comprehension in both emotionally-varying and non-emotionally varying datasets, yielding consistent gains under distribution shift and in-domain…
Open paper
Fish Audio S2 Technical Report

Shijia Liao, Yuxuan Wang, Songting Liu, Yifan Cheng, Ruoyi Zhang, Tianyu Li · Mar 9, 2026

Citations: 0

Match reason: Keyword overlap 1/2 across title and protocol fields.

Score: 51% Sparse protocol signal Freshness: Warm Status: Ready
Coding
  • Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
Open paper
Gender Bias in MT for a Genderless Language: New Benchmarks for Basque

Amaia Murillo, Olatz-Perez-de-Viñaspre, Naiara Perez · Mar 9, 2026

Citations: 0

Match reason: Keyword overlap 1/2 across title and protocol fields.

Score: 54% Sparse protocol signal Freshness: Warm Status: Fallback
Pairwise Preference Multilingual
  • WinoMTeus adapts the WinoMT benchmark to examine how gender-neutral Basque occupations are translated into gendered languages such as Spanish and French.
  • FLORES+Gender, in turn, extends the FLORES+ benchmark to assess whether translation quality varies when translating from gendered languages (Spanish and English) into Basque depending on the gender of the referent.
Open paper
IH-Challenge: A Training Dataset to Improve Instruction Hierarchy on Frontier LLMs

Chuan Guo, Juan Felipe Ceron Uribe, Sicheng Zhu, Christopher A. Choquette-Choo, Steph Lin, Nikhil Kandpal · Mar 11, 2026

Citations: 0

Match reason: Matched by broad semantic/index fallback.

Score: 38% Moderate protocol signal Freshness: Warm Status: Ready
Red Team Automatic Metrics General
  • IH is key to defending against jailbreaks, system prompt extractions, and agentic prompt injections.
  • Fine-tuning GPT-5-Mini on IH-Challenge with online adversarial example generation improves IH robustness by +10.0% on average across 16 in-distribution, out-of-distribution, and human red-teaming benchmarks (84.1% to 94.1%), reduces unsafe…
Open paper
Citations: 0

Match reason: Matched by broad semantic/index fallback.

Score: 35% Moderate protocol signal Freshness: Warm Status: Ready
Multi Agent General
  • Moltbook is the first large-scale social network built for autonomous AI agent-to-agent interaction.
  • Early studies on Moltbook have interpreted its agent discourse as evidence of peer learning and emergent social behaviour, but there is a lack of systematic understanding of the thematic, affective, and interactional properties of Moltbook…
Open paper
Video-Based Reward Modeling for Computer-Use Agents

Linxin Song, Jieyu Zhang, Huanxin Sheng, Taiwei Shi, Gupta Rahul, Yang Liu · Mar 10, 2026

Citations: 0

Match reason: Matched by broad semantic/index fallback.

Score: 38% Moderate protocol signal Freshness: Warm Status: Fallback
Automatic Metrics Long Horizon Multilingual
  • Computer-using agents (CUAs) are becoming increasingly capable; however, it remains difficult to scale evaluation of whether a trajectory truly fulfills a user instruction.
  • In this work, we study reward modeling from execution video: a sequence of keyframes from an agent trajectory that is independent of the agent's internal reasoning or actions.
Open paper
TA-Mem: Tool-Augmented Autonomous Memory Retrieval for LLM in Long-Term Conversational QA

Mengwei Yuan, Jianan Liu, Jing Yang, Xianyou Li, Weiran Yan, Yichao Wu · Mar 10, 2026

Citations: 0

Match reason: Matched by broad semantic/index fallback.

Score: 32% Sparse protocol signal Freshness: Warm Status: Ready
Tool Use General
  • To address this inflexibility, we introduced a novel tool-augmented autonomous memory retrieval framework (TA-Mem), which contains: (1) a memory extraction LLM agent which is prompted to adaptively chuck an input into sub-context based on…
Open paper
The Reasoning Trap -- Logical Reasoning as a Mechanistic Pathway to Situational Awareness

Subramanyam Sahoo, Aman Chadha, Vinija Jain, Divya Chaudhary · Mar 10, 2026

Citations: 0

Match reason: Matched by broad semantic/index fallback.

Score: 32% Sparse protocol signal Freshness: Warm Status: Ready
Long Horizon General
  • We further analyze why current safety measures are insufficient to prevent this escalation.
  • We conclude by proposing concrete safeguards, including a "Mirror Test" benchmark and a Reasoning Safety Parity Principle, and pose an uncomfortable but necessary question to the logical reasoning community about its responsibility in this…
Open paper

Match reason: Matched by broad semantic/index fallback.

Score: 32% Sparse protocol signal Freshness: Warm Status: Ready
MathCoding
  • We introduce a taxonomy of three defense categories -- output perturbation, data poisoning, and information throttling -- and evaluate nine defense configurations using a standardized pipeline with Qwen3-14B as teacher and…
Open paper
Citations: 0

Match reason: Matched by broad semantic/index fallback.

Score: 28% Sparse protocol signal Freshness: Warm Status: Ready
Law
  • Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
Open paper
Deconstructing Multimodal Mathematical Reasoning: Towards a Unified Perception-Alignment-Reasoning Paradigm

Tianyu Yang, Sihong Wu, Yilun Zhao, Zhenwen Liang, Lisen Dai, Chen Zhao · Mar 9, 2026

Citations: 0

Match reason: Matched by broad semantic/index fallback.

Score: 28% Sparse protocol signal Freshness: Warm Status: Ready
Math
  • Moreover, existing evaluations mainly focus on checking final answers rather than verifying the correctness or executability of each intermediate step.
Open paper

Match reason: Matched by broad semantic/index fallback.

Score: 28% Sparse protocol signal Freshness: Warm Status: Ready
General
  • Using Croissant-compliant dataset of 2400+ stereotypical and anti-stereotypical sentence pairs on gender roles across social domains, we implement an evaluation framework, Dual-Metric Bias Assessment (DMBA), combining two metrics: (1)…
Open paper

Protocol Hubs

Get Started

Join the #1 Platform for AI Training Talent

Where top AI builders and expert AI Trainers connect to build the future of AI.
Self-Service
Post a Job
Post your project and get a shortlist of qualified AI Trainers and Data Labelers. Hire and manage your team in the tools you already use.
Managed Service
For Large Projects
Done-for-You
We recruit, onboard, and manage a dedicated team inside your tools. End-to-end operations for large or complex projects.
For Freelancers
Join as an AI Trainer
Find AI training and data labeling projects across platforms, all in one place. One profile, one application process, more opportunities.