
Researcher Tools

Human Feedback and Eval Paper Explorer

A focused feed for RLHF, preference data, rater protocols, agent evaluation, and LLM-as-judge research. Every paper includes structured metadata for quick triage.

Total papers: 1 · Search mode: vector_fallback_keyword


No exact ID match for "2603.04974". Showing results for "VRM: Teaching Reward Models to Understand Authentic Human Preferences" instead.
VRM: Teaching Reward Models to Understand Authentic Human Preferences

Biao Liu, Ning Xu, Junming Yang, Hao Xu, Xin Geng · Mar 5, 2026

Citations: 0

Match reason: Matched by broad semantic/index fallback.

Score: 45% (moderate protocol signal) · Freshness: Hot · Status: Ready
Tags: Pairwise Preference · Human Eval · General
  • Large Language Models (LLMs) have achieved remarkable success across diverse natural language tasks, yet the reward models employed for aligning LLMs often encounter challenges of reward hacking, where the approaches predominantly rely on…
  • Motivated by this consideration, we propose VRM, i.e., Variational Reward Modeling, a novel framework that explicitly models the evaluation process of human preference judgments by incorporating both high-dimensional objective weights and…
Open paper
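
The abstract snippets above are truncated, so the paper's actual variational objective is not recoverable from this page. As context only, the conventional baseline that pairwise-preference reward models (the kind VRM is positioned against) are trained with is the Bradley-Terry loss; the sketch below is a generic illustration, and `bradley_terry_loss` is an illustrative name, not taken from the paper:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bradley_terry_loss(r_chosen, r_rejected):
    """Negative log-likelihood that the chosen response is preferred.

    Under the Bradley-Terry model, P(chosen > rejected) is
    sigmoid(r_chosen - r_rejected), so this loss is minimized when the
    reward model scores the chosen response well above the rejected one.
    """
    return -math.log(sigmoid(r_chosen - r_rejected))

# Equal scores give the maximum-uncertainty loss, log 2 ≈ 0.693;
# a wider margin in favor of the chosen response drives the loss toward 0.
```

Reward hacking, as mentioned in the first snippet, typically arises when a model trained on this scalar objective learns spurious features that inflate the margin without tracking authentic human preference.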
