
Benchmark Hub

Retrieval Benchmark Papers

Updated from the current HFEPX corpus (Feb 26, 2026). This page groups 94 papers on this benchmark. The most common evaluation modes are Automatic Metrics and Simulation Env, the most frequently cited benchmark is retrieval, and the most common metric signal is accuracy. The newest paper in this set is from Feb 25, 2026.

Papers: 94 · Last published: Feb 25, 2026

Research Utility Snapshot

Human Feedback Mix

  • Pairwise Preference (7)
  • Demonstrations (2)
  • Expert Verification (2)
  • Critique Edit (1)

Evaluation Modes

  • Automatic Metrics (86)
  • Simulation Env (8)
  • Human Eval (2)

Top Papers On This Benchmark
