OpenTrain Research Tools
Human Feedback and Eval Paper Explorer
A focused feed for RLHF, preference data, rater protocols, agent evaluation, and LLM-as-judge research. Every paper includes structured metadata for quick triage.
Protocol Hubs
- Expert Verification Papers (25)
- CS.CL + Expert Verification Papers (20)
- Pairwise Preference Papers (70)
- CS.CL + Pairwise Preference Papers (62)
- CS.AI + Expert Verification Papers (15)
- CS.AI + Pairwise Preference Papers (42)
- Rubric Rating Papers (17)
- CS.CL + Rubric Rating Papers (16)
- General + Pairwise Preference Papers (43)
- Expert Verification Or Rubric Rating Papers (39)
- CS.CL + Math Papers (84)
- Long Horizon Papers (82)
- CS.CL + Human Eval Papers (35)
- CS.CL + Long Horizon Papers (58)
- Expert Verification + Medicine Papers (11)
- Human Eval Papers (38)
Benchmark Hubs
- Retrieval Benchmark Papers (115)
- Retrieval Benchmark Papers (Last 365 Days) (111)
- Retrieval Or GSM8K Benchmark Papers (128)
- Retrieval Or MMLU Benchmark Papers (128)
- Retrieval Or MATH Benchmark Papers (135)
- Retrieval Or DROP Benchmark Papers (129)
- Retrieval Or MMLU Or GSM8K Benchmark Papers (139)
- Retrieval Or MATH Or GSM8K Benchmark Papers (148)
- Retrieval Or MATH Or MMLU Benchmark Papers (146)
- Retrieval Or DROP Or GSM8K Benchmark Papers (141)