Counterfactual Simulation Training for Chain-of-Thought Faithfulness
Peter Hase, Christopher Potts · Feb 24, 2026
Citations: 0
Automatic Metrics · Simulation Env · Coding