HFEPX Metric Hub
Kappa In CS.CL Papers
Updated from current HFEPX corpus (Apr 11, 2026). 11 papers are grouped in this metric page.
Common evaluation modes: Automatic Metrics, Human Eval. Most common rater population: Domain Experts. Common annotation unit: Pairwise. Frequent quality control: Inter-Annotator Agreement Reported. Common metric signal: kappa. Use this page to compare protocol setup, judge behavior, and labeling design decisions before running new evaluation experiments. The newest paper in this set is from Mar 31, 2026.
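For readers comparing agreement numbers across these papers, the sketch below shows one way a kappa score could be computed on pairwise labels. It assumes Cohen's kappa for exactly two raters; this page does not say which kappa variant each paper reports, and the function name `cohen_kappa` and the sample labels are hypothetical, chosen only for illustration.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items.

    Illustrative sketch only: assumes two raters and nominal labels.
    Papers in this set may report a different variant (for example
    Fleiss' kappa for more than two raters).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")

    n = len(rater_a)

    # Observed agreement: fraction of items where both raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each label, the product of the two raters'
    # marginal proportions, summed over labels.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label for every item; kappa is undefined.
        return float("nan")

    return (observed - expected) / (1.0 - expected)


# Hypothetical pairwise preferences ("A" or "B" wins) from two domain experts.
expert_1 = ["A", "A", "B", "B", "A", "B", "A", "A"]
expert_2 = ["A", "B", "B", "B", "A", "B", "A", "B"]

kappa = cohen_kappa(expert_1, expert_2)
print(round(kappa, 3))  # 0.529
```

In this example the raters agree on 6 of 8 items (0.75 raw agreement), but chance agreement is 0.469, so kappa is about 0.529. This gap is why raw agreement and kappa should not be compared directly when reading across papers.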