APEX-Agents
Bertie Vidgen, Austin Mann, Abby Fennelly, John Wright Stanly, Lucas Rothman, Marco Burstein · Jan 20, 2026
Citations: 0
Rubric RatingExpert Verification Automatic Metrics Long Horizon Law
- We introduce the AI Productivity Index for Agents (APEX-Agents), a benchmark for assessing whether AI agents can execute long-horizon, cross-application tasks created by investment banking analysts, management consultants, and corporate…
- We test eight agents for the leaderboard using Pass@1.