MemoryArena: Benchmarking Agent Memory in Interdependent Multi-Session Agentic Tasks
Zexue He, Yu Wang, Churan Zhi, Yuanzhe Hu, Tzu-Ping Chen, Lang Yin · Feb 18, 2026
Citations: 0
Pairwise Preference · Automatic Metrics · Web Browsing · General
- Existing evaluations of agents with memory typically assess memorization and action in isolation.
- To evaluate memory in interdependent multi-session agentic tasks, we introduce MemoryArena, a unified evaluation gym for benchmarking agent memory in multi-session Memory-Agent-Environment loops.