Lingxiang Hu, Yiding Sun, Tianle Xia, Wenwei Li, Ming Xu, Liqun Liu · Feb 15, 2026
- While Large Language Model (LLM) agents have made remarkable progress on complex reasoning tasks, evaluating their performance in real-world environments remains a critical open problem.
- To address this gap, we propose AD-Bench, a benchmark built on the real-world business requirements of advertising and marketing platforms.