Skip to content
← Back to explorer

Toward Safe and Human-Aligned Game Conversational Recommendation via Multi-Agent Decomposition

Zheng Hui, Xiaokai Wei, Yexi Jiang, Kevin Gao, Chen Wang, Frank Ong, Se-eun Yoon, Rachit Pareek, Michelle Gong · Apr 26, 2025 · Citations: 0

Abstract

Conversational recommender systems (CRS) have advanced with large language models, showing strong results in domains like movies. These domains typically involve fixed content and passive consumption, where user preferences can be matched by genre or theme. In contrast, games present distinct challenges: fast-evolving catalogs, interaction-driven preferences (e.g., skill level, mechanics, hardware), and increased risk of unsafe responses in open-ended conversation. We propose MATCHA, a multi-agent framework for CRS that assigns specialized agents for intent parsing, tool-augmented retrieval, multi-LLM ranking with reflection, explanation, and risk control which enabling finer personalization, long-tail coverage, and stronger safety. Evaluated on real user request dataset, MATCHA outperforms six baselines across eight metrics, improving Hit@5 by 20%, reducing popularity bias by 24%, and achieving 97.9% adversarial defense. Human and virtual-judge evaluations confirm improved explanation quality and user alignment.

Human Data Lens

  • Uses human feedback: Yes
  • Feedback types: Pairwise Preference
  • Rater population: Unknown
  • Unit of annotation: Ranking
  • Expertise required: General

Evaluation Lens

  • Evaluation modes: Automatic Metrics
  • Agentic eval: Multi Agent
  • Quality controls: Not reported
  • Confidence: 0.75
  • Flags: None

Research Summary

Contribution Summary

  • Conversational recommender systems (CRS) have advanced with large language models, showing strong results in domains like movies.
  • These domains typically involve fixed content and passive consumption, where user preferences can be matched by genre or theme.
  • In contrast, games present distinct challenges: fast-evolving catalogs, interaction-driven preferences (e.g., skill level, mechanics, hardware), and increased risk of unsafe responses in open-ended conversation.

Why It Matters For Eval

  • These domains typically involve fixed content and passive consumption, where user preferences can be matched by genre or theme.
  • In contrast, games present distinct challenges: fast-evolving catalogs, interaction-driven preferences (e.g., skill level, mechanics, hardware), and increased risk of unsafe responses in open-ended conversation.

Related Papers