PoSh: Using Scene Graphs To Guide LLMs-as-a-Judge For Detailed Image Descriptions
Amith Ananthram, Elias Stengel-Eskin, Lorena A. Bradford, Julia Demarest, Adam Purvis, Keith Krut · Oct 21, 2025
Citations: 0
Rubric Rating · Human Eval · LLM As Judge · General
- While vision-language models (VLMs) have advanced into detailed image description, evaluation remains a challenge.
- In this work, we introduce PoSh, a metric for detailed image description that uses scene graphs as structured rubrics to guide LLMs-as-a-Judge, producing aggregate scores grounded in fine-grained errors (e.g.