
RVR: Retrieve-Verify-Retrieve for Comprehensive Question Answering

Deniz Qian, Hung-Ting Chen, Eunsol Choi · Feb 20, 2026 · Citations: 0

Abstract

Comprehensively retrieving diverse documents is crucial for addressing queries that admit a wide range of valid answers. We introduce retrieve-verify-retrieve (RVR), a multi-round retrieval framework designed to maximize answer coverage. Initially, a retriever takes the original query and returns a candidate document set, followed by a verifier that identifies a high-quality subset. For subsequent rounds, the query is augmented with previously verified documents to uncover answers not covered in earlier rounds. RVR is effective even with off-the-shelf retrievers, and fine-tuning retrievers for our inference procedure brings further gains. Our method outperforms baselines, including agentic search approaches, achieving at least a 10% relative and 3% absolute gain in complete recall percentage on a multi-answer retrieval dataset (QAMPARI). We also see consistent gains on two out-of-domain datasets (QUEST and WebQuestionsSP) across different base retrievers. Our work presents a promising iterative approach to comprehensive answer recall, leveraging a verifier and adapting retrievers to a new inference scenario.
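
The abstract's description of the loop suggests a simple control flow, sketched below. The function names (`retrieve`, `verify`), the fixed round budget, and the string-concatenation style of query augmentation are illustrative assumptions for this sketch, not the paper's implementation.

```python
from typing import Callable, List

def rvr(
    query: str,
    retrieve: Callable[[str], List[str]],          # query -> candidate documents
    verify: Callable[[str, List[str]], List[str]],  # keeps a high-quality subset
    rounds: int = 3,
) -> List[str]:
    """Minimal sketch of a retrieve-verify-retrieve loop, per the abstract."""
    verified: List[str] = []
    augmented_query = query
    for _ in range(rounds):
        candidates = retrieve(augmented_query)
        # Keep documents the verifier accepts that are not already covered.
        new_docs = [d for d in verify(query, candidates) if d not in verified]
        if not new_docs:
            break  # no new coverage gained; stop early
        verified.extend(new_docs)
        # Augment the query with verified documents so the next round
        # targets answers not yet covered.
        augmented_query = query + "\n" + "\n".join(verified)
    return verified
```

Because the augmented query carries the already-verified documents, each round steers the retriever toward the remaining, uncovered answers rather than re-fetching the same ones.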

Human Data Lens

  • Uses human feedback: No
  • Feedback types: None
  • Rater population: Unknown
  • Unit of annotation: Unknown
  • Expertise required: General

Evaluation Lens

  • Evaluation modes: Automatic Metrics
  • Agentic eval: None
  • Quality controls: Not reported
  • Confidence: 0.45
  • Flags: low_signal, possible_false_positive

Research Summary

Contribution Summary

  • Comprehensively retrieving diverse documents is crucial for addressing queries that admit a wide range of valid answers.
  • We introduce retrieve-verify-retrieve (RVR), a multi-round retrieval framework designed to maximize answer coverage.
  • Initially, a retriever takes the original query and returns a candidate document set, followed by a verifier that identifies a high-quality subset.

Why It Matters For Eval

  • Our method outperforms baselines, including agentic search approaches, achieving at least 10% relative and 3% absolute gain in complete recall percentage on a multi-answer retrieval dataset (QAMPARI).
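
Assuming "complete recall percentage" denotes the fraction of queries for which every gold answer is covered by the retrieved set (a common reading for multi-answer retrieval; the paper's exact definition may differ), a minimal computation might look like this:

```python
from typing import Dict, Set

def complete_recall_pct(
    gold: Dict[str, Set[str]],       # query -> set of gold answers
    retrieved: Dict[str, Set[str]],  # query -> answers covered by retrieved docs
) -> float:
    """Percentage of queries whose gold answers are all covered."""
    complete = sum(
        1 for q, answers in gold.items() if answers <= retrieved.get(q, set())
    )
    return 100.0 * complete / len(gold)
```

Under this reading, a 3% absolute gain means 3 more queries per hundred have every valid answer covered, which is a strict, all-or-nothing criterion compared to per-answer recall.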
