Epistemic Bias Injection: Biasing LLMs via Selective Context Retrieval
Hao Wu, Prateek Saxena · Nov 30, 2025 · Citations: 0
How to use this paper page
Coverage: StaleUse this page to decide whether the paper is strong enough to influence an eval design. It summarizes the abstract plus available structured metadata. If the signal is thin, use it as background context and compare it against stronger hub pages before making protocol choices.
Best use
Background context only
Metadata: StaleTrust level
Provisional
Signals: StaleWhat still needs checking
Structured extraction is still processing; current fields are metadata-first.
Signal confidence unavailable
Abstract
When answering user queries, LLMs often retrieve knowledge from external sources stored in retrieval-augmented generation (RAG) databases. These are often populated from unvetted sources, e.g. the open web, and can contain maliciously crafted data. This paper studies attacks that can manipulate the context retrieved by LLMs from such RAG databases. Prior work on such context manipulation primarily injects false or toxic content, which can often be detected by fact-checking or linguistic analysis. We reveal a more subtle threat, Epistemic Bias Injection (EBI), in which adversaries inject factually correct yet epistemically biased passages that systematically emphasize one side of a multi-viewpoint issue. Although linguistically coherent and truthful, such adversarial passages effectively crowd out alternative viewpoints and steer model outputs toward an attacker-chosen stance. As a core contribution, we propose a novel characterization of the problem: We give a geometric metric that quantifies epistemic bias. This metric can be computed directly on embeddings of text passages retrieved by the LLM. Leveraging this metric, we construct EBI attacks and develop a lightweight prototype defense called BiasDef for them. We evaluate them both on a comprehensive benchmark constructed from public question answering datasets.Our results show that: (1) the proposed attack induces significant perspective shifts, effectively evading existing retrieval-based sanitization defenses, and (2) BiasDef substantially reduces adversarial retrieval and bias in LLM's answers. Overall, this demonstrates the new threat as well as the ease of employing epistemic bias metrics for filtering in RAG-enabled LLMs.