BiasCause: Evaluate Socially Biased Causal Reasoning of Large Language Models
Tian Xie, Tongxin Yin, Vaishakh Keshava, Xueru Zhang, Siddhartha Reddy Jonnalagadda · Apr 8, 2025 · Citations: 0
How to use this page
Provisional trust: This page is a lightweight research summary built from the abstract and metadata while deeper extraction catches up.
Best use: Background context only.
What to verify: Read the full paper before copying any benchmark, metric, or protocol choices.
Evidence quality: Provisional (derived from the abstract and metadata only).
Abstract
While large language models (LLMs) play increasingly significant roles in society, research shows they continue to generate content that reflects social bias against sensitive groups. Existing benchmarks effectively identify these biases, but a critical gap remains in understanding the underlying reasoning processes that produce them. This paper addresses this gap by evaluating the causal reasoning of LLMs when they answer socially biased questions. We propose a formal schema that categorizes causal reasoning into three types (mistaken, biased, and contextually grounded). We then synthesize 1,788 questions covering eight sensitive attributes, with each set of questions designed to probe a specific type of causal reasoning. All questions are manually validated, and each prompts the LLM to generate a causal graph underlying its answer. We evaluate four state-of-the-art LLMs and find that all of them exhibit biased causal reasoning on most of the questions designed to elicit it. Moreover, we find that LLMs are also prone to "mistaken-biased" reasoning, in which they first confuse correlation with causality to infer sensitive group membership and then apply biased causal reasoning. Finally, by examining the cases where LLMs produce unbiased causal reasoning, we identify three strategies LLMs employ to avoid bias (explicitly refusing to answer, avoiding sensitive attributes, and adding contextual restrictions), which offer insights for future debiasing efforts.
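The abstract does not specify the prompt format or the labeling procedure, so the following is a minimal sketch of how such a pipeline could look, assuming the model returns its causal graph as plain "cause -> effect" edge lines and that a graph is flagged as biased when a sensitive attribute appears among the causes of the outcome. Every name here (ReasoningType, parse_causal_graph, classify, the attribute list) is a hypothetical illustration, not the paper's implementation.

```python
from enum import Enum
import re

class ReasoningType(Enum):
    # The three reasoning types from the paper's schema.
    MISTAKEN = "mistaken"                          # correlation treated as causation
    BIASED = "biased"                              # sensitive attribute used as a cause
    CONTEXTUALLY_GROUNDED = "contextually_grounded"

# Illustrative subset; the paper covers eight sensitive attributes.
SENSITIVE_ATTRIBUTES = {"gender", "race", "age", "religion"}

def parse_causal_graph(response: str) -> list[tuple[str, str]]:
    """Parse 'cause -> effect' lines of a model response into edge pairs."""
    edges = []
    for line in response.splitlines():
        m = re.match(r"\s*(.+?)\s*->\s*(.+?)\s*$", line)
        if m:
            edges.append((m.group(1).lower(), m.group(2).lower()))
    return edges

def ancestors(edges: list[tuple[str, str]], node: str) -> set[str]:
    """Collect all transitive causes of `node` in the edge list (simple DFS)."""
    parents: dict[str, list[str]] = {}
    for cause, effect in edges:
        parents.setdefault(effect, []).append(cause)
    seen: set[str] = set()
    stack = [node]
    while stack:
        for p in parents.get(stack.pop(), []):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def classify(edges: list[tuple[str, str]], outcome: str) -> ReasoningType:
    """Flag the graph as biased if a sensitive attribute causes the outcome."""
    for node in ancestors(edges, outcome):
        if any(attr in node for attr in SENSITIVE_ATTRIBUTES):
            return ReasoningType.BIASED
    return ReasoningType.CONTEXTUALLY_GROUNDED

if __name__ == "__main__":
    response = "applicant gender -> hiring decision\nyears of experience -> hiring decision"
    print(classify(parse_causal_graph(response), "hiring decision"))
    # ReasoningType.BIASED
```

Note that the "mistaken" type (an edge asserted from mere correlation) cannot be read off the graph structure alone; per the abstract, the paper probes it with question sets designed specifically to elicit that reasoning type.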