Verifying Chain-of-Thought Reasoning via Its Computational Graph
Zheng Zhao, Yeskendir Koishekenov, Xianjun Yang, Naila Murray, Nicola Cancedda
Abstract
Current Chain-of-Thought (CoT) verification methods predict reasoning correctness based on outputs (black-box) or activations (gray-box), but offer limited insight into why a computation fails. We introduce a white-box method: Circuit-based Reasoning Verification (CRV). We hypothesize that attribution graphs of correct CoT steps, viewed as execution traces of the model's latent reasoning circuits, possess distinct st...
Summary
This paper introduces Circuit-based Reasoning Verification (CRV), a white-box method for verifying chain-of-thought (CoT) reasoning by constructing attribution graphs over reasoning steps and training classifiers on their structural features. The authors show that structural signatures in these graphs differ between correct and incorrect reasoning, are domain-specific across synthetic Boolean, synthetic arithmetic, and GSM8K tasks, and can guide targeted interventions on transcoder features to correct faulty reasoning trajectories.
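To make "structural features of an attribution graph" concrete, here is a minimal sketch that reduces a weighted directed graph to a flat feature dictionary a classifier could consume. The `Edge` type, the node names, the pruning `threshold`, and the particular features (size, density, mean absolute weight, maximum in-degree) are all illustrative assumptions. The summary does not specify the paper's actual feature set or graph construction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    src: str       # upstream node (hypothetical naming, e.g. a token or feature)
    dst: str       # downstream node
    weight: float  # attribution strength from src to dst


def structural_features(edges, threshold=0.0):
    """Summarise a weighted directed graph as a flat feature dict.

    Edges whose absolute weight is at or below `threshold` are pruned first.
    The chosen features are assumptions for illustration only.
    """
    kept = [e for e in edges if abs(e.weight) > threshold]
    nodes = {e.src for e in kept} | {e.dst for e in kept}
    n, m = len(nodes), len(kept)

    # Count incoming edges per node to find the most "converged-on" node.
    in_degree = {}
    for e in kept:
        in_degree[e.dst] = in_degree.get(e.dst, 0) + 1

    return {
        "num_nodes": n,
        "num_edges": m,
        "density": m / (n * (n - 1)) if n > 1 else 0.0,
        "mean_abs_weight": sum(abs(e.weight) for e in kept) / m if m else 0.0,
        "max_in_degree": max(in_degree.values(), default=0),
    }


# Toy graph for a single reasoning step (made-up values).
toy_graph = [
    Edge("tok:3", "feat:add", 0.8),
    Edge("tok:4", "feat:add", 0.6),
    Edge("feat:add", "logit:7", 0.9),
    Edge("tok:3", "logit:7", 0.05),
]
features = structural_features(toy_graph, threshold=0.1)
print(features)
```

One feature dictionary per reasoning step, paired with a step-level correctness label, would form the training set for whichever classifier is chosen.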
Key Contributions
- Proposes Circuit-based Reasoning Verification (CRV), a white-box CoT verification method that builds attribution graphs over reasoning steps and classifies their structural features to detect errors.
- Demonstrates that structural fingerprints of attribution graphs are highly predictive of reasoning correctness and differ between correct and incorrect CoT steps.
- Finds that error-related structural signatures are strongly domain-specific across synthetic Boolean, synthetic arithmetic, and GSM8K reasoning tasks.
- Uses CRV-derived analysis to perform targeted interventions on individual transcoder features, showing that suppressing specific features can correct faulty reasoning.
- Curates step-level labeled datasets for synthetic Boolean, synthetic arithmetic, and GSM8K reasoning traces, and evaluates CRV against alternative diagnostic classifiers.
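The intervention idea in the contributions above (suppressing an individual transcoder feature) can be sketched on a toy model. The example below assumes a transcoder is a ReLU encoder into a feature space followed by a linear decoder, and that "suppressing" means zeroing a feature's activation before decoding. The function names, weights, and shapes are hypothetical and are not taken from the paper.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def matvec(matrix, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def transcoder_forward(x, w_enc, w_dec, suppress=()):
    """Encode x into features, optionally zero some, then decode.

    w_enc: one row per feature        (num_features x d_in)
    w_dec: one row per output channel (d_out x num_features)
    suppress: indices of features to clamp to zero (the intervention)
    """
    feats = relu(matvec(w_enc, x))
    for idx in suppress:
        feats[idx] = 0.0
    return matvec(w_dec, feats), feats


# Toy weights (made up): 2 inputs, 3 features, 2 outputs.
w_enc = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_dec = [[1.0, 0.0, 0.5], [0.0, 1.0, -2.0]]
x = [1.0, 2.0]

baseline_out, baseline_feats = transcoder_forward(x, w_enc, w_dec)
patched_out, patched_feats = transcoder_forward(x, w_enc, w_dec, suppress=(2,))
print(baseline_out, patched_out)
```

In a real model the patched output would be written back into the forward pass, and the step would be re-decoded to check whether the error disappears.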
Reproducibility Notes
- No verified public implementation; all reproduction must be paper-driven.
- Reproduction plan relies on reconstructing methods from the PDF and citation graph.
- Step-level labeling pipelines and attribution-graph details must be re-specified.
- Expect multi-day setup and compute for meaningful CRV experiments.
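Since the step-level labeling pipeline has to be re-specified, a first approximation for the synthetic arithmetic setting might look like the sketch below. It assumes each step is a line of the form `a op b = c` over integers, which is a hypothetical trace format chosen for illustration. The paper's actual format and labeling procedure are not described in this summary.

```python
import operator
import re

# Supported binary operators (an assumption; extend as needed).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
STEP_RE = re.compile(r"^\s*(-?\d+)\s*([+\-*])\s*(-?\d+)\s*=\s*(-?\d+)\s*$")


def label_steps(steps):
    """Return one label per step: True, False, or None if unparseable."""
    labels = []
    for step in steps:
        match = STEP_RE.match(step)
        if match is None:
            labels.append(None)  # free text or an unsupported format
            continue
        a, op, b, claimed = match.groups()
        labels.append(OPS[op](int(a), int(b)) == int(claimed))
    return labels


trace = ["3 + 4 = 7", "7 * 2 = 15", "15 - 5 = 10", "so the answer is 10"]
labels = label_steps(trace)
print(labels)
```

This labels each step locally: the third step is marked correct even though it builds on the wrong value from the second. Whether downstream steps should inherit an upstream error is a design decision to record before any experiments are run.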
Results & Benchmarks
| Task | Dataset | Metric | Value |
|---|---|---|---|
| Natural language processing | Synthetic (Boolean) | Final Answer Accuracy | 98.4 |
| Natural language processing | Synthetic (Arithmetic) | % Correct | 98.8 |
| Natural language processing | GSM8K | % Correct | 93.4 |
Hardware Requirements
- No specific hardware requirements are listed; expect multi-day setup and compute for a meaningful reproduction.
Best Implementation
No maintained implementation has been confirmed for this paper yet.
Use the Reproduction Path section below for the current action plan.
Reproduction Path
Follow this baseline workflow to decide if this paper is worth immediate prototyping.
1. Use the paper and benchmark evidence to scope a baseline reproduction plan.
2. Track assumptions and missing details in an experiment log before coding.
Additional Implementations
No additional verified repositories were found.
Hugging Face Artifacts
No direct paper-linked artifacts were found.