- Measuring Faithfulness Depends on How You Measure: Classifier Sensitivity in LLM Chain-of-Thought Evaluation
Richard J. Young · Mar 20, 2026 · Citations: 0
Automatic Metrics General
Three classifiers (a regex-only detector, a regex-plus-LLM pipeline, and a Claude Sonnet 4 judge) are applied to 10,276 influenced reasoning traces from 12 open-weight models spanning 9 families and 7B to 1T parameters.
- PONTE: Personalized Orchestration for Natural Language Trustworthy Explanations
Vittoria Vineis, Matteo Silvestri, Lorenzo Antonelli, Filippo Betello, Gabriele Tolomei · Mar 6, 2026 · Citations: 0
Human Eval General
To address these challenges, we present PONTE (Personalized Orchestration for Natural language Trustworthy Explanations), a human-in-the-loop framework for adaptive and reliable XAI narratives.
- Reason and Verify: A Framework for Faithful Retrieval-Augmented Generation
Eeham Khan, Luis Rodriguez, Marc Queudot · Mar 10, 2026 · Citations: 0
Automatic Metrics Medicine
We evaluate this framework on the BioASQ and PubMedQA benchmarks, specifically analyzing the impact of dynamic in-context learning and reranking under constrained token budgets.
- From Evidence-Based Medicine to Knowledge Graph: Retrieval-Augmented Generation for Sports Rehabilitation and a Domain Benchmark
Jinning Zhang, Jie Song, Wenhui Tu, Zecheng Li, Jingxuan Li · Jan 1, 2026 · Citations: 0
Automatic Metrics Medicine
Validated in sports rehabilitation, we release a knowledge graph (357,844 nodes, 371,226 edges) and a benchmark of 1,637 QA pairs.
- DeceptGuard: A Constitutional Oversight Framework For Detecting Deception in LLM Agents
Snehasis Mukhopadhyay · Mar 14, 2026 · Citations: 0
Automatic Metrics Simulation Env General
We introduce DECEPTGUARD, a unified framework that systematically compares three monitoring regimes: black-box monitors (actions and outputs only), CoT-aware monitors (additionally observing the agent's chain-of-thought reasoning trace),…
- PaperBanana: Automating Academic Illustration for AI Scientists
Dawei Zhu, Rui Meng, Yale Song, Xiyu Wei, Sujian Li · Jan 30, 2026 · Citations: 0
Automatic Metrics General
To lift this burden, we introduce PaperBanana, an agentic framework for automated generation of publication-ready academic illustrations.
- LLM-as-a-Judge for Time Series Explanations
Preetham Sivalingam, Murari Mandal, Saurabh Deshpande, Dhruv Kumar · Apr 2, 2026 · Citations: 0
LLM As Judge Automatic Metrics General
Although modern models generate textual interpretations of numerical signals, existing evaluation methods are limited: reference-based similarity metrics and consistency-checking models require ground-truth explanations, while traditional…
- Replayable Financial Agents: A Determinism-Faithfulness Assurance Harness for Tool-Using LLM Agents
Raffi Khatchadourian · Jan 17, 2026 · Citations: 0
Automatic Metrics General
We introduce the Determinism-Faithfulness Assurance Harness (DFAH), a framework for measuring trajectory determinism, decision determinism, and evidence-conditioned faithfulness in tool-using agents deployed in financial services.
- Counterfactual Simulation Training for Chain-of-Thought Faithfulness
Peter Hase, Christopher Potts · Feb 24, 2026 · Citations: 0
Automatic Metrics Simulation Env Coding
In this paper, we introduce a training method called Counterfactual Simulation Training (CST), which aims to improve CoT faithfulness by rewarding CoTs that enable a simulator to accurately predict a model's outputs over counterfactual…
- Verify Before You Commit: Towards Faithful Reasoning in LLM Agents via Self-Auditing
Wenhao Yuan, Chenchen Lin, Jian Chen, Jinfeng Xu, Xuehe Wang · Apr 9, 2026 · Citations: 0
Automatic Metrics General
In large language model (LLM) agents, reasoning trajectories are treated as reliable internal beliefs for guiding actions and updating memory.
- CiteAudit: You Cited It, But Did You Read It? A Benchmark for Verifying Scientific References in the LLM Era
Zhengqing Yuan, Kaiwen Shi, Zheyuan Zhang, Lichao Sun, Nitesh V. Chawla · Feb 26, 2026 · Citations: 0
Automatic Metrics General
Meanwhile, rapidly growing reference lists make manual verification impractical, and existing automated tools remain fragile to noisy and heterogeneous citation formats and lack standardized evaluation.
- One Model for All: Multi-Objective Controllable Language Models
Qiang He, Yucheng Yang, Tianyi Zhou, Meng Fang, Mykola Pechenizkiy · Apr 6, 2026 · Citations: 0
- Retrieval Improvements Do Not Guarantee Better Answers: A Study of RAG for AI Policy QA
Saahil Mathur, Ryan David Rittner, Vedant Ajit Thakur, Daniel Stuart Schiff, Tunazzina Islam · Mar 25, 2026 · Citations: 0
- Steering LLMs for Culturally Localized Generation
Simran Khanuja, Hongbin Liu, Shujian Zhang, John Lambert, Mingqing Chen · Mar 24, 2026 · Citations: 0
- When AI Shows Its Work, Is It Actually Working? Step-Level Evaluation Reveals Frontier Language Models Frequently Bypass Their Own Reasoning
Abhinaba Basu, Pavan Chakraborty · Mar 24, 2026 · Citations: 0
- Lie to Me: How Faithful Is Chain-of-Thought Reasoning in Reasoning Models?
Richard J. Young · Mar 23, 2026 · Citations: 0
- WeNLEX: Weakly Supervised Natural Language Explanations for Multilabel Chest X-ray Classification
Isabel Rio-Torto, Jaime S. Cardoso, Luís F. Teixeira · Mar 19, 2026 · Citations: 0
- L2GTX: From Local to Global Time Series Explanations
Ephrem Tibebe Mekonnen, Luca Longo, Lucas Rizzo, Pierpaolo Dondio · Mar 13, 2026 · Citations: 0
- BLooP: Zero-Shot Abstractive Summarization using Large Language Models with Bigram Lookahead Promotion
Varun Iyer, Cornelia Caragea · Mar 12, 2026 · Citations: 0
- Leveraging LLM Parametric Knowledge for Fact Checking without Retrieval
Artem Vazhentsev, Maria Marina, Daniil Moskovskiy, Sergey Pletenev, Mikhail Seleznyov · Mar 5, 2026 · Citations: 0
- C2-Faith: Benchmarking LLM Judges for Causal and Coverage Faithfulness in Chain-of-Thought Reasoning
Avni Mittal, Rauno Arike · Mar 5, 2026 · Citations: 0
- SpaceControl: Introducing Test-Time Spatial Control to 3D Generative Modeling
Elisabetta Fedele, Francis Engelmann, Ian Huang, Or Litany, Marc Pollefeys · Dec 5, 2025 · Citations: 0
- T-FIX: Text-Based Explanations with Features Interpretable to eXperts
Shreya Havaldar, Weiqiu You, Chaehyeon Kim, Anton Xue, Helen Jin · Nov 6, 2025 · Citations: 0
- VeriTrail: Closed-Domain Hallucination Detection with Traceability
Dasha Metropolitansky, Jonathan Larson · May 27, 2025 · Citations: 0