300 canonical paper links on this archive page.
- Personalized RewardBench: Evaluating Reward Models with Human Aligned Personalization | arxiv-2604.07343 | Sparse Blocked context only | Apr 8, 2026
- A Systematic Study of Retrieval Pipeline Design for Retrieval-Augmented Medical Question Answering | arxiv-2604.07274 | Sparse Blocked context only | Apr 8, 2026
- Joint Optimization of Reasoning and Dual-Memory for Self-Learning Diagnostic Agent | arxiv-2604.07269 | Sparse Blocked context only | Apr 8, 2026
- How Much LLM Does a Self-Revising Agent Actually Need? | arxiv-2604.07236 | Sparse Blocked context only | Apr 8, 2026
- TraceSafe: A Systematic Assessment of LLM Guardrails on Multi-Step Tool-Calling Trajectories | arxiv-2604.07223 | Sparse Blocked context only | Apr 8, 2026
- MoZoo: Unleashing Video Diffusion power in animal fur and muscle simulation | arxiv-2605.13857 | Sparse Blocked context only | Apr 8, 2026
- Agent-Driven Corpus Linguistics: A Framework for Autonomous Linguistic Discovery | arxiv-2604.07189 | Sparse Blocked context only | Apr 8, 2026
- STRIDE-ED: A Strategy-Grounded Stepwise Reasoning Framework for Empathetic Dialogue Systems | arxiv-2604.07100 | Sparse Blocked context only | Apr 8, 2026
- Is Cross-Lingual Transfer in Bilingual Models Human-Like? A Study with Overlapping Word Forms in Dutch and English | arxiv-2604.07067 | Sparse Blocked context only | Apr 8, 2026
- Sell More, Play Less: Benchmarking LLM Realistic Selling Skill | arxiv-2604.07054 | Sparse Blocked context only | Apr 8, 2026
- ReDAct: Uncertainty-Aware Deferral for LLM Agents | arxiv-2604.07036 | Curated Related Blocked context only | Apr 8, 2026
- Gemma 4, Phi-4, and Qwen3: Accuracy-Efficiency Tradeoffs in Dense and MoE Reasoning Language Models | arxiv-2604.07035 | Sparse Blocked context only | Apr 8, 2026
- MARS: Enabling Autoregressive Models Multi-Token Generation | arxiv-2604.07023 | Sparse Blocked context only | Apr 8, 2026
- DTCRS: Dynamic Tree Construction for Recursive Summarization | arxiv-2604.07012 | Sparse Blocked context only | Apr 8, 2026
- iTAG: Inverse Design for Natural Text Generation with Accurate Causal Graph Annotations | arxiv-2604.06902 | Sparse Blocked context only | Apr 8, 2026
- On the Step Length Confounding in LLM Reasoning Data Selection | arxiv-2604.06834 | Sparse Blocked context only | Apr 8, 2026
- Fast-dVLM: Efficient Block-Diffusion VLM via Direct Conversion from Autoregressive VLM | arxiv-2604.06832 | Sparse Blocked context only | Apr 8, 2026
- WRAP++: Web discoveRy Amplified Pretraining | arxiv-2604.06829 | Sparse Blocked context only | Apr 8, 2026
- Beyond Accuracy: Diagnosing Algebraic Reasoning Failures in LLMs Across Nine Complexity Dimensions | arxiv-2604.06799 | Sparse Blocked context only | Apr 8, 2026
- GCoT-Decoding: Unlocking Deep Reasoning Paths for Universal Question Answering | arxiv-2604.06794 | Sparse Blocked context only | Apr 8, 2026
- From Perception to Autonomous Computational Modeling: A Multi-Agent Approach | arxiv-2604.06788 | Sparse Blocked context only | Apr 8, 2026
- Geometric Properties of the Voronoi Tessellation in Latent Semantic Manifolds of Large Language Models | arxiv-2604.06767 | Sparse Blocked context only | Apr 8, 2026
- How Long Reasoning Chains Influence LLMs' Judgment of Answer Factuality | arxiv-2604.06756 | Sparse Blocked context only | Apr 8, 2026
- Select-then-Solve: Paradigm Routing as Inference-Time Optimization for LLM Agents | arxiv-2604.06753 | Sparse Blocked context only | Apr 8, 2026
- StructKV: Preserving the Structural Skeleton for Scalable Long-Context Inference | arxiv-2604.06746 | Sparse Blocked context only | Apr 8, 2026
- WisdomInterrogatory (LuWen): An Open-Source Legal Large Language Model Technical Report | arxiv-2604.06737 | Sparse Blocked context only | Apr 8, 2026
- Steering the Verifiability of Multimodal AI Hallucinations | arxiv-2604.06714 | Sparse Blocked context only | Apr 8, 2026
- Adaptive Prompt Structure Factorization: A Framework for Self-Discovering and Optimizing Compositional Prompt Programs | arxiv-2604.06699 | Sparse Blocked context only | Apr 8, 2026
- ChemVLR: Prioritizing Reasoning in Perception for Chemical Vision-Language Understanding | arxiv-2604.06685 | Sparse Blocked context only | Apr 8, 2026
- Argus: Reorchestrating Static Analysis via a Multi-Agent Ensemble for Full-Chain Security Vulnerability Detection | arxiv-2604.06633 | Sparse Blocked context only | Apr 8, 2026
- DiffuMask: Diffusion Language Model for Token-level Prompt Pruning | arxiv-2604.06627 | Sparse Blocked context only | Apr 8, 2026
- Scientific Knowledge-driven Decoding Constraints Improving the Reliability of LLMs | arxiv-2604.06603 | Sparse Blocked context only | Apr 8, 2026
- LLM-based Schema-Guided Extraction and Validation of Missing-Person Intelligence from Heterogeneous Data Sources | arxiv-2604.06571 | Sparse Blocked context only | Apr 8, 2026
- CCD-CBT: Multi-Agent Therapeutic Interaction for CBT Guided by Cognitive Conceptualization Diagram | arxiv-2604.06551 | Sparse Blocked context only | Apr 8, 2026
- Generate, Filter, Control, Replay: A Comprehensive Survey of Rollout Strategies for LLM Reinforcement Learning | arxiv-2605.02913 | Sparse Blocked context only | Apr 8, 2026
- MedConclusion: A Benchmark for Biomedical Conclusion Generation from Structured Abstracts | arxiv-2604.06505 | Sparse Blocked context only | Apr 7, 2026
- Closing the Speech-Text Gap with Limited Audio for Effective Domain Adaptation in LLM-Based ASR | arxiv-2604.06487 | Sparse Blocked context only | Apr 7, 2026
- ValueGround: Evaluating Culture-Conditioned Visual Value Grounding in MLLMs | arxiv-2604.06484 | Sparse Blocked context only | Apr 7, 2026
- DataSTORM: Deep Research on Large-Scale Databases using Exploratory Data Analysis and Data Storytelling | arxiv-2604.06474 | Sparse Blocked context only | Apr 7, 2026
- Multi-objective Evolutionary Merging Enables Efficient Reasoning Models | arxiv-2604.06465 | Sparse Blocked context only | Apr 7, 2026
- Context-Aware Dialectal Arabic Machine Translation with Interactive Region and Register Selection | arxiv-2604.06456 | Sparse Blocked context only | Apr 7, 2026
- Learning to Interrupt in Language-based Multi-agent Communication | arxiv-2604.06452 | Sparse Blocked context only | Apr 7, 2026
- The Depth Ceiling: On the Limits of Large Language Models in Discovering Latent Planning | arxiv-2604.06427 | Sparse Blocked context only | Apr 7, 2026
- State-of-the-Art Arabic Language Modeling with Sparse MoE Fine-Tuning and Chain-of-Thought Distillation | arxiv-2604.06421 | Sparse Blocked context only | Apr 7, 2026
- Attention Flows: Tracing LLM Conceptual Engagement via Story Summaries | arxiv-2604.06416 | Sparse Blocked context only | Apr 7, 2026
- Application-Driven Pedagogical Knowledge Optimization of Open-Source LLMs via Reinforcement Learning and Supervised Fine-Tuning | arxiv-2604.06385 | Sparse Blocked context only | Apr 7, 2026
- STDec: Spatio-Temporal Stability Guided Decoding for dLLMs | arxiv-2604.06330 | Sparse Blocked context only | Apr 7, 2026
- Paper Circle: An Open-source Multi-agent Research Discovery and Analysis Framework | arxiv-2604.06170 | Sparse Blocked context only | Apr 7, 2026
- Toward Consistent World Models with Multi-Token Prediction and Latent Semantic Enhancement | arxiv-2604.06155 | Sparse Blocked context only | Apr 7, 2026
- Social Dynamics as Critical Vulnerabilities that Undermine Objective Decision-Making in LLM Collectives | arxiv-2604.06091 | Sparse Blocked context only | Apr 7, 2026
- Stories of Your Life as Others: A Round-Trip Evaluation of LLM-Generated Life Stories Conditioned on Rich Psychometric Profiles | arxiv-2604.06071 | Sparse Blocked context only | Apr 7, 2026
- Short Data, Long Context: Distilling Positional Knowledge in Transformers | arxiv-2604.06070 | Sparse Blocked context only | Apr 7, 2026
- From Hallucination to Structure Snowballing: The Alignment Tax of Constrained Decoding in LLM Reflection | arxiv-2604.06066 | Sparse Blocked context only | Apr 7, 2026
- BiMind: A Dual-Head Reasoning Model with Attention-Geometry Adapter for Incorrect Information Detection | arxiv-2604.06022 | Sparse Blocked context only | Apr 7, 2026
- Is CLIP Cross-Eyed? Revealing and Mitigating Center Bias in the CLIP Family | arxiv-2604.05971 | Sparse Blocked context only | Apr 7, 2026
- FinReporting: An Agentic Workflow for Localized Reporting of Cross-Jurisdiction Financial Disclosures | arxiv-2604.05966 | Direct Blocked context only | Apr 7, 2026
- "I See What You Did There": Can Large Vision-Language Models Understand Multimodal Puns? | arxiv-2604.05930 | Sparse Blocked context only | Apr 7, 2026
- Mechanistic Circuit-Based Knowledge Editing in Large Language Models | arxiv-2604.05876 | Sparse Blocked context only | Apr 7, 2026
- Evaluating Learner Representations for Differentiation Prior to Instructional Outcomes | arxiv-2604.05848 | Sparse Blocked context only | Apr 7, 2026
- AgentGL: Towards Agentic Graph Learning with LLMs via Reinforcement Learning | arxiv-2604.05846 | Sparse Blocked context only | Apr 7, 2026
- WikiSeeker: Rethinking the Role of Vision-Language Models in Knowledge-Based Visual Question Answering | arxiv-2604.05818 | Sparse Blocked context only | Apr 7, 2026
- Measuring What Matters!! Assessing Therapeutic Principles in Mental-Health Conversation | arxiv-2604.05795 | Sparse Blocked context only | Apr 7, 2026
- What Models Know, How Well They Know It: Knowledge-Weighted Fine-Tuning for Learning When to Say "I Don't Know" | arxiv-2604.05779 | Sparse Blocked context only | Apr 7, 2026
- Beyond the Beep: Scalable Collision Anticipation and Real-Time Explainability with BADAS-2.0 | arxiv-2604.05767 | Sparse Blocked context only | Apr 7, 2026
- Identifying Influential N-grams in Confidence Calibration via Regression Analysis | arxiv-2604.05757 | Sparse Blocked context only | Apr 7, 2026
- Controlling Distributional Bias in Multi-Round LLM Generation via KL-Optimized Fine-Tuning | arxiv-2604.05756 | Sparse Blocked context only | Apr 7, 2026
- Can Large Language Models Reinvent Foundational Algorithms? | arxiv-2604.05716 | Sparse Blocked context only | Apr 7, 2026
- Attention Editing: A Versatile Framework for Cross-Architecture Attention Conversion | arxiv-2604.05688 | Sparse Blocked context only | Apr 7, 2026
- LLM Reasoning as Trajectories: Step-Specific Representation Geometry and Correctness Signals | arxiv-2604.05655 | Sparse Blocked context only | Apr 7, 2026
- See the Forest for the Trees: Loosely Speculative Decoding via Visual-Semantic Guidance for Efficient Inference of Video LLMs | arxiv-2604.05650 | Sparse Blocked context only | Apr 7, 2026
- Label Effects: Shared Heuristic Reliance in Trust Assessment by Humans and LLM-as-a-Judge | arxiv-2604.05593 | Sparse Blocked context only | Apr 7, 2026
- Weakly Supervised Distillation of Hallucination Signals into Transformer Representations | arxiv-2604.06277 | Sparse Blocked context only | Apr 7, 2026
- Spec Kit Agents: Context-Grounded Agentic Workflows | arxiv-2604.05278 | Sparse Blocked context only | Apr 7, 2026
- Just Pass Twice: Efficient Token Classification with LLMs for Zero-Shot NER | arxiv-2604.05158 | Sparse Blocked context only | Apr 6, 2026
- TriAttention: Efficient Long Reasoning with Trigonometric KV Compression | arxiv-2604.04921 | Sparse Blocked context only | Apr 6, 2026
- Vero: An Open RL Recipe for General Visual Reasoning | arxiv-2604.04917 | Sparse Blocked context only | Apr 6, 2026
- HI-MoE: Hierarchical Instance-Conditioned Mixture-of-Experts for Object Detection | arxiv-2604.04908 | Sparse Blocked context only | Apr 6, 2026
- Rethinking Exploration in RLVR: From Entropy Regularization to Refinement via Bidirectional Entropy Modulation | arxiv-2604.04894 | Sparse Blocked context only | Apr 6, 2026
- Synthetic Sandbox for Training Machine Learning Engineering Agents | arxiv-2604.04872 | Sparse Blocked context only | Apr 6, 2026
- Optimizing LLM Prompt Engineering with DSPy Based Declarative Learning | arxiv-2604.04869 | Sparse Blocked context only | Apr 6, 2026
- MemMachine: A Ground-Truth-Preserving Memory System for Personalized AI Agents | arxiv-2604.04853 | Sparse Blocked context only | Apr 6, 2026
- Full-Duplex-Bench-v3: Benchmarking Tool Use for Full-Duplex Voice Agents Under Real-World Disfluency | arxiv-2604.04847 | Sparse Blocked context only | Apr 6, 2026
- InfBaGel: Human-Object-Scene Interaction Generation with Dynamic Perception and Iterative Refinement | arxiv-2604.04843 | Sparse Blocked context only | Apr 6, 2026
- Do No Harm: Exposing Hidden Vulnerabilities of LLMs via Persona-based Client Simulation Attack in Psychological Counseling | arxiv-2604.04842 | Sparse Blocked context only | Apr 6, 2026
- MERIT: Multilingual Expert-Reward Informed Tuning for Chinese-Centric Low-Resource Machine Translation | arxiv-2604.04839 | Sparse Blocked context only | Apr 6, 2026
- CreativityBench: Evaluating Agent Creative Reasoning via Affordance-Based Tool Repurposing | arxiv-2605.02910 | Sparse Blocked context only | Apr 6, 2026
- Plausibility as Commonsense Reasoning: Humans Succeed, Large Language Models Do not | arxiv-2604.04825 | Sparse Blocked context only | Apr 6, 2026
- ANX: Protocol-First Design for AI Agent Interaction with a Supporting 3EX Decoupled Architecture | arxiv-2604.04820 | Sparse Blocked context only | Apr 6, 2026
- LiveFact: A Dynamic, Time-Aware Benchmark for LLM-Driven Fake News Detection | arxiv-2604.04815 | Sparse Blocked context only | Apr 6, 2026
- SkillX: Automatically Constructing Skill Knowledge Bases for Agents | arxiv-2604.04804 | Sparse Blocked context only | Apr 6, 2026
- How Far Are We? Systematic Evaluation of LLMs vs. Human Experts in Mathematical Contest in Modeling | arxiv-2604.04791 | Sparse Blocked context only | Apr 6, 2026
- Cog-DRIFT: Exploration on Adaptively Reformulated Instances Enables Learning from Hard Reasoning Problems | arxiv-2604.04767 | Sparse Blocked context only | Apr 6, 2026
- Your Agent, Their Asset: A Real-World Safety Analysis of OpenClaw | arxiv-2604.04759 | Sparse Blocked context only | Apr 6, 2026
- AI Trust OS -- A Continuous Governance Framework for Autonomous AI Observability and Zero-Trust Compliance in Enterprise Environments | arxiv-2604.04749 | Sparse Blocked context only | Apr 6, 2026
- Lighting Up or Dimming Down? Exploring Dark Patterns of LLMs in Co-Creativity | arxiv-2604.04735 | Sparse Blocked context only | Apr 6, 2026
- Discovering Failure Modes in Vision-Language Models using RL | arxiv-2604.04733 | Sparse Blocked context only | Apr 6, 2026
- Metaphors We Compute By: A Computational Audit of Cultural Translation vs. Thinking in LLMs | arxiv-2604.04732 | Sparse Blocked context only | Apr 6, 2026
- Individual and Combined Effects of English as a Second Language and Typos on LLM Performance | arxiv-2604.04723 | Sparse Blocked context only | Apr 6, 2026
- Is a Picture Worth a Thousand Words? Adaptive Multimodal Fact-Checking with Visual Evidence Necessity | arxiv-2604.04692 | Sparse Blocked context only | Apr 6, 2026
- ROSClaw: A Hierarchical Semantic-Physical Framework for Heterogeneous Multi-Agent Collaboration | arxiv-2604.04664 | Direct Blocked context only | Apr 6, 2026
- Search, Do not Guess: Teaching Small Language Models to Be Effective Search Agents | arxiv-2604.04651 | Sparse Blocked context only | Apr 6, 2026
- SuperLocalMemory V3.3: The Living Brain -- Biologically-Inspired Forgetting, Cognitive Quantization, and Multi-Channel Retrieval for Zero-LLM Agent Memory Systems | arxiv-2604.04514 | Direct Blocked context only | Apr 6, 2026
- One Model for All: Multi-Objective Controllable Language Models | arxiv-2604.04497 | Sparse Blocked context only | Apr 6, 2026
- A Patch-based Cross-view Regularized Framework for Backdoor Defense in Multimodal Large Language Models | arxiv-2604.04488 | Sparse Blocked context only | Apr 6, 2026
- Grid2Matrix: Revealing Digital Agnosia in Vision-Language Models | arxiv-2604.09687 | Sparse Blocked context only | Apr 6, 2026
- Explainable Autonomous Cyber Defense using Adversarial Multi-Agent Reinforcement Learning | arxiv-2604.04442 | Sparse Blocked context only | Apr 6, 2026
- STEER: Structured Event Evidence for Video Reasoning via Multi-Objective Reinforcement Learning | arxiv-2604.04415 | Sparse Blocked context only | Apr 6, 2026
- How Alignment Routes: Localizing, Scaling, and Controlling Policy Circuits in Language Models | arxiv-2604.04385 | Sparse Blocked context only | Apr 6, 2026
- REAM: Merging Improves Pruning of Experts in LLMs | arxiv-2604.04356 | Sparse Blocked context only | Apr 6, 2026
- Self-Distilled RLVR | arxiv-2604.03128 | Sparse Blocked context only | Apr 3, 2026
- SkVM: Compiling Skills for Efficient Execution Everywhere | arxiv-2604.03088 | Direct Blocked context only | Apr 3, 2026
- JoyAI-LLM Flash: Advancing Mid-Scale LLMs with Token Efficiency | arxiv-2604.03044 | Curated Related Blocked context only | Apr 3, 2026
- OmniGUI: Benchmarking GUI Agents in Omni-Modal Smartphone Environments | arxiv-2605.18758 | Sparse Blocked context only | Apr 3, 2026
- Mitigating LLM biases toward spurious social contexts using direct preference optimization | arxiv-2604.02585 | Sparse Blocked context only | Apr 2, 2026
- ActionParty: Multi-Subject Action Binding in Generative Video Games | arxiv-2604.02330 | Sparse Blocked context only | Apr 2, 2026
- Steerable Visual Representations | arxiv-2604.02327 | Sparse Blocked context only | Apr 2, 2026
- Batched Contextual Reinforcement: A Task-Scaling Law for Efficient Reasoning | arxiv-2604.02322 | Sparse Blocked context only | Apr 2, 2026
- Beyond the Assistant Turn: User Turn Generation as a Probe of Interaction Awareness in Language Models | arxiv-2604.02315 | Sparse Blocked context only | Apr 2, 2026
- VOID: Video Object and Interaction Deletion | arxiv-2604.02296 | Sparse Blocked context only | Apr 2, 2026
- Omni123: Exploring 3D Native Foundation Models with Limited 3D Data by Unifying Text to 2D and 3D Generation | arxiv-2604.02289 | Sparse Blocked context only | Apr 2, 2026
- Unifying Group-Relative and Self-Distillation Policy Optimization via Sample Routing | arxiv-2604.02288 | Sparse Blocked context only | Apr 2, 2026
- Novel Memory Forgetting Techniques for Autonomous AI Agents: Balancing Relevance and Efficiency | arxiv-2604.02280 | Sparse Blocked context only | Apr 2, 2026
- Crystalite: A Lightweight Transformer for Efficient Crystal Modeling | arxiv-2604.02270 | Sparse Blocked context only | Apr 2, 2026
- Answering the Wrong Question: Reasoning Trace Inversion for Abstention in LLMs | arxiv-2604.02230 | Sparse Blocked context only | Apr 2, 2026
- When to ASK: Uncertainty-Gated Language Assistance for Reinforcement Learning | arxiv-2604.02226 | Sparse Blocked context only | Apr 2, 2026
- Impact of Multimodal and Conversational AI on Learning Outcomes and Experience | arxiv-2604.02221 | Sparse Blocked context only | Apr 2, 2026
- Towards Position-Robust Talent Recommendation via Large Language Models | arxiv-2604.02200 | Sparse Blocked context only | Apr 2, 2026
- Neuro-RIT: Neuron-Guided Instruction Tuning for Robust Retrieval-Augmented Language Model | arxiv-2604.02194 | Sparse Blocked context only | Apr 2, 2026
- The Expert Strikes Back: Interpreting Mixture-of-Experts Language Models at Expert Level | arxiv-2604.02178 | Sparse Blocked context only | Apr 2, 2026
- Adam's Law: Textual Frequency Law on Large Language Models | arxiv-2604.02176 | Sparse Blocked context only | Apr 2, 2026
- Quantifying Self-Preservation Bias in Large Language Models | arxiv-2604.02174 | Sparse Blocked context only | Apr 2, 2026
- MTI: A Behavior-Based Temperament Profiling System for AI Agents | arxiv-2604.02145 | Sparse Blocked context only | Apr 2, 2026
- LLM-as-a-Judge for Time Series Explanations | arxiv-2604.02118 | Sparse Blocked context only | Apr 2, 2026
- Reliable Control-Point Selection for Steering Reasoning in Large Language Models | arxiv-2604.02113 | Sparse Blocked context only | Apr 2, 2026
- Optimizing RAG Rerankers with LLM Feedback via Reinforcement Learning | arxiv-2604.02091 | Sparse Blocked context only | Apr 2, 2026
- Diff-KD: Diffusion-based Knowledge Distillation for Collaborative Perception under Corruptions | arxiv-2604.02061 | Sparse Blocked context only | Apr 2, 2026
- ATBench: A Diverse and Realistic Trajectory Benchmark for Long-Horizon Agent Safety | arxiv-2604.02022 | Direct Blocked context only | Apr 2, 2026
- $k$NNProxy: Efficient Training-Free Proxy Alignment for Black-Box Zero-Shot LLM-Generated Text Detection | arxiv-2604.02008 | Sparse Blocked context only | Apr 2, 2026
- ProCeedRL: Process Critic with Exploratory Demonstration Reinforcement Learning for LLM Agentic Reasoning | arxiv-2604.02006 | Sparse Blocked context only | Apr 2, 2026
- SAFE: Stepwise Atomic Feedback for Error correction in Multi-hop Reasoning | arxiv-2604.01993 | Sparse Blocked context only | Apr 2, 2026
- RuleForge: Automated Generation and Validation for Web Vulnerability Detection at Scale | arxiv-2604.01977 | Sparse Blocked context only | Apr 2, 2026
- Ego-Grounding for Personalized Question-Answering in Egocentric Videos | arxiv-2604.01966 | Sparse Blocked context only | Apr 2, 2026
- Lifting Unlabeled Internet-level Data for 3D Scene Understanding | arxiv-2604.01907 | Sparse Blocked context only | Apr 2, 2026
- PLOT: Enhancing Preference Learning via Optimal Transport | arxiv-2604.01837 | Sparse Blocked context only | Apr 2, 2026
- DEFT: Distribution-guided Efficient Fine-Tuning for Human Alignment | arxiv-2604.01787 | Sparse Blocked context only | Apr 2, 2026
- DriveDreamer-Policy: A Geometry-Grounded World-Action Model for Unified Generation and Planning | arxiv-2604.01765 | Sparse Blocked context only | Apr 2, 2026
- FourierMoE: Fourier Mixture-of-Experts Adaptation of Large Language Models | arxiv-2604.01762 | Sparse Blocked context only | Apr 2, 2026
- LiveMathematicianBench: A Live Benchmark for Mathematician-Level Reasoning with Proof Sketches | arxiv-2604.01754 | Sparse Blocked context only | Apr 2, 2026
- LiteInception: A Lightweight and Interpretable Deep Learning Framework for General Aviation Fault Diagnosis | arxiv-2604.01725 | Sparse Blocked context only | Apr 2, 2026
- Human-Guided Reasoning with Large Language Models for Vietnamese Speech Emotion Recognition | arxiv-2604.01711 | Sparse Blocked context only | Apr 2, 2026
- Memory in the LLM Era: Modular Architectures and Strategies in a Unified Framework | arxiv-2604.01707 | Sparse Blocked context only | Apr 2, 2026
- Fragile Reasoning: A Mechanistic Analysis of LLM Sensitivity to Meaning-Preserving Perturbations | arxiv-2604.01639 | Sparse Blocked context only | Apr 2, 2026
- OSCAR: Orchestrated Self-verification and Cross-path Refinement | arxiv-2604.01624 | Sparse Blocked context only | Apr 2, 2026
- Expert-Choice Routing Enables Adaptive Computation in Diffusion Language Models | arxiv-2604.01622 | Sparse Blocked context only | Apr 2, 2026
- DeltaMem: Towards Agentic Memory Management via Reinforcement Learning | arxiv-2604.01560 | Sparse Blocked context only | Apr 2, 2026
- Read More, Think More: Revisiting Observation Reduction for Web Agents | arxiv-2604.01535 | Sparse Blocked context only | Apr 2, 2026
- Magic, Madness, Heaven, Sin: LLM Output Diversity is Everything, Everywhere, All at Once | arxiv-2604.01504 | Sparse Blocked context only | Apr 2, 2026
- From SWE-ZERO to SWE-HERO: Execution-free to Execution-based Fine-tuning for Software Engineering Agents | arxiv-2604.01496 | Sparse Blocked context only | Apr 2, 2026
- AgentSocialBench: Evaluating Privacy Risks in Human-Centered Agentic Social Networks | arxiv-2604.01487 | Sparse Blocked context only | Apr 1, 2026
- When Reward Hacking Rebounds: Understanding and Mitigating It with Representation-Level Signals | arxiv-2604.01476 | Sparse Blocked context only | Apr 1, 2026
- Adaptive Stopping for Multi-Turn LLM Reasoning | arxiv-2604.01413 | Sparse Blocked context only | Apr 1, 2026
- Procedural Knowledge at Scale Improves Reasoning | arxiv-2604.01348 | Sparse Blocked context only | Apr 1, 2026
- Preference learning in shades of gray: Interpretable and bias-aware reward modeling for human preferences | arxiv-2604.01312 | Sparse Blocked context only | Apr 1, 2026
- M2-Verify: A Large-Scale Multidomain Benchmark for Checking Multimodal Claim Consistency | arxiv-2604.01306 | Sparse Blocked context only | Apr 1, 2026
- Scaling Reasoning Tokens via RL and Parallel Thinking: Evidence From Competitive Programming | arxiv-2604.01302 | Sparse Blocked context only | Apr 1, 2026
- HippoCamp: Benchmarking Contextual Agents on Personal Computers | arxiv-2604.01221 | Sparse Blocked context only | Apr 1, 2026
- Universal YOCO for Efficient Depth Scaling | arxiv-2604.01220 | Sparse Blocked context only | Apr 1, 2026
- $\texttt{YC-Bench}$: Benchmarking AI Agents for Long-Term Planning and Consistent Execution | arxiv-2604.01212 | Sparse Blocked context only | Apr 1, 2026
- LLM REgression with a Latent Iterative State Head | arxiv-2604.01206 | Sparse Blocked context only | Apr 1, 2026
- Embarrassingly Simple Self-Distillation Improves Code Generation | arxiv-2604.01193 | Sparse Blocked context only | Apr 1, 2026
- True (VIS) Lies: Analyzing How Generative AI Recognizes Intentionality, Rhetoric, and Misleadingness in Visualization Lies | arxiv-2604.01181 | Sparse Blocked context only | Apr 1, 2026
- Online Reasoning Calibration: Test-Time Training Enables Generalizable Conformal LLM Reasoning | arxiv-2604.01170 | Sparse Blocked context only | Apr 1, 2026
- Brainstacks: Cross-Domain Cognitive Capabilities via Frozen MoE-LoRA Stacks for Continual LLM Learning | arxiv-2604.01152 | Sparse Blocked context only | Apr 1, 2026
- CARE: Privacy-Compliant Agentic Reasoning with Evidence Discordance | arxiv-2604.01113 | Sparse Blocked context only | Apr 1, 2026
- Uncertainty-Aware Variational Reward Factorization via Probabilistic Preference Bases for LLM Personalization | arxiv-2604.00997 | Sparse Blocked context only | Apr 1, 2026
- Multimodal Analysis of State-Funded News Coverage of the Israel-Hamas War on YouTube Shorts | arxiv-2604.00994 | Sparse Blocked context only | Apr 1, 2026
- Dual Optimal: Make Your LLM Peer-like with Dignity | arxiv-2604.00979 | Sparse Blocked context only | Apr 1, 2026
- Benchmarking and Mechanistic Analysis of Vision-Language Models for Cross-Depiction Assembly Instruction Alignment | arxiv-2604.00913 | Sparse Blocked context only | Apr 1, 2026
- Beyond Symbolic Solving: Multi Chain-of-Thought Voting for Geometric Reasoning in Large Language Models | arxiv-2604.00890 | Sparse Blocked context only | Apr 1, 2026
- PixelPrune: Pixel-Level Adaptive Visual Token Reduction via Predictive Coding | arxiv-2604.00886 | Sparse Blocked context only | Apr 1, 2026
- KUET at StanceNakba Shared Task: StanceMoE: Mixture-of-Experts Architecture for Stance Detection | arxiv-2604.00878 | Sparse Blocked context only | Apr 1, 2026
- Agentic Tool Use in Large Language Models | arxiv-2604.00835 | Curated Related Blocked context only | Apr 1, 2026
- Multimodal Language Models Cannot Spot Spatial Inconsistencies | arxiv-2604.00799 | Sparse Blocked context only | Apr 1, 2026
- LangMARL: Natural Language Multi-Agent Reinforcement Learning | arxiv-2604.00722 | Sparse Blocked context only | Apr 1, 2026
- TRIMS: Trajectory-Ranked Instruction Masked Supervision for Diffusion Language Models | arxiv-2604.00666 | Sparse Blocked context only | Apr 1, 2026
- A Survey of On-Policy Distillation for Large Language Models | arxiv-2604.00626 | Sparse Blocked context only | Apr 1, 2026
- Speech LLMs are Contextual Reasoning Transcribers | arxiv-2604.00610 | Sparse Blocked context only | Apr 1, 2026
- More Human, More Efficient: Aligning Annotations with Quantized SLMs | arxiv-2604.00586 | Sparse Blocked context only | Apr 1, 2026
- Ontology-Constrained Neural Reasoning in Enterprise Agentic Systems: A Neurosymbolic Architecture for Domain-Grounded AI Agents | arxiv-2604.00555 | Sparse Blocked context only | Apr 1, 2026
- MOON3.0: Reasoning-aware Multimodal Representation Learning for E-commerce Product Understanding | arxiv-2604.00513 | Sparse Blocked context only | Apr 1, 2026
- Adapting Text LLMs to Speech via Multimodal Depth Up-Scaling | arxiv-2604.00489 | Sparse Blocked context only | Apr 1, 2026
- Execution-Verified Reinforcement Learning for Optimization Modeling | arxiv-2604.00442 | Sparse Blocked context only | Apr 1, 2026
- TR-ICRL: Test-Time Rethinking for In-Context Reinforcement Learning | arxiv-2604.00438 | Sparse Blocked context only | Apr 1, 2026
- Locally Confident, Globally Stuck: The Quality-Exploration Dilemma in Diffusion Language Models | arxiv-2604.00375 | Sparse Blocked context only | Apr 1, 2026
- Signals: Trajectory Sampling and Triage for Agentic Interactions | arxiv-2604.00356 | Sparse Blocked context only | Apr 1, 2026
- Agent Q-Mix: Selecting the Right Action for LLM Multi-Agent Systems through Reinforcement Learning | arxiv-2604.00344 | Sparse Blocked context only | Apr 1, 2026
- Large Language Models in the Abuse Detection Pipeline | arxiv-2604.00323 | Sparse Blocked context only | Mar 31, 2026
- Can Large Language Models Self-Correct in Medical Question Answering? An Exploratory Study | arxiv-2604.00261 | Curated Related Blocked context only | Mar 31, 2026
- A Taxonomy of Programming Languages for Code Generation | arxiv-2604.00239 | Sparse Blocked context only | Mar 31, 2026
- Do LLMs Know What Is Private Internally? Probing and Steering Contextual Privacy Norms in Large Language Model Representations | arxiv-2604.00209 | Sparse Blocked context only | Mar 31, 2026
- Hierarchical Chain-of-Thought Prompting: Enhancing LLM Reasoning Performance and Efficiency | arxiv-2604.00130 | Sparse Blocked context only | Mar 31, 2026
- Hierarchical Pre-Training of Vision Encoders with Large Language Models | arxiv-2604.00086 | Sparse Blocked context only | Mar 31, 2026
- One Panel Does Not Fit All: Case-Adaptive Multi-Agent Deliberation for Clinical Prediction | arxiv-2604.00085 | Sparse Blocked context only | Mar 31, 2026
- Cognitive Friction: A Decision-Theoretic Framework for Bounded Deliberation in Tool-Using Agents | arxiv-2603.30031 | Sparse Blocked context only | Mar 31, 2026
- Less Is More? Selective Visual Attention to High-Importance Regions for Multimodal Radiology Summarization | arxiv-2603.29901 | Sparse Blocked context only | Mar 31, 2026
- Training-Free Dynamic Upcycling of Expert Language Models | arxiv-2603.29765 | Sparse Blocked context only | Mar 31, 2026
- A Comprehensive Information-Decomposition Analysis of Large Vision-Language Models | arxiv-2603.29676 | Sparse Blocked context only | Mar 31, 2026
- Agenda-based Narrative Extraction: Steering Pathfinding Algorithms with Large Language Models | arxiv-2603.29661 | Sparse Blocked context only | Mar 31, 2026
- Learning Diagnostic Reasoning for Decision Support in Toxicology | arxiv-2603.29608 | Sparse Blocked context only | Mar 31, 2026
- LLM Probe: Evaluating LLMs for Low-Resource Languages | arxiv-2603.29517 | Sparse Blocked context only | Mar 31, 2026
- MemFactory: Unified Inference & Training Framework for Agent Memory | arxiv-2603.29493 | Sparse Blocked context only | Mar 31, 2026
- M-MiniGPT4: Multilingual VLLM Alignment via Translated Data | arxiv-2603.29467 | Sparse Blocked context only | Mar 31, 2026
- ELT-Bench-Verified: Benchmark Quality Issues Underestimate AI Agent Capabilities | arxiv-2603.29399 | Sparse Blocked context only | Mar 31, 2026
- Beyond Idealized Patients: Evaluating LLMs under Challenging Patient Behaviors in Medical Consultations | arxiv-2603.29373 | Sparse Blocked context only | Mar 31, 2026
- Aligning Multimodal Sequential Recommendations via Robust Direct Preference Optimization with Sparse MoE | arxiv-2603.29259 | Sparse Blocked context only | Mar 31, 2026
- MemRerank: Preference Memory for Personalized Product Reranking | arxiv-2603.29247 | Sparse Blocked context only | Mar 31, 2026
- Long-Document QA with Chain-of-Structured-Thought and Fine-Tuned SLMs | arxiv-2603.29232 | Sparse Blocked context only | Mar 31, 2026
- SyriSign: A Parallel Corpus for Arabic Text to Syrian Arabic Sign Language Translation | arxiv-2603.29219 | Sparse Blocked context only | Mar 31, 2026
- Xuanwu: Evolving General Multimodal Models into an Industrial-Grade Foundation for Content Ecosystems | arxiv-2603.29211 | Sparse Blocked context only | Mar 31, 2026
- Dual Perspectives in Emotion Attribution: A Generator-Interpreter Framework for Cross-Cultural Analysis of Emotion in LLMs | arxiv-2603.29077 | Sparse Blocked context only | Mar 30, 2026
- Trojan-Speak: Bypassing Constitutional Classifiers with No Jailbreak Tax via Adversarial Finetuning | arxiv-2603.29038 | Sparse Blocked context only | Mar 30, 2026
- The Model Says Walk: How Surface Heuristics Override Implicit Constraints in LLM Reasoning | arxiv-2603.29025 | Sparse Blocked context only | Mar 30, 2026
- Human-Like Lifelong Memory: A Neuroscience-Grounded Architecture for Infinite Interaction | arxiv-2603.29023 | Sparse Blocked context only | Mar 30, 2026
- SOLE-R1: Video-Language Reasoning as the Sole Reward for On-Robot Reinforcement Learning | arxiv-2603.28730 | Sparse Blocked context only | Mar 30, 2026
- Seeing with You: Perception-Reasoning Coevolution for Multimodal Reasoning | arxiv-2603.28618 | Sparse Blocked context only | Mar 30, 2026
- ResAdapt: Adaptive Resolution for Efficient Multimodal Reasoning | arxiv-2603.28610 | Sparse Blocked context only | Mar 30, 2026
- Moving Beyond Review: Applying Language Models to Planning and Translation in Reflection | arxiv-2603.28596 | Sparse Blocked context only | Mar 30, 2026
- Courtroom-Style Multi-Agent Debate with Progressive RAG and Role-Switching for Controversial Claim Verification | arxiv-2603.28488 | Sparse Blocked context only | Mar 30, 2026
- Kernel-Smith: A Unified Recipe for Evolutionary Kernel Optimization | arxiv-2603.28342 | Sparse Blocked context only | Mar 30, 2026
- DongYuan: An LLM-Based Framework for Integrative Chinese and Western Medicine Spleen-Stomach Disorders Diagnosis | arxiv-2603.28191 | Sparse Blocked context only | Mar 30, 2026
- MOSS-VoiceGenerator: Create Realistic Voices with Natural Language Descriptions | arxiv-2603.28086 | Sparse Blocked context only | Mar 30, 2026
- Efficient Inference of Large Vision Language Models | arxiv-2603.27960 | Sparse Blocked context only | Mar 30, 2026
- Top-down string-to-dependency Neural Machine Translation | arxiv-2603.27938 | Sparse Blocked context only | Mar 30, 2026
- ATLAS-RTC: Closing the Loop on LLM Agent Output with Token-Level Runtime Control | arxiv-2603.27905 | Sparse Blocked context only | Mar 29, 2026
- EffiSkill: Agent Skill Based Automated Code Efficiency Optimization | arxiv-2603.27850 | Sparse Blocked context only | Mar 29, 2026
- Model Capability Dominates: Inference-Time Optimization Lessons from AIMO 3 | arxiv-2603.27844 | Sparse Blocked context only | Mar 29, 2026
- Q-Bridge: Code Translation for Quantum Machine Learning via LLMs | arxiv-2603.27836 | Sparse Blocked context only | Mar 29, 2026
- Improving Clinical Diagnosis with Counterfactual Multi-Agent Reasoning | arxiv-2603.27820 | Sparse Blocked context only | Mar 29, 2026
- KVSculpt: KV Cache Compression as Distillation | arxiv-2603.27819 | Sparse Blocked context only | Mar 29, 2026
- Conversational Agents and the Understanding of Human Language: Reflections on AI, LLMs, and Cognitive Science | arxiv-2603.27809 | Sparse Blocked context only | Mar 29, 2026
- Understanding Teacher Revisions of Large Language Model-Generated Feedback | arxiv-2603.27806 | Sparse Blocked context only | Mar 29, 2026
- Emergent Social Intelligence Risks in Generative Multi-Agent Systems | arxiv-2603.27771 | Sparse Blocked context only | Mar 29, 2026
- KAT-Coder-V2 Technical Reportarxiv-2603.27703 Sparse Blocked context onlyMar 29, 2026
- Can Large Language Models Simulate Human Cognition Beyond Behavioral Imitation?arxiv-2603.27694 Sparse Blocked context onlyMar 29, 2026
- PRBench: End-to-end Paper Reproduction in Physics Researcharxiv-2603.27646 Sparse Blocked context onlyMar 29, 2026
- LongCat-Next: Lexicalizing Modalities as Discrete Tokensarxiv-2603.27538 Direct Blocked context onlyMar 29, 2026
- Hidden Ads: Behavior Triggered Semantic Backdoors for Advertisement Injection in Vision Language Modelsarxiv-2603.27522 Sparse Blocked context onlyMar 29, 2026
- AgentSwing: Adaptive Parallel Context Management Routing for Long-Horizon Web Agentsarxiv-2603.27490 Sparse Blocked context onlyMar 29, 2026
- Multi-Agent Dialectical Refinement for Enhanced Argument Classificationarxiv-2603.27451 Sparse Blocked context onlyMar 29, 2026
- Improving Attributed Long-form Question Answering with Intent Awarenessarxiv-2603.27435 Sparse Blocked context onlyMar 28, 2026
- The Geometry of Harmful Intent: Training-Free Anomaly Detection via Angular Deviation in LLM Residual Streamsarxiv-2603.27412 Sparse Blocked context onlyMar 28, 2026
- Heterogeneous Debate Engine: Identity-Grounded Cognitive Architecture for Resilient LLM-Based Ethical Tutoringarxiv-2603.27404 Sparse Blocked context onlyMar 28, 2026
- PubMed Reasoner: Dynamic Reasoning-based Retrieval for Evidence-Grounded Biomedical Question Answeringarxiv-2603.27335 Sparse Blocked context onlyMar 28, 2026
- SACRED: A Faithful Annotated Multimedia Multimodal Multilingual Dataset for Classifying Connectedness Types in Online Spiritualityarxiv-2603.27331 Sparse Blocked context onlyMar 28, 2026
- Mitigating Hallucination on Hallucination in RAG via Ensemble Votingarxiv-2603.27253 Sparse Blocked context onlyMar 28, 2026
- Rethinking Easy-to-Hard: Limits of Curriculum Learning in Post-Training for Deductive Reasoningarxiv-2603.27226 Sparse Blocked context onlyMar 28, 2026
- daVinci-LLM: Towards the Science of Pretrainingarxiv-2603.27164 Curated Related Blocked context onlyMar 28, 2026
- Learning to Predict Future-Aligned Research Proposals with Language Modelsarxiv-2603.27146 Sparse Blocked context onlyMar 28, 2026
- ChartNet: A Million-Scale, High-Quality Multimodal Dataset for Robust Chart Understandingarxiv-2603.27064 Direct Blocked context onlyMar 28, 2026
- Debiasing Large Language Models toward Social Factors in Online Behavior Analytics through Prompt Knowledge Tuningarxiv-2603.27057 Sparse Blocked context onlyMar 28, 2026
- The Last Fingerprint: How Markdown Training Shapes LLM Prosearxiv-2603.27006 Sparse Blocked context onlyMar 27, 2026
- Learning to Commit: Generating Organic Pull Requests via Online Repository Memoryarxiv-2603.26664 Sparse Blocked context onlyMar 27, 2026
- PerceptionComp: A Video Benchmark for Complex Perception-Centric Reasoningarxiv-2603.26653 Sparse Blocked context onlyMar 27, 2026
- When Perplexity Lies: Generation-Focused Distillation of Hybrid Sequence Modelsarxiv-2603.26556 Sparse Blocked context onlyMar 27, 2026
- JAL-Turn: Joint Acoustic-Linguistic Modeling for Real-Time and Robust Turn-Taking Detection in Full-Duplex Spoken Dialogue Systemsarxiv-2603.26515 Sparse Blocked context onlyMar 27, 2026
- ClimateCheck 2026: Scientific Fact-Checking and Disinformation Narrative Classification of Climate-related Claimsarxiv-2603.26449 Sparse Blocked context onlyMar 27, 2026
- CALRK-Bench: Evaluating Context-Aware Legal Reasoning in Korean Lawarxiv-2603.26332 Sparse Blocked context onlyMar 27, 2026
- From Human Cognition to Neural Activations: Probing the Computational Primitives of Spatial Reasoning in LLMsarxiv-2603.26323 Sparse Blocked context onlyMar 27, 2026
- Xpertbench: Expert Level Tasks with Rubrics-Based Evaluationarxiv-2604.02368 Sparse Blocked context onlyMar 27, 2026
- Ask or Assume? Uncertainty-Aware Clarification-Seeking in Coding Agentsarxiv-2603.26233 Sparse Blocked context onlyMar 27, 2026
- DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Modelsarxiv-2603.26164 Direct Blocked context onlyMar 27, 2026
- Vega: Learning to Drive with Natural Language Instructionsarxiv-2603.25741 Sparse Blocked context onlyMar 26, 2026
- PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inferencearxiv-2603.25730 Sparse Blocked context onlyMar 26, 2026
- R-C2: Cycle-Consistent Reinforcement Learning Improves Multimodal Reasoningarxiv-2603.25720 Sparse Blocked context onlyMar 26, 2026
- Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Modelsarxiv-2603.25716 Direct Blocked context onlyMar 26, 2026
- S2D2: Fast Decoding for Diffusion LLMs via Training-Free Self-Speculationarxiv-2603.25702 Sparse Blocked context onlyMar 26, 2026
- Self-Improvement of Large Language Models: A Technical Overview and Future Outlookarxiv-2603.25681 Sparse Blocked context onlyMar 26, 2026
- Is Mathematical Problem-Solving Expertise in Large Language Models Associated with Assessment Performance?arxiv-2603.25633 Sparse Blocked context onlyMar 26, 2026
- PICon: A Multi-Turn Interrogation Framework for Evaluating Persona Agent Consistencyarxiv-2603.25620 Sparse Blocked context onlyMar 26, 2026
- Demographic Fairness in Multimodal LLMs: A Benchmark of Gender and Ethnicity Bias in Face Verificationarxiv-2603.25613 Sparse Blocked context onlyMar 26, 2026
- Revisiting On-Policy Distillation: Empirical Failure Modes and Simple Fixesarxiv-2603.25562 Sparse Blocked context onlyMar 26, 2026
- EcoThink: A Green Adaptive Inference Framework for Sustainable and Accessible Agentsarxiv-2603.25498 Sparse Blocked context onlyMar 26, 2026
- Cross-Model Disagreement as a Label-Free Correctness Signalarxiv-2603.25450 Sparse Blocked context onlyMar 26, 2026
- From Manipulation to Mistrust: Explaining Diverse Micro-Video Misinformation for Robust Debunking in the Wildarxiv-2603.25423 Sparse Blocked context onlyMar 26, 2026
- Navigating the Prompt Space: Improving LLM Classification of Social Science Texts Through Prompt Engineeringarxiv-2603.25422 Sparse Blocked context onlyMar 26, 2026
- TAPO: Translation Augmented Policy Optimization for Multilingual Mathematical Reasoningarxiv-2603.25419 Sparse Blocked context onlyMar 26, 2026
- Beyond Content Safety: Real-Time Monitoring for Reasoning Vulnerabilities in Large Language Modelsarxiv-2603.25412 Sparse Blocked context onlyMar 26, 2026
- System Design for Maintaining Internal State Consistency in Long-Horizon Robotic Tabletop Gamesarxiv-2603.25405 Sparse Blocked context onlyMar 26, 2026
- Large Language Model as Token Compressor and Decompressorarxiv-2603.25340 Sparse Blocked context onlyMar 26, 2026
- AD-CARE: A Guideline-grounded, Modality-agnostic LLM Agent for Real-world Alzheimer's Disease Diagnosis with Multi-cohort Assessment, Fairness Analysis, and Reader Studyarxiv-2603.25322 Sparse Blocked context onlyMar 26, 2026
- Separate Before You Compress: The WWHO Tokenization Architecturearxiv-2603.25309 Sparse Blocked context onlyMar 26, 2026
- SliderQuant: Accurate Post-Training Quantization for LLMsarxiv-2603.25284 Sparse Blocked context onlyMar 26, 2026
- CRAFT: Grounded Multi-Agent Coordination Under Partial Informationarxiv-2603.25268 Sparse Blocked context onlyMar 26, 2026
- MolQuest: A Benchmark for Agentic Evaluation of Abductive Reasoning in Chemical Structure Elucidationarxiv-2603.25253 Sparse Blocked context onlyMar 26, 2026
- Probabilistic Concept Graph Reasoning for Multimodal Misinformation Detectionarxiv-2603.25203 Sparse Blocked context onlyMar 26, 2026
- Photon: Speedup Volume Understanding with Efficient Multimodal Large Language Modelsarxiv-2603.25155 Sparse Blocked context onlyMar 26, 2026
- OMIND: Framework for Knowledge Grounded Finetuning and Multi-Turn Dialogue Benchmark for Mental Health LLMsarxiv-2603.25105 Sparse Blocked context onlyMar 26, 2026
- An Explainable Ensemble Learning Framework for Crop Classification with Optimized Feature Pyramids and Deep Networksarxiv-2603.25070 Sparse Blocked context onlyMar 26, 2026
- TopoPilot: Reliable Conversational Workflow Automation for Topological Data Analysis and Visualizationarxiv-2603.25063 Sparse Blocked context onlyMar 26, 2026
- Closing the Confidence-Faithfulness Gap in Large Language Modelsarxiv-2603.25052 Sparse Blocked context onlyMar 26, 2026