Featured Papers
Popular high-signal papers with direct links to full protocol pages.
- Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models
Apr 9, 2026 · Citations: 0
The advent of agentic multimodal models has empowered systems to actively interact with external environments.
- SIM1: Physics-Aligned Simulator as Zero-Shot Data Scaler in Deformable Worlds
Apr 9, 2026 · Citations: 0
Open the paper page for extracted protocol signals, benchmark mentions, and evaluation context.
- Seeing but Not Thinking: Routing Distraction in Multimodal Mixture-of-Experts
Apr 9, 2026 · Citations: 0
Experiments on three multimodal MoE models across six benchmarks demonstrate consistent improvements, with gains of up to 3.17% on complex visual reasoning tasks.
- RewardFlow: Generate Images by Optimizing What You Reward
Apr 9, 2026 · Citations: 0
- PSI: Shared State as the Missing Layer for Coherent AI-Generated Instruments in Personal AI Agents
Apr 9, 2026 · Citations: 0
- Ads in AI Chatbots? An Analysis of How Large Language Models Navigate Conflicts of Interest
Apr 9, 2026 · Citations: 0
Today's large language models (LLMs) are trained to align with user preferences through methods such as reinforcement learning.
- WebArbiter: A Principle-Guided Reasoning Process Reward Model for Web Agents
Jan 29, 2026 · Citations: 0
- Quantifying Explanation Consistency: The C-Score Metric for CAM-Based Explainability in Medical Image Classification
Apr 9, 2026 · Citations: 0
- The Detection-Extraction Gap: Models Know the Answer Before They Can Say It
Apr 8, 2026 · Citations: 0
Across five model configurations, two families, and three benchmarks, we find that 52–88% of chain-of-thought tokens are produced after the answer is recoverable from a partial prefix.
- SUPERNOVA: Eliciting General Reasoning in LLMs with Reinforcement Learning on Natural Instructions
Apr 9, 2026 · Citations: 0
- Faithful GRPO: Improving Visual Spatial Reasoning in Multimodal Language Models via Constrained Policy Optimization
Apr 9, 2026 · Citations: 0
- TTVS: Boosting Self-Exploring Reinforcement Learning via Test-time Variational Synthesis
Apr 9, 2026 · Citations: 0