Featured Papers
Popular high-signal papers with direct links to full protocol pages.
- LedgerAgent: Structured State for Policy-Adherent Tool-Calling Agents
Jun 18, 2026 · Citations: 0
Policy-adherent tool-calling agents in customer-service domains must maintain task states across turns while calling tools and obeying domain policies.
- StylisticBias: A Few Human Visual Cues Drive Most Social Biases in MLLMs
Jun 18, 2026 · Citations: 0
Multimodal large language models (MLLMs) are increasingly deployed in personally and societally consequential settings, yet the visual cues that shape how these models judge people remain poorly understood.
- Beyond Global Replanning: Hierarchical Recovery for Cross-Device Agent Systems
Jun 18, 2026 · Citations: 0
We propose H-RePlan, a hierarchical replanning framework for multi-device agents with unified API–CLI–GUI execution.
- Your Mouse and Eyes Secretly Leak Your Preference: LLM Alignment using Implicit Feedback from Users
Jun 18, 2026 · Citations: 0
To align a Large Language Model (LLM), most existing methods collect explicit human feedback and train a reward model to predict the human preference based on the response text.
- Scalable Training of Spatially Grounded 2D Vision-Language Models for Radiology
Jun 18, 2026 · Citations: 0
On external VQA benchmarks (Slake, VQA-RAD), RadGrounder achieves results competitive with specialized medical VLMs.
- CATCH-ME if you RAG: a dataset of Contextually Annotated multi-Turn Counterspeech against Hate and Misinformation Exchanges
Jun 18, 2026 · Citations: 0
While LLMs represent a scalable solution for assisting humans in the generation of counterspeech for both threats, zero-shot models frequently generate repetitive and vague responses, underscoring the need for high-quality examples to steer…
- Token-Operations-Oriented Inference Optimization Techniques for Large Models
Jun 18, 2026 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- PsyScore: A Psychometrically-Aware Framework for Trait-Adaptive Essay Scoring and ZPD-Scaffolded Feedback
Jun 18, 2026 · Citations: 0
PsyScore comprises three key modules: a Trait-Adaptive Neural IRT Scorer that incorporates the Graded Partial Credit Model (GPCM) into a neural architecture, enabling the precise estimation of student ability while maintaining psychometric…
- The Register Gap: A Meaning Intelligence Framework for Nigerian Public Discourse
Jun 18, 2026 · Citations: 0
We introduce the Meaning Intelligence Framework (MIF), a nine-dimension annotation and evaluation schema for Nigerian public discourse that separates surface sentiment from true communicative intent.
- Actionable Activation Directions for Detecting and Mitigating Emergent Misalignment Across Language Model Families
Jun 18, 2026 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- CzechDocs: A Multiway Parallel Dataset of Formatted Documents for Minority Languages in Czechia
Jun 18, 2026 · Citations: 0
The dataset is designed to support the evaluation of machine translation systems that aim to preserve document formatting during translation.
- Apparent Psychological Profiles of Large Language Models are Largely a Measurement Artifact
Jun 18, 2026 · Citations: 0
Psychological instruments designed for humans are increasingly used to assign large language models (LLMs) stable psychological profiles that affect their usability, safety assessment, and use as proxies for human participants in research.