Featured Papers
Popular high-signal papers with direct links to full protocol pages.
- Tokenisation via Convex Relaxations
May 21, 2026 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- Vector Policy Optimization: Training for Diversity Improves Test-Time Search
May 21, 2026 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- Evaluating Commercial AI Chatbots as News Intermediaries
May 21, 2026 · Citations: 0
We present a 14-day (February 9-22, 2026) evaluation of six AI chatbots (Gemini 3 Flash and Pro, Grok 4, Claude 4.5 Sonnet, GPT-5 and GPT-4o mini) on 2,100 factual questions derived from same-day BBC News reporting across six regional…
- Reducing Political Manipulation with Consistency Training
May 21, 2026 · Citations: 0
We show that PCT preserves overall helpfulness, substantially reduces covert political bias, and generalizes to held-out benchmarks.
- Understanding Data Temporality Impact on Large Language Models Pre-training
May 21, 2026 · Citations: 0
First, we introduce a comprehensive benchmark of over 7,000 temporally grounded questions and an evaluation protocol that enables analysis of whether models correctly associate facts with their corresponding time periods.
- ChronoMedKG: A Temporally-Grounded Biomedical Knowledge Graph and Benchmark for Clinical Reasoning
May 21, 2026 · Citations: 0
The graph is constructed through a disease-autonomous multi-agent pipeline in which multiple frontier LLMs independently extract knowledge from PubMed and PMC literature.
- Beyond Acoustic Emotion Recognition: Multimodal Pathos Analysis in Political Speech Using LLM-Based and Acoustic Emotion Models
May 21, 2026 · Citations: 0
We investigate whether acoustic emotion recognition models can serve as proxies for the Pathos dimension in political speech analysis, as operationalised by the TRUST multi-agent large language model (LLM) pipeline.
- AnyMo: Geometry-Aware Setup-Agnostic Modeling of Human Motion in the Wild
May 21, 2026 · Citations: 0
As wearable and mobile devices become increasingly embedded in daily life, they offer a practical way to continuously sense human motion in the wild.
- AMEL: Accumulated Message Effects on LLM Judgments
May 21, 2026 · Citations: 0
Across 75,898 API calls to 11 models from 4 providers (OpenAI, Anthropic, Google, and four open-source models), we present identical test items in isolation or following histories saturated with predominantly positive or negative…
- Tokenization with Split Trees
May 21, 2026 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- Self-Policy Distillation via Capability-Selective Subspace Projection
May 21, 2026 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- Moral Semantics Survive Machine Translation: Cross-Lingual Evidence from Moral Foundations Corpora
May 21, 2026 · Citations: 0
Using ~50k morally-annotated social media posts from a diverse range of topics, we apply a principled four-method validation pipeline: LaBSE cross-lingual embedding similarity, Centered Kernel Alignment (CKA), LLM-as-judge evaluation,…