- Spatio-Temporal Attention Enhanced Multi-Agent DRL for UAV-Assisted Wireless Networks with Limited Communications
Che Chen, Lanhua Li, Shimin Gong, Yu Zhao, Yuming Fang · Mar 23, 2026 · Citations: 0
Simulation Env General
To maximize the overall throughput, we first propose a delay-tolerant multi-agent deep reinforcement learning (MADRL) algorithm that integrates a delay-penalized reward to encourage information sharing among UAVs, while jointly optimizing…
- Self-Correcting VLA: Online Action Refinement via Sparse World Imagination
Chenyv Liu, Wentao Tan, Lei Zhu, Fengling Li, Jingjing Li · Feb 25, 2026 · Citations: 0
Simulation Env Coding
Reinforcement learning enhances physical grounding through exploration yet typically relies on external reward signals that remain isolated from the agent's internal states.
- JoyAI-LLM Flash: Advancing Mid-Scale LLMs with Token Efficiency
Aichen Cai, Anmeng Zhang, Anyu Li, Bo Zhang, Bohua Cai · Apr 3, 2026 · Citations: 0
General
JoyAI-LLM Flash is pretrained on a massive corpus of 20 trillion tokens and further optimized through a rigorous post-training pipeline, including supervised fine-tuning (SFT), Direct Preference Optimization (DPO), and large-scale…
- Weakly Supervised Distillation of Hallucination Signals into Transformer Representations
Shoaib Sadiq Salehmohamed, Jinal Prashant Thakkar, Hansika Aredla, Shaik Mohammed Omar, Shalmali Ayachit · Apr 7, 2026 · Citations: 0
LLM as Judge Automatic Metrics General
We introduce a weak supervision framework that combines three complementary grounding signals: substring matching, sentence embedding similarity, and an LLM as a judge verdict to label generated responses as grounded or hallucinated without…
- Luna-2: Scalable Single-Token Evaluation with Small Language Models
Vatsal Goel, Rishon Dsouza, Nikhil Ega, Amey Ramesh Rambatla, Rob Friel · Feb 20, 2026 · Citations: 0
LLM as Judge Automatic Metrics General
We present Luna-2, a novel architecture that leverages decoder-only small language models (SLMs) into a deterministic evaluation model to reliably compute complex task-specific LLMAJ metrics (e.g.…
- The Headless Firm: How AI Reshapes Enterprise Boundaries
Tassilo Klein, Sebastian Wieczorek · Feb 24, 2026 · Citations: 0
Automatic Metrics General
We argue that agentic AI induces a structural change in how coordination costs scale: in prior modular systems, integration cost grew with interaction topology (O(n^2) in the number of components); in protocol-mediated agentic systems, inte…
- Anatomy of Agentic Memory: Taxonomy and Empirical Analysis of Evaluation and System Limitations
Dongming Jiang, Yi Li, Songtao Wei, Jinxin Yang, Ayushi Kishore · Feb 22, 2026 · Citations: 0
Automatic Metrics General
Agentic memory systems enable large language model (LLM) agents to maintain state across long interactions, supporting long-horizon reasoning and personalization beyond fixed context windows.
- Scalable Neural Decoders for Practical Fault-Tolerant Quantum Computation
Andi Gu, J. Pablo Bonilla Ataides, Mikhail D. Lukin, Susanne F. Yelin · Apr 9, 2026 · Citations: 0
- Squeeze Evolve: Unified Multi-Model Orchestration for Verifier-Free Evolution
Monishwaran Maheswaran, Leon Lakhani, Zhongzhu Zhou, Shijia Yang, Junxiong Wang · Apr 9, 2026 · Citations: 0
- SM-Net: Learning a Continuous Spectral Manifold from Multiple Stellar Libraries
Omar Anwar, Aaron S. G. Robotham, Luca Cortese, Kevin Vinsen · Mar 25, 2026 · Citations: 0
- Learning-guided Prioritized Planning for Lifelong Multi-Agent Path Finding in Warehouse Automation
Han Zheng, Yining Ma, Brandon Araki, Jingkai Chen, Cathy Wu · Mar 25, 2026 · Citations: 0
- Benchmarking Multi-Agent LLM Architectures for Financial Document Processing: A Comparative Study of Orchestration Patterns, Cost-Accuracy Tradeoffs and Production Scaling Strategies
Siddhant Kulkarni, Yukta Kulkarni · Mar 24, 2026 · Citations: 0
- A Blueprint for Self-Evolving Coding Agents in Vehicle Aerodynamic Drag Prediction
Jinhui Ren, Huaiming Li, Yabin Liu, Tao Li, Zhaokun Liu · Mar 23, 2026 · Citations: 0
- MKA: Memory-Keyed Attention for Efficient Long-Context Reasoning
Dong Liu, Yanxuan Yu, Ben Lengerich, Ying Nian Wu · Mar 21, 2026 · Citations: 0
- cuGenOpt: A GPU-Accelerated General-Purpose Metaheuristic Framework for Combinatorial Optimization
Yuyang Liu · Mar 19, 2026 · Citations: 0
- Behavioral Fingerprints for LLM Endpoint Stability and Identity
Jonah Leshin, Manish Shah, Ian Timmis, Daniel Kang · Mar 19, 2026 · Citations: 0
- SRAM-Based Compute-in-Memory Accelerator for Linear-decay Spiking Neural Networks
Hongyang Shang, Shuai Dong, Yahan Yang, Junyi Yang, Peng Zhou · Mar 13, 2026 · Citations: 0
- Cost-Efficient Multimodal LLM Inference via Cross-Tier GPU Heterogeneity
Donglin Yu · Mar 13, 2026 · Citations: 0
- FastDSAC: Unlocking the Potential of Maximum Entropy RL in High-Dimensional Humanoid Control
Jun Xue, Junze Wang, Xinming Zhang, Shanze Wang, Yanjun Chen · Mar 13, 2026 · Citations: 0
- Proof-Carrying Materials: Falsifiable Safety Certificates for Machine-Learned Interatomic Potentials
Abhinaba Basu, Pavan Chakraborty · Mar 12, 2026 · Citations: 0
- Automatic Generation of High-Performance RL Environments
Seth Karten, Rahul Dev Appapogu, Chi Jin · Mar 12, 2026 · Citations: 0
- Slow-Fast Inference: Training-Free Inference Acceleration via Within-Sentence Support Stability
Xingyu Xie, Zhaochen Yu, Yue Liao, Tao Wang, Kim-Chuan Toh · Mar 12, 2026 · Citations: 0
- A Platform-Agnostic Multimodal Digital Human Modelling Framework: Neurophysiological Sensing in Game-Based Interaction
Daniel J. Buxton, Mufti Mahmud, Jordan J. Bird, Thomas Hughes-Roberts, David J. Brown · Mar 11, 2026 · Citations: 0
- Balancing Coverage and Draft Latency in Vocabulary Trimming for Faster Speculative Decoding
Ofir Ben Shoham · Mar 5, 2026 · Citations: 0
- Data Driven Optimization of GPU efficiency for Distributed LLM Adapter Serving
Ferran Agullo, Joan Oliveras, Chen Wang, Alberto Gutierrez-Torre, Olivier Tardieu · Feb 27, 2026 · Citations: 0
- IMMACULATE: A Practical LLM Auditing Framework via Verifiable Computation
Yanpei Guo, Wenjie Qu, Linyu Wu, Shengfang Zhai, Lionel Z. Wang · Feb 26, 2026 · Citations: 0
- Predicting LLM Output Length via Entropy-Guided Representations
Huanyi Xie, Yubin Chen, Liangyu Wang, Lijie Hu, Di Wang · Feb 12, 2026 · Citations: 0
- Accordion-Thinking: Self-Regulated Step Summaries for Efficient and Readable LLM Reasoning
Zhicheng Yang, Zhijiang Guo, Yinya Huang, Yongxin Wang, Wenlei Shi · Feb 3, 2026 · Citations: 0
- Toward Closed-loop Molecular Discovery via Language Model, Property Alignment and Strategic Search
Junkai Ji, Zhangfan Yang, Dong Xu, Ruibin Bai, Jianqiang Li · Dec 10, 2025 · Citations: 0
- ODMA: On-Demand Memory Allocation Strategy for LLM Serving on LPDDR-Class Accelerators
Guoqiang Zou, Wanyu Wang, Hao Zheng, Longxiang Yin, Yinhe Han · Dec 10, 2025 · Citations: 0
- Periodic Asynchrony: An On-Policy Approach for Accelerating LLM Reinforcement Learning
Jian Lu · Nov 24, 2025 · Citations: 0
- Unicorn: A Universal and Collaborative Reinforcement Learning Approach Towards Generalizable Network-Wide Traffic Signal Control
Yifeng Zhang, Yilin Liu, Ping Gong, Peizhuo Li, Mingfeng Fan · Mar 14, 2025 · Citations: 0