- Spatio-Temporal Attention Enhanced Multi-Agent DRL for UAV-Assisted Wireless Networks with Limited Communications
Che Chen, Lanhua Li, Shimin Gong, Yu Zhao, Yuming Fang · Mar 23, 2026 · Citations: 0
Simulation Env General
To maximize the overall throughput, we first propose a delay-tolerant multi-agent deep reinforcement learning (MADRL) algorithm that integrates a delay-penalized reward to encourage information sharing among UAVs, while jointly optimizing…
- Self-Correcting VLA: Online Action Refinement via Sparse World Imagination
Chenyv Liu, Wentao Tan, Lei Zhu, Fengling Li, Jingjing Li · Feb 25, 2026 · Citations: 0
Simulation Env Coding
Reinforcement learning enhances physical grounding through exploration yet typically relies on external reward signals that remain isolated from the agent's internal states.
- JoyAI-LLM Flash: Advancing Mid-Scale LLMs with Token Efficiency
Aichen Cai, Anmeng Zhang, Anyu Li, Bo Zhang, Bohua Cai · Apr 3, 2026 · Citations: 0
General
JoyAI-LLM Flash is pretrained on a massive corpus of 20 trillion tokens and further optimized through a rigorous post-training pipeline, including supervised fine-tuning (SFT), Direct Preference Optimization (DPO), and large-scale…
- TriAttention: Efficient Long Reasoning with Trigonometric KV Compression
Weian Mao, Xi Lin, Wei Huang, Yuxin Xie, Tianfu Fu · Apr 6, 2026 · Citations: 0
Automatic Metrics Law
Via the trigonometric series, we use the distance preference characterized by these centers to score keys according to their positions, and also leverage Q/K norms as an additional signal for importance estimation.
- Weakly Supervised Distillation of Hallucination Signals into Transformer Representations
Shoaib Sadiq Salehmohamed, Jinal Prashant Thakkar, Hansika Aredla, Shaik Mohammed Omar, Shalmali Ayachit · Apr 7, 2026 · Citations: 0
LLM As Judge Automatic Metrics General
We introduce a weak supervision framework that combines three complementary grounding signals: substring matching, sentence embedding similarity, and an LLM as a judge verdict to label generated responses as grounded or hallucinated without…
- Towards Efficient Agents: A Co-Design of Inference Architecture and System
Weizhe Lin, Hui-Ling Zhen, Shuai Yang, Xian Wang, Renxi Liu · Dec 20, 2025 · Citations: 0
Automatic Metrics General
The rapid development of large language model (LLM)-based agents has unlocked new possibilities for autonomous multi-turn reasoning and tool-augmented decision-making.
- Luna-2: Scalable Single-Token Evaluation with Small Language Models
Vatsal Goel, Rishon Dsouza, Nikhil Ega, Amey Ramesh Rambatla, Rob Friel · Feb 20, 2026 · Citations: 0
LLM As Judge Automatic Metrics General
We present Luna-2, a novel architecture that leverages decoder-only small language models (SLMs) into a deterministic evaluation model to reliably compute complex task-specific LLMAJ metrics (e.g.
- Efficient Training-Free Multi-Token Prediction via Embedding-Space Probing
Raghavv Goel, Mukul Gagrani, Mingu Lee, Chris Lott · Mar 18, 2026 · Citations: 0
Automatic Metrics General
Across benchmarks, our probing-based MTP consistently outperforms existing training-free baselines, increasing acceptance length by approximately 12% on LLaMA3 and 8–12% on Qwen3, and achieving throughput gains of up to 15–19%.
- Learning When to Attend: Conditional Memory Access for Long-Context LLMs
Sakshi Choudhary, Aditya Chattopadhyay, Luca Zancato, Elvis Nunez, Matthew Trager · Mar 18, 2026 · Citations: 0
Automatic Metrics General
Based on this, we propose L2A (Learning To Attend), a layer that enables conditional (token-wise) long-range memory access by deciding when to invoke global attention.
- The Headless Firm: How AI Reshapes Enterprise Boundaries
Tassilo Klein, Sebastian Wieczorek · Feb 24, 2026 · Citations: 0
Automatic Metrics General
We argue that agentic AI induces a structural change in how coordination costs scale: in prior modular systems, integration cost grew with interaction topology (O(n^2) in the number of components); in protocol-mediated agentic systems, inte
- Anatomy of Agentic Memory: Taxonomy and Empirical Analysis of Evaluation and System Limitations
Dongming Jiang, Yi Li, Songtao Wei, Jinxin Yang, Ayushi Kishore · Feb 22, 2026 · Citations: 0
Automatic Metrics General
Agentic memory systems enable large language model (LLM) agents to maintain state across long interactions, supporting long-horizon reasoning and personalization beyond fixed context windows.
- Scalable Neural Decoders for Practical Fault-Tolerant Quantum Computation
Andi Gu, J. Pablo Bonilla Ataides, Mikhail D. Lukin, Susanne F. Yelin · Apr 9, 2026 · Citations: 0
- AsyncTLS: Efficient Generative LLM Inference with Asynchronous Two-level Sparse Attention
Yuxuan Hu, Jianchao Tan, Jiaqi Zhang, Wen Zan, Pingwei Sun · Apr 9, 2026 · Citations: 0
- GRASS: Gradient-based Adaptive Layer-wise Importance Sampling for Memory-efficient Large Language Model Fine-tuning
Kaiyuan Tian, Yu Tang, Gongqingjian Jiang, Baihui Liu, Yifu Gao · Apr 9, 2026 · Citations: 0
- Squeeze Evolve: Unified Multi-Model Orchestration for Verifier-Free Evolution
Monishwaran Maheswaran, Leon Lakhani, Zhongzhu Zhou, Shijia Yang, Junxiong Wang · Apr 9, 2026 · Citations: 0
- FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control
Donghu Kim, Youngdo Lee, Minho Park, Kinam Kim, I Made Aswin Nahendra · Apr 6, 2026 · Citations: 0
- SM-Net: Learning a Continuous Spectral Manifold from Multiple Stellar Libraries
Omar Anwar, Aaron S. G. Robotham, Luca Cortese, Kevin Vinsen · Mar 25, 2026 · Citations: 0
- Learning-guided Prioritized Planning for Lifelong Multi-Agent Path Finding in Warehouse Automation
Han Zheng, Yining Ma, Brandon Araki, Jingkai Chen, Cathy Wu · Mar 25, 2026 · Citations: 0
- SpecEyes: Accelerating Agentic Multimodal LLMs via Speculative Perception and Planning
Haoyu Huang, Jinfa Huang, Zhongwei Wan, Xiawu Zheng, Rongrong Ji · Mar 24, 2026 · Citations: 0
- Sparser, Faster, Lighter Transformer Language Models
Edoardo Cetin, Stefano Peluchetti, Emilio Castillo, Akira Naruse, Mana Murakami · Mar 24, 2026 · Citations: 0
- EchoKV: Efficient KV Cache Compression via Similarity-Based Reconstruction
Yixuan Wang, Shiyu Ji, Yijun Liu, Qingfu Zhu, Wanxiang Che · Mar 24, 2026 · Citations: 0
- Benchmarking Multi-Agent LLM Architectures for Financial Document Processing: A Comparative Study of Orchestration Patterns, Cost-Accuracy Tradeoffs and Production Scaling Strategies
Siddhant Kulkarni, Yukta Kulkarni · Mar 24, 2026 · Citations: 0
- Autoregressive vs. Masked Diffusion Language Models: A Controlled Comparison
Caio Vicentino · Mar 23, 2026 · Citations: 0
- A Blueprint for Self-Evolving Coding Agents in Vehicle Aerodynamic Drag Prediction
Jinhui Ren, Huaiming Li, Yabin Liu, Tao Li, Zhaokun Liu · Mar 23, 2026 · Citations: 0
- MKA: Memory-Keyed Attention for Efficient Long-Context Reasoning
Dong Liu, Yanxuan Yu, Ben Lengerich, Ying Nian Wu · Mar 21, 2026 · Citations: 0
- cuGenOpt: A GPU-Accelerated General-Purpose Metaheuristic Framework for Combinatorial Optimization
Yuyang Liu · Mar 19, 2026 · Citations: 0
- Behavioral Fingerprints for LLM Endpoint Stability and Identity
Jonah Leshin, Manish Shah, Ian Timmis, Daniel Kang · Mar 19, 2026 · Citations: 0
- Polyglot-Lion: Efficient Multilingual ASR for Singapore via Balanced Fine-Tuning of Qwen3-ASR
Quy-Anh Dang, Chris Ngo · Mar 17, 2026 · Citations: 0
- SRAM-Based Compute-in-Memory Accelerator for Linear-decay Spiking Neural Networks
Hongyang Shang, Shuai Dong, Yahan Yang, Junyi Yang, Peng Zhou · Mar 13, 2026 · Citations: 0
- Cost-Efficient Multimodal LLM Inference via Cross-Tier GPU Heterogeneity
Donglin Yu · Mar 13, 2026 · Citations: 0
- FastDSAC: Unlocking the Potential of Maximum Entropy RL in High-Dimensional Humanoid Control
Jun Xue, Junze Wang, Xinming Zhang, Shanze Wang, Yanjun Chen · Mar 13, 2026 · Citations: 0
- Proof-Carrying Materials: Falsifiable Safety Certificates for Machine-Learned Interatomic Potentials
Abhinaba Basu, Pavan Chakraborty · Mar 12, 2026 · Citations: 0
- Automatic Generation of High-Performance RL Environments
Seth Karten, Rahul Dev Appapogu, Chi Jin · Mar 12, 2026 · Citations: 0
- Slow-Fast Inference: Training-Free Inference Acceleration via Within-Sentence Support Stability
Xingyu Xie, Zhaochen Yu, Yue Liao, Tao Wang, Kim-Chuan Toh · Mar 12, 2026 · Citations: 0
- GLM-OCR Technical Report
Shuaiqi Duan, Yadong Xue, Weihan Wang, Zhe Su, Huan Liu · Mar 11, 2026 · Citations: 0
- A Platform-Agnostic Multimodal Digital Human Modelling Framework: Neurophysiological Sensing in Game-Based Interaction
Daniel J. Buxton, Mufti Mahmud, Jordan J. Bird, Thomas Hughes-Roberts, David J. Brown · Mar 11, 2026 · Citations: 0
- FlashAttention-4: Algorithm and Kernel Pipelining Co-Design for Asymmetric Hardware Scaling
Ted Zadouri, Markus Hoehnerbach, Jay Shah, Timmy Liu, Vijay Thakkar · Mar 5, 2026 · Citations: 0
- Balancing Coverage and Draft Latency in Vocabulary Trimming for Faster Speculative Decoding
Ofir Ben Shoham · Mar 5, 2026 · Citations: 0
- VietNormalizer: An Open-Source, Dependency-Free Python Library for Vietnamese Text Normalization in TTS and NLP Applications
Hung Vu Nguyen, Loan Do, Thanh Ngoc Nguyen, Ushik Shrestha Khwakhali, Thanh Pham · Mar 4, 2026 · Citations: 0
- Data Driven Optimization of GPU efficiency for Distributed LLM Adapter Serving
Ferran Agullo, Joan Oliveras, Chen Wang, Alberto Gutierrez-Torre, Olivier Tardieu · Feb 27, 2026 · Citations: 0
- IMMACULATE: A Practical LLM Auditing Framework via Verifiable Computation
Yanpei Guo, Wenjie Qu, Linyu Wu, Shengfang Zhai, Lionel Z. Wang · Feb 26, 2026 · Citations: 0
- Predicting LLM Output Length via Entropy-Guided Representations
Huanyi Xie, Yubin Chen, Liangyu Wang, Lijie Hu, Di Wang · Feb 12, 2026 · Citations: 0
- Accordion-Thinking: Self-Regulated Step Summaries for Efficient and Readable LLM Reasoning
Zhicheng Yang, Zhijiang Guo, Yinya Huang, Yongxin Wang, Wenlei Shi · Feb 3, 2026 · Citations: 0
- Toward Closed-loop Molecular Discovery via Language Model, Property Alignment and Strategic Search
Junkai Ji, Zhangfan Yang, Dong Xu, Ruibin Bai, Jianqiang Li · Dec 10, 2025 · Citations: 0
- ODMA: On-Demand Memory Allocation Strategy for LLM Serving on LPDDR-Class Accelerators
Guoqiang Zou, Wanyu Wang, Hao Zheng, Longxiang Yin, Yinhe Han · Dec 10, 2025 · Citations: 0
- Dripper: Token-Efficient Main HTML Extraction with a Lightweight LM
Mengjie Liu, Jiahui Peng, Wenchang Ning, Pei Chu, Jiantao Qiu · Nov 28, 2025 · Citations: 0
- Periodic Asynchrony: An On-Policy Approach for Accelerating LLM Reinforcement Learning
Jian Lu · Nov 24, 2025 · Citations: 0
- Unicorn: A Universal and Collaborative Reinforcement Learning Approach Towards Generalizable Network-Wide Traffic Signal Control
Yifeng Zhang, Yilin Liu, Ping Gong, Peizhuo Li, Mingfeng Fan · Mar 14, 2025 · Citations: 0