- ScholarEval: Research Idea Evaluation Grounded in Literature
Hanane Nour Moussa, Patrick Queiroz Da Silva, Daniel Adu-Ampratwum, Alyson East, Zitong Lu · Oct 17, 2025 · Citations: 0
Rubric Rating
As AI tools become increasingly common for research ideation, robust evaluation is critical to ensure the validity and usefulness of generated ideas.
- SentinelNet: Safeguarding Multi-Agent Collaboration Through Credit-Based Dynamic Threat Detection
Yang Feng, Xudong Pan · Oct 17, 2025 · Citations: 0
- In Generative AI We (Dis)Trust? Computational Analysis of Trust and Distrust in Reddit Discussions
Aria Pessianzadeh, Naima Sultana, Hildegarde Van den Bulck, David Gefen, Shahin Jabbari · Oct 17, 2025 · Citations: 0
The rise of generative AI (GenAI) has impacted many aspects of human life.
- PolySkill: Learning Generalizable Skills Through Polymorphic Abstraction
Simon Yu, Gang Li, Weiyan Shi, Peng Qi · Oct 17, 2025 · Citations: 0
- BIOGEN: Evidence-Grounded Multi-Agent Reasoning Framework for Transcriptomic Interpretation in Antimicrobial Resistance
Elias Hossain, Mehrdad Shoeibi, Ivan Garibay, Niloofar Yousefi · Oct 17, 2025 · Citations: 0
Multi Agent
We present BIOGEN, an evidence-grounded multi-agent framework for post hoc interpretation of RNA-seq transcriptional modules.
- HypoSpace: Evaluating LLM Creativity as Set-Valued Hypothesis Generators under Underdetermination
Tingting Chen, Beibei Lin, Zifeng Yuan, Qiran Zou, Hongyu He · Oct 17, 2025 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- Language Models are Injective and Hence Invertible
Giorgos Nikolaou, Tommaso Mencattini, Donato Crisostomi, Andrea Santilli, Yannis Panagakis · Oct 17, 2025 · Citations: 0
- OffSim: Offline Simulator for Model-based Offline Inverse Reinforcement Learning
Woo-Jin Ahn, Sang-Ryul Baek, Yong-Jun Lee, Hyun-Duck Choi, Myo-Taeg Lim · Oct 17, 2025 · Citations: 0
- Learning to Answer from Correct Demonstrations
Nirmit Joshi, Gene Li, Siddharth Bhandari, Shiva Prasad Kasiviswanathan, Cong Ma · Oct 17, 2025 · Citations: 0
Demonstrations
We study the problem of learning to generate an answer (or completion) to a question (or prompt), where there could be multiple correct answers, any one of which is acceptable at test time.
- MNO: Multiscale Neural Operator for 3D Computational Fluid Dynamics
Qinxuan Wang, Chuang Wang, Mingyu Zhang, Jingwei Sun, Peipei Yang · Oct 17, 2025 · Citations: 0
We evaluate MNO on diverse benchmarks, covering steady-state and unsteady flow scenarios with up to 300k points.
- When to Ensemble: Identifying Token-Level Points for Stable and Fast LLM Ensembling
Heecheol Yun, Kwangmin Ki, Junghyun Lee, Eunho Yang · Oct 17, 2025 · Citations: 0
Our experiments on diverse benchmarks, including MATH500 and BBH, demonstrate that SAFE outperforms existing methods in both accuracy and efficiency, with gains achieved even when ensembling fewer than 1% of tokens.
- SAG-Agent: Enabling Long-Horizon Reasoning in Strategy Games via Dynamic Knowledge Graphs
Chenwei Tang, Lin Long, Xinyu Liu, Jingyu Xing, Zizhou Wang · Oct 17, 2025 · Citations: 0