- GRASP: Replace Redundant Layers with Adaptive Singular Parameters for Efficient Model Compression
Kainan Liu, Yong Zhang, Ning Cheng, Zhitao Li, Shaojun Wang · Dec 31, 2024 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- Evaluating LLMs' Divergent Thinking Capabilities for Scientific Idea Generation with Minimal Context
Kai Ruan, Xuan Wang, Jixiang Hong, Peng Wang, Yang Liu · Dec 23, 2024 · Citations: 0
While Large Language Models (LLMs) demonstrate remarkable capabilities in scientific tasks such as literature analysis and experimental design (e.g., accurately extracting key findings from papers or generating coherent experimental…
- A Survey of Query Optimization in Large Language Models
Mingyang Song, Mao Zheng · Dec 23, 2024 · Citations: 0
We further examine evaluation methodologies, identify critical gaps in existing benchmarks, and discuss open challenges including process reward models, efficiency optimization, and multi-modal query handling.
- LLM4AD: A Platform for Algorithm Design with Large Language Model
Fei Liu, Rui Zhang, Zhuoliang Xie, Rui Sun, Kai Li · Dec 23, 2024 · Citations: 0
We have also designed a unified evaluation sandbox to ensure a secure and robust assessment of algorithms.
- Multi-modal, Multi-task, Multi-criteria Automatic Evaluation with Vision Language Models
Masanari Ohi, Masahiro Kaneko, Naoaki Okazaki, Nakamasa Inoue · Dec 19, 2024 · Citations: 0
However, existing metrics for evaluating the quality of text generated by VLMs typically focus on an overall evaluation for a specific task, such as image captioning.
- LMUnit: Fine-grained Evaluation with Natural Language Unit Tests
Jon Saad-Falcon, Rajan Vivek, William Berrios, Nandita Shankar Naik, Matija Franklin · Dec 17, 2024 · Citations: 0
Pairwise Preference
We introduce natural language unit tests, a paradigm that decomposes response quality into explicit, testable criteria, along with a unified scoring model, LMUnit, which combines multi-objective training across preferences, direct ratings,…
- Efficient Continual Learning for Small Language Models with a Discrete Key-Value Bottleneck
Andor Diera, Lukas Galke, Fabian Karl, Ansgar Scherp · Dec 11, 2024 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- SpecFuse: Ensembling Large Language Models via Next-Segment Prediction
Bo Lv, Nayu Liu, Chen Tang, Xin Liu, Yue Yu · Dec 10, 2024 · Citations: 0
Experimental results on five LLM families (ranging from 7B to 72B parameters) and six benchmark datasets, spanning open-domain instruction following, reasoning, and commonsense, demonstrate consistent performance improvements compared to…
- Speaker effects in language comprehension: An integrative model of language and speaker processing
Hanlin Wu, Zhenguang G. Cai · Dec 10, 2024 · Citations: 0
We discuss how speaker effects serve as indices for assessing language development and social cognition, and we encourage future research to extend these findings to the emerging domain of artificial intelligence (AI) speakers, as AI agents…
- Predicting Subway Passenger Flows under Incident Situation with Causality
Xiannan Huang, Shuhan Qiu, Quan Yuan, Chao Yang · Dec 9, 2024 · Citations: 0
Abstract shows limited direct human-feedback or evaluation-protocol detail; use as adjacent methodological context.
- Efficient Context Propagating Perceiver Architectures for Auto-Regressive Language Modeling
Kaleel Mahmood, Shaoyi Huang · Dec 8, 2024 · Citations: 0
Pairwise Preference
To this end, we develop four new architectural paradigms, the best performing of which we denote as the Efficient Context propagating Perceiver (ECP).
- A Contemporary Overview: Trends and Applications of Large Language Models on Mobile Devices
Lianjun Liu, Hongli An, Pengxuan Chen, Longxiang Ye · Dec 4, 2024 · Citations: 0