Tag: Coding

Involves software engineering or code-quality expertise.

Papers in tag: 314

Zuhong Lin, Daoyuan Ren, Kai Ran, Jing Sun, Songlin Yu, Xuefeng Bai · Apr 26, 2025 · Citations: 0

Automatic Metrics Coding

Haohao Qu, Shanru Lin, Yujuan Ding, Yiqi Wang, Wenqi Fan · Apr 16, 2025 · Citations: 0

Pairwise Preference Automatic Metrics Coding

Specifically, ContRec consists of two key modules: a sigma-VAE Tokenizer, which encodes users/items with continuous tokens; and a Dispersive Diffusion module, which captures implicit user preference.
By conditioning on the previously generated tokens of the LLM backbone during user modeling, the Dispersive Diffusion module performs a conditional diffusion process with a novel Dispersive Loss, enabling high-quality user preference genera

Julian Minder, Clément Dumas, Caden Juang, Bilal Chugtai, Neel Nanda · Apr 3, 2025 · Citations: 0

Pairwise Preference Automatic Metrics Coding

Using the BatchTopK crosscoder, we successfully identify a set of chat-specific latents that are both interpretable and causally effective, representing concepts such as "false information" and "personal question", along w

Max Lamparth, Declan Grabb, Amy Franks, Scott Gershan, Kaitlyn N. Kunstman, Aaron Lulla · Feb 22, 2025 · Citations: 0

Pairwise Preference Expert Verification Automatic Metrics Medicine Coding

Current medical language model (LM) benchmarks often over-simplify the complexities of day-to-day clinical practice tasks and instead rely on evaluating LMs on multiple-choice board exam questions.
This design enables systematic evaluations of model performance and bias by studying how demographic factors affect decision-making.

Jonathan Laurent, André Platzer · Feb 7, 2025 · Citations: 0

Demonstrations Automatic Metrics Coding