Tag: Medicine

Involves clinical or medical expertise in annotation or evaluation.

Papers in tag: 78

Research Utility Snapshot

Evaluation Modes

  • Automatic Metrics (4)
  • Simulation Env (1)

Human Feedback Types

  • Expert Verification (2)
  • Pairwise Preference (1)

Required Expertise

  • Medicine (4)
  • Coding (1)
MedPlan: A Two-Stage RAG-Based System for Personalized Medical Plan Generation

Hsin-Ling Hsu, Cong-Tinh Dao, Luning Wang, Zitao Shuai, Thao Nguyen Minh Phan, Jun-En Ding · Mar 23, 2025 · Citations: 0

Expert Verification Automatic Metrics Medicine
  • Comprehensive evaluation demonstrates that our method significantly outperforms baseline approaches in both assessment accuracy and treatment plan quality.
Can Multimodal LLMs Perform Time Series Anomaly Detection?

Xiongxiao Xu, Haoran Wang, Yueqing Liang, Philip S. Yu, Yue Zhao, Kai Shu · Feb 25, 2025 · Citations: 0

Automatic Metrics Medicine
  • One natural way for humans to detect time series anomalies is through visualization and textual description.
  • To address the gap, we build a VisualTimeAnomaly benchmark to comprehensively investigate zero-shot capabilities of MLLMs for TSAD, progressing from point-wise and range-wise to variate-wise anomalies, and extending to irregular sampling conditions.
Moving Beyond Medical Exams: A Clinician-Annotated Fairness Dataset of Real-World Tasks and Ambiguity in Mental Healthcare

Max Lamparth, Declan Grabb, Amy Franks, Scott Gershan, Kaitlyn N. Kunstman, Aaron Lulla · Feb 22, 2025 · Citations: 0

Pairwise Preference Expert Verification Automatic Metrics Medicine Coding
  • Current medical language model (LM) benchmarks often over-simplify the complexities of day-to-day clinical practice tasks and instead rely on evaluating LMs on multiple-choice board exam questions.
  • This design enables systematic evaluations of model performance and bias by studying how demographic factors affect decision-making.
Dialogue is Better Than Monologue: Instructing Medical LLMs via Strategical Conversations

Zijie Liu, Xinyu Zhao, Jie Peng, Zhuangdi Zhu, Qingyu Chen, Kaidi Xu · Jan 29, 2025 · Citations: 0

Automatic Metrics Simulation Env Medicine
  • These tuning methods and benchmarks overlook critical aspects like evidence-based reasoning and handling distracting information.
  • To bridge this gap, we introduce a novel benchmark that simulates real-world diagnostic scenarios, integrating noise and difficulty levels aligned with USMLE standards.