BixBench: a Comprehensive Benchmark for LLM-based Agents in Computational Biology

2025-03-01

