Biology Reasoning Evaluator (PhD)

Remote contractor role for PhD-level biologists to evaluate AI-generated biology responses: assess correctness, reasoning, methods, and statistics; $80/hr with paid qualification and project exams. Minimum availability ~17–20 hrs/week, worldwide.

Generative AI & RLHF

100% Remote · Hourly
Compensation: $80/hr
Eligibility: Worldwide
Experience: Entry
Posted: Oct 24, 2025

About OpenTrain

OpenTrain is the #1 platform for finding and building careers in AI training and data labeling. The platform connects expert contributors with projects that shape how modern AI systems learn from human examples.

Creating an OpenTrain account is free; contributors use it to discover projects, build a profile, and apply in minutes. This role is managed through OpenTrain and is part of the growing field of human-in-the-loop AI training.

About AI training work

AI training (data labeling / annotation / RLHF) is the human side of building AI: people provide and evaluate examples that teach models to reason, speak, and follow instructions. The work is often 100% remote, flexible, and accessible to experts and non-experts alike.

As an evaluator, you will directly influence how generative models handle scientific reasoning and methods, and so help shape technology used across research and industry.

The role

We are hiring a contractor to review AI-generated biology responses and evaluate biological reasoning, methods, and interpretation. You will rate model outputs against detailed rubrics, identify conceptual and methodological flaws, fact-check claims, and draft exemplar explanations or model solutions.

This is a part-time contractor role, open worldwide. Onboarding includes short, paid qualification and project exams.

Work type: Contractor, part-time
Data type: Text
Label type: Evaluation/Rating
Tools: platform-provided
Worldwide remote contributors accepted

What you'll do

Review AI-generated biology responses for factual correctness, conceptual clarity, and methodological soundness.
Assess reasoning depth and clarity; identify errors in study design, methodology, statistics, calculations, and interpretation.
Fact-check claims using reputable public sources and provide precise references when required.
Draft exemplar explanations, model answers, or step-by-step corrections to guide model behavior.
Rate and compare multiple responses using detailed evaluation rubrics and provide consistent, reproducible judgments.

Requirements

PhD in Biology or a closely related life science (required); degree from a Top-100 university preferred.
Peer-reviewed publication record as first or co-author.
Proven experience creating or critically reviewing complex biological content (protocols, curricula, grants, computational analyses).
Breadth across core areas such as molecular/cell biology, genetics/genomics, biochemistry, and physiology.
Strong experimental design and statistics literacy; ability to spot methodological and analytical flaws.
Exceptional scientific writing in English at C1+ level: clear, rigorous, step-by-step reasoning and correct terminology.
Meticulous attention to detail, reproducibility, and consistent application of evaluation rubrics.
Availability: minimum 17–20 hours/week; preferred cadence ~8 hours/day during active sprints.

Nice-to-have

Previous experience with data labeling, RLHF, or AI model evaluation is a plus but not required.

Compensation, schedule, and how to apply

Pay: $80 USD per hour. The role is listed under the 'Less than 20 hours/week' time commitment, but contributors are expected to be available for at least 17–20 hours per week while engaged on a project.

Onboarding includes a paid 1–2 hour qualification exam and a paid 1–2 hour project exam to confirm fit and rubric alignment.

You will work as a contractor with flexible scheduling around project sprints. Contributors must reliably apply rubrics and meet quality standards during reviews.

Hourly rate: $80 USD/hr
Time classification: Less than 20 hours/week; minimum availability expected 17–20 hrs/week
Paid onboarding: 1–2 hr qualification exam and 1–2 hr project exam