Math Reasoning Evaluator (BS/MS/PhD Required)

Join a remote, paid contract role evaluating AI math answers at $80/hr. It is intended for candidates with a BS, MS, or PhD (completed or in progress) in mathematics from a top-100 university. The minimum commitment is 17–20 hrs/week, and onboarding includes paid qualification and project exams.

Generative AI & RLHF

100% Remote Hourly · $80/hr

Compensation: $80/hr
Eligibility: Worldwide
Experience: Entry
Posted: Oct 24, 2025

About OpenTrain

OpenTrain is the #1 platform for finding and building careers in AI training and data labeling. Contributors on OpenTrain work remotely to shape how state-of-the-art AI systems behave and can build a profile of domain expertise across projects.

Creating an OpenTrain account is free and gives you access to projects, paid onboarding exams, and the ability to apply for roles like this one that require specialized subject-matter expertise.

About AI training and this opportunity

AI training (also called data labeling, annotation, or human feedback) is the human work that teaches models to reason, compute, and explain. This role focuses on the mathematically specialized side of that work: evaluating and improving how models write proofs, carry out derivations, and solve quantitative problems.

You will directly influence model behavior by providing high-quality judgments, exemplar solutions, and consistent rubric-based ratings used to refine model outputs.

The role

We are hiring mathematically specialized evaluators to review AI-generated math responses. This is a remote, contract, part-time role paid at $80 per hour.

Expect a minimum commitment of 17–20 hours per week; during active sprints, the project may ask for about 8 hours per day. Onboarding includes a paid 1–2 hour qualification exam and a paid 1–2 hour project exam.

Pay: $80 USD per hour, paid as a contractor
Employment type: Contractor, Part-time
Time requirement: Minimum 17–20 hrs/week; typical sprint cadence ~8 hrs/day
Onboarding: Paid qualification and project exams (1–2 hrs each)

What you'll do day to day

You will read and evaluate AI-generated mathematical solutions, applying detailed rubrics to judge correctness, reasoning depth, and clarity. Work is text-based and focuses on reasoning, proofs, calculations, and quantitative fact-checking.

Judge correctness and logical soundness of proofs, derivations, and calculations
Identify subtle conceptual, methodological, and computational errors
Author exemplar solutions that show rigorous step-by-step methods
Compare and rate multiple responses using detailed rubrics
Fact-check quantitative claims and cite reputable public sources when needed
Maintain internal consistency and apply grading rubrics precisely

Requirements

This role is for mathematically specialized candidates; every qualification below is required unless noted as preferred or bonus.

BS, MS, or PhD (or currently pursuing) in Mathematics, Applied Math, or Mathematical Statistics from a top-100 university (required)
Mastery across core areas such as algebra, calculus, probability, and statistics; comfort with proofs and formal notation (required)
Ability to write rigorous, step-by-step solutions in clear C1+ English (required)
Strong quantitative fact-checking skills and ability to cite reputable public sources (required)
Consistent and detail-oriented application of grading/rating rubrics (required)
Availability at least 17–20 hours/week, with ability to work ~8 hrs/day during active sprints (required)
Preferred: research experience, analytical writing or debate background, or programming literacy (e.g., Python/LaTeX)
Bonus: prior data labeling, RLHF, or AI model evaluation experience

Who should apply

Apply if you have a strong formal mathematics background from a top-100 university and enjoy precise, rigorous mathematical writing and error analysis.

This role is not for STEM generalists; it is intended for candidates with deep training in mathematics who can detect subtle errors in reasoning and produce exemplar solutions.

How the process works

If selected, you will complete a paid qualification exam (1–2 hrs) and a paid project exam (1–2 hrs) as part of onboarding. Successful completion unlocks access to evaluation tasks and ongoing project work.

Work is remote and text-based, using the project's labeling/evaluation interface. You will be engaged as a contractor and paid hourly at the stated rate for both evaluation work and the onboarding exams.

Onboarding includes two paid assessments (qualification and project exam)
Work is performed remotely using supplied evaluation tools
You will be contracted and paid hourly, with assignments and sprints managed by the project