Math Reasoning Evaluator (BS/MS/PhD Required)
Join a remote, paid contract role evaluating AI math answers at $80/hr. It is ideal for candidates with a BS/MS/PhD (completed or in progress) in mathematics from a top-100 university. Expect a minimum of 17–20 hrs/week, with paid qualification and project exams during onboarding.
Generative AI & RLHF
Compensation: $80/hr
Eligibility: Worldwide
Experience: Entry
Posted: Oct 24, 2025
About OpenTrain
OpenTrain is the #1 platform for finding and building careers in AI training and data labeling. Contributors on OpenTrain work remotely to shape how state-of-the-art AI systems behave and can build a profile of domain expertise across projects.
Creating an OpenTrain account is free and gives you access to projects, paid onboarding exams, and the ability to apply for roles like this one that require specialized subject-matter expertise.
About AI training and this opportunity
AI training (also called data labeling, annotation, or human feedback) is the human work that teaches models to reason, compute, and explain. This role focuses on the mathematically specialized side of that work: evaluating and improving how models handle proofs, derivations, and quantitative problems.
You will directly influence model behavior by providing high-quality judgments, exemplar solutions, and consistent rubric-based ratings used to refine model outputs.
The role
We are hiring mathematically specialized evaluators to review AI-generated math responses. This is a remote, contract, part-time role paid at $80 per hour.
Expect a minimum commitment of 17–20 hours per week; during active sprints, the workload may rise to about 8 hours per day. Onboarding includes a paid 1–2 hour qualification exam and a paid 1–2 hour project exam.
- Pay: $80 USD per hour, paid as a contractor
- Employment type: Contractor, Part-time
- Time requirement: Minimum 17–20 hrs/week; about 8 hrs/day during active sprints
- Onboarding: Paid qualification and project exams (1–2 hrs each)
What you'll do day to day
You will read and evaluate AI-generated mathematical solutions, applying detailed rubrics to judge correctness, reasoning depth, and clarity. Work is text-based and focuses on reasoning, proofs, calculations, and quantitative fact-checking.
- Judge correctness and logical soundness of proofs, derivations, and calculations
- Identify subtle conceptual, methodological, and computational errors
- Author exemplar solutions that show rigorous step-by-step methods
- Compare and rate multiple responses using detailed rubrics
- Fact-check quantitative claims and cite reputable public sources when needed
- Maintain internal consistency and apply grading rubrics precisely
Requirements
This role is for mathematically specialized candidates; every qualification below is required unless noted as preferred or bonus.
- BS, MS, or PhD (or currently pursuing) in Mathematics, Applied Math, or Mathematical Statistics from a top-100 university (required)
- Mastery across core areas such as algebra, calculus, probability, and statistics; comfort with proofs and formal notation (required)
- Ability to write rigorous, step-by-step solutions in clear C1+ English (required)
- Strong quantitative fact-checking skills and ability to cite reputable public sources (required)
- Consistent and detail-oriented application of grading/rating rubrics (required)
- Availability at least 17–20 hours/week, with ability to work ~8 hrs/day during active sprints (required)
- Preferred: research experience, analytical writing or debate background, or programming literacy (e.g., Python/LaTeX)
- Bonus: prior data labeling, RLHF, or AI model evaluation experience
Who should apply
Apply if you have a strong formal mathematics background from a top-100 university and enjoy precise, rigorous mathematical writing and error analysis.
This role is not for STEM generalists; it is explicitly intended for candidates with deep training in mathematics who can detect subtle errors in reasoning and produce exemplar solutions.
How the process works
If selected, you will complete a paid qualification exam (1–2 hrs) and a paid project exam (1–2 hrs) as part of onboarding. Successful completion unlocks access to evaluation tasks and ongoing project work.
Work is remote and text-based, using the project's labeling/evaluation interface. You will be engaged as a contractor and paid hourly at the stated rate for both evaluation work and the onboarding exams.
- Onboarding includes two paid assessments (qualification and project exam)
- Work is performed remotely using supplied evaluation tools
- You will be contracted and paid hourly, with assignments and sprints managed by the project