Computer Science Expert — Python Validation & AI Evaluation
Join OpenTrain to design rigorous computer-science problems and evaluate AI-generated solutions using Python. This is contract, part-time work paying $15–$40/hr, typically requiring ~10–20 hours/week during active project phases. Applicants need CS experience, strong Python skills, and fluent English.
Generative AI & RLHF
Compensation: $15–$40/hr
Eligibility: Worldwide
Experience: Expert
Posted: Apr 5, 2026
About OpenTrain
OpenTrain is the #1 platform for discovering and building careers in AI training and data labeling. Contributors use the platform to find project-based work, build a visible profile, and apply quickly for roles that help shape how AI systems behave.
We focus on high-quality, remote roles across the AI-training ecosystem — from annotation and transcription to advanced evaluation and RLHF — and support contributors at every skill level.
About AI training work
AI training (data labeling, annotation, and human feedback) is the human foundation of modern AI: people create, evaluate, and refine examples that models learn from. This role sits at the intersection of computer science and RLHF-style evaluation, directly influencing model correctness and reasoning.
Work is largely remote and project-based, making it an excellent way to contribute to cutting-edge systems with flexible hours.
The Role
We’re hiring Computer Science experts to design problems and validate AI-generated solutions with hands-on Python validation. This is a contract, part-time role (project-based) that typically requires ~10–20 hours/week during active phases.
Compensation is paid hourly: $15–$40 USD per hour. This role is focused on evaluating multi-step reasoning, scoring responses with structured criteria, improving AI explanations, and performing numerical validation or simulations using Python (NumPy, Pandas, SciPy).
- Employment type: Contractor, part-time, project-based
- Pay: $15–$40 USD per hour
- Expected availability: ~10–20 hours/week during active project phases
- Location: Remote; this posting lists eligibility as worldwide (see Requirements)
What you'll do
Contribute expert-level problem design and rigorous evaluation to improve AI reasoning and correctness.
- Design computer-science problems aligned with professional practice and real-world constraints.
- Evaluate AI-generated solutions for correctness, assumptions, constraints, edge cases, and logical soundness.
- Validate calculations, algorithms, or simulations with Python and scientific libraries (NumPy, Pandas, SciPy or equivalents).
- Score multi-step reasoning using structured rubrics and provide clear technical feedback to improve outputs.
- Write and refine prompt + response examples (SFT-style), participate in red-teaming, and generate high-quality reference answers.
- Collaborate with a global community of experts to maintain consistent evaluation standards and scientific integrity.
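For illustration only, here is a minimal sketch of the kind of Python validation described above: checking a solution against a slow but obviously correct reference on randomized inputs. The problem (maximum subarray sum), the function names, and the trial parameters are hypothetical and not taken from any OpenTrain project or guideline.

```python
import numpy as np


def claimed_max_subarray(values):
    """Hypothetical stand-in for an AI-generated solution (Kadane's algorithm)."""
    best = current = values[0]
    for v in values[1:]:
        current = max(v, current + v)
        best = max(best, current)
    return best


def brute_force_max_subarray(values):
    """Slow O(n^2) reference: compare every non-empty subarray via prefix sums."""
    prefix = np.concatenate(([0], np.cumsum(values)))
    # diffs[i, j] = prefix[j] - prefix[i], which is the sum of values[i:j]
    diffs = prefix[None, :] - prefix[:, None]
    upper = np.triu_indices(len(prefix), k=1)  # keep only pairs with i < j
    return int(diffs[upper].max())


def validate(trials=200, seed=0):
    """Return a list of (input, got, expected) for every disagreement found."""
    rng = np.random.default_rng(seed)
    mismatches = []
    for _ in range(trials):
        n = int(rng.integers(1, 12))            # short arrays, including length 1
        values = rng.integers(-10, 11, size=n)  # mix of negatives and positives
        got = int(claimed_max_subarray(values.tolist()))
        expected = brute_force_max_subarray(values)
        if got != expected:
            mismatches.append((values.tolist(), got, expected))
    return mismatches
```

An empty list from `validate()` means no disagreement was found on the sampled inputs. That is evidence of correctness, not proof, so an evaluator would still review edge cases such as all-negative or single-element inputs by hand.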
Requirements
You must meet the following minimum qualifications to be considered.
- Degree in Computer Science or a closely related field, or equivalent demonstrated expertise.
- At least 2 years of applied, research, or teaching experience in computer science or adjacent technical disciplines.
- Strong Python proficiency for numerical validation/simulation (experience with NumPy, Pandas, SciPy or equivalent libraries).
- Proven ability to evaluate multi-step reasoning, identify assumptions, and assess edge cases.
- Experience with structured evaluation/scoring of complex work and clear technical writing (English at C1+ level).
- Availability to contribute approximately 10–20 hours/week during active project phases (project-based).
- Location requirement: candidates must meet the posting's location restrictions (currently listed as open worldwide).
Preferred qualifications
These skills will help you stand out but are not strictly required unless listed above.
- Familiarity with additional languages or tooling such as MATLAB, R, C/C++, SQL, or domain-specific libraries.
- Professional certifications (e.g., SAS, CAP) or applied international project experience.
- Prior experience creating evaluation rubrics or contributing to RLHF/prompt engineering projects.
How it works and how to apply
Create a free OpenTrain account, complete your profile, and submit your application with a resume or evidence of relevant experience. You may be asked to demonstrate Python proficiency and provide examples of past work or problem designs.
Selected candidates are invited to project onboarding and receive assignment details, guidelines, and access to the proprietary labeling/evaluation interface. Work is remote and project-based; hours are project-specific, and compensation is paid hourly as stated.
- Create an OpenTrain account and complete your profile.
- Attach your resume and examples of CS problem design or code validation (if available).
- Be prepared to demonstrate Python skills and explain evaluation experience during screening.