Machine Learning Expert — Python, GenAI, SQL

Design and validate computational STEM/ML problems for generative-AI training, writing reproducible Python solutions and clear documentation. Contract, part-time project work (~10–20 hrs/week) paying $15–$40/hr; the project restricts contributors to the United States.

Generative AI & RLHF

100% Remote · Hourly · $15–$40/hr

Compensation: $15–$40/hr
Eligibility: Worldwide
Experience: Expert
Posted: Apr 5, 2026

About OpenTrain

OpenTrain is the #1 platform for finding and building careers in AI training and data labeling. We connect contributors with projects where people teach and shape how modern AI systems behave.

Contributors use their domain skills to create, review, and verify the examples that state-of-the-art models learn from — an excellent way to join the cutting edge of AI while working remotely and flexibly.

About AI training work

AI training (data labeling/annotation/human feedback) is the human side of building intelligent systems: people create, evaluate, and refine the examples models learn from. Projects range from writing prompts and responses to verifying complex programmatic solutions.

This role focuses on generative-AI and problem-design tasks: you’ll author computationally intensive text problems and validate solutions that train and evaluate models.

The role

We’re hiring an expert Machine Learning contributor to design and verify computational STEM/ML problems used to train and evaluate generative models and related pipelines.

This is a contract, project-based role (part-time); contributors are expected to deliver high-quality problem statements, reproducible Python solutions, and clear documentation.

Project type: Prompt + Response (SFT), Text generation, QA, Evaluation/Rating, and Computer programming/coding.
Data type: Text — you will author problem statements, solution descriptions, and code snippets.
Tooling: proprietary tooling provided by the project.

What you’ll do

Create original computational STEM and ML problems that reflect real scientific workflows and require non-trivial reasoning and programming to solve.

Validate every problem by implementing reproducible Python solutions and confirming outputs with standard data-science libraries.

Design problems that are computationally intensive (not solvable quickly by hand) and include clear success criteria.
Write and run reproducible Python code using NumPy, Pandas, SciPy, and scikit-learn (and optionally statsmodels) to verify answers.
Provide clear, well-structured problem statements, step-by-step solution notes, and any test inputs/outputs needed for evaluation.
Use SQL for any required dataset setup or verification (joins, aggregations, window functions).
Apply GenAI techniques (LLMs, RAG, prompt engineering, vector DBs) where appropriate to craft prompts and reference solutions.
Document workflows for reproducibility and potential deployment (packaging, monitoring basics).
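
As a rough illustration of what a "reproducible Python solution" might look like, the sketch below solves a small made-up problem (the bootstrap standard error of a sample median) with a fixed random seed and then confirms that a second run returns the same answer. The problem, the seed value, and the function names `solve` and `validate` are assumptions made for this example, not part of the project's actual tooling or task set.

```python
import numpy as np

SEED = 12345  # hypothetical fixed seed; any documented constant works


def solve(seed: int = SEED) -> float:
    """Reference solution: bootstrap standard error of the sample median."""
    rng = np.random.default_rng(seed)
    # Synthetic data so the problem is fully specified by the seed.
    sample = rng.normal(loc=10.0, scale=2.0, size=500)
    # 2000 bootstrap resamples, each the same size as the original sample.
    boot = rng.choice(sample, size=(2000, sample.size), replace=True)
    medians = np.median(boot, axis=1)
    return float(np.std(medians, ddof=1))


def validate(tol: float = 1e-12) -> bool:
    """Re-run the solution and check that the answer is reproduced."""
    first = solve()
    second = solve()
    return abs(first - second) <= tol


if __name__ == "__main__":
    print("answer:", solve())
    print("reproducible:", validate())
```

Seeding a local `np.random.default_rng` generator, rather than relying on global random state, keeps the result independent of whatever other code has run beforehand.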

Requirements

You must meet the technical and availability requirements below to be considered.

5+ years of hands-on machine learning experience with demonstrated business impact.
Expert Python for data science: NumPy, Pandas, SciPy, scikit-learn (statsmodels is a plus).
Strong statistical analysis skills and deep understanding of ML algorithms and trade-offs.
Proven ability to design original computational problems with clear solution paths.
Experience verifying/validating solutions with reproducible Python code and correct outputs.
Strong SQL skills (joins, aggregations, window functions) and experience manipulating data in relational databases.
Experience with GenAI approaches (LLMs, retrieval-augmented generation, prompt engineering, vector databases).
Familiarity with MLOps and deployment basics (packaging, reproducibility, monitoring).
Experience with at least one modern ML framework (TensorFlow or PyTorch; LangChain is a bonus).
Written English proficiency at C1+ level (comfortable producing clear documentation).
Availability to contribute approximately 10–20 hours per week during active project phases.
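
For a sense of the SQL level listed above, here is a minimal sketch of a window-function query run from Python through the standard-library `sqlite3` module. The `runs` table, its columns, and the sample rows are invented for illustration, and the example assumes the bundled SQLite is version 3.25 or later, which is when window functions were added.

```python
import sqlite3

# Hypothetical sample data: (model, run_id, score)
ROWS = [("a", 1, 0.71), ("a", 2, 0.84), ("b", 1, 0.65), ("b", 2, 0.59)]


def top_run_per_model(rows):
    """Return the highest-scoring run for each model using ROW_NUMBER()."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE runs (model TEXT, run_id INTEGER, score REAL)")
    conn.executemany("INSERT INTO runs VALUES (?, ?, ?)", rows)
    query = """
        SELECT model, run_id, score
        FROM (
            SELECT model, run_id, score,
                   ROW_NUMBER() OVER (
                       PARTITION BY model
                       ORDER BY score DESC, run_id
                   ) AS rn
            FROM runs
        )
        WHERE rn = 1
        ORDER BY model
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(top_run_per_model(ROWS))
```

Ordering by `run_id` as a tie-breaker inside the window keeps the result deterministic when two runs share a score.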

Who should apply

This role is aimed at senior ML practitioners who enjoy translating real-world problems into rigorous, verifiable computational tasks for model training and evaluation.

If you like writing precise problem statements, producing reproducible code, and shaping how generative models are trained, you’ll fit well.

Ideal for ML engineers, research scientists, or senior data scientists with practical production experience.
Good fit if you’re experienced with Python data-science tooling, SQL, and modern ML frameworks.

Compensation, location, and how it works

This is a contract, part-time engagement paid hourly. Compensation range: USD $15–$40 per hour, with the highest rates going to the most qualified contributors.

Project-based work — not a permanent position — with flexible scheduling during active phases. You’ll interact with project managers and use the project’s labeling/tooling environment.

Time commitment: ~10–20 hours/week during active project phases.
Location: Project restricts contributors to the United States — please confirm eligibility before applying.
Pre-labeled data: No — you will author problems and produce validated solutions from scratch.
To apply: be prepared to demonstrate past ML projects, share code examples, and complete a short screening task.