Machine Learning Expert (Python, GenAI, SQL)
OpenTrain AI · Remote · Worldwide · Posted Apr 5, 2026
**Job Name:** Machine Learning Expert (Python, GenAI, SQL)
**Dataset Description (5–8 words):** Computational STEM problem creation and verification
**Data Type (select one):** Text
**Subject Matter/Industry (5–8 words):** Machine learning, statistics, computational STEM problems
**Pre-labeled Data (Yes/No):** No
**Labeling Software:** Other
**Label Types (select 1+):**
* Prompt + Response Writing (SFT)
* Text Generation
* Question Answering
* Evaluation/Rating
* Computer Programming/Coding
---
## Labeling Overview
**Qualifications / requirements:**
We’re looking for experienced machine learning specialists with expert Python skills who can design and verify computational STEM/ML problems. Ideal contributors have 5+ years of hands-on ML experience, solid foundations in statistics and ML, and fluent written English (C1+). You should be comfortable writing Python-based solutions and validating results with standard data science libraries.
**What you’ll do:**
You will create original computational STEM/ML problems that reflect real scientific workflows, including problems that require Python programming and non-trivial reasoning to solve. You’ll ensure tasks are computationally intensive (not solvable manually in a reasonable timeframe), verify solutions with Python (e.g., NumPy, Pandas, SciPy, scikit-learn), and clearly document problem statements with correct, validated answers.
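
As a purely illustrative sketch (not an official task template), a verified problem might pair a reference solution with an independent cross-check. The example below assumes a hypothetical synthetic regression task; the function names, seed, and dataset sizes are invented for illustration. It computes least-squares coefficients with NumPy and confirms them by solving the normal equations separately.

```python
import numpy as np

def make_problem(seed: int = 0, n: int = 500, p: int = 5):
    """Build a small synthetic regression dataset (hypothetical task data)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))
    true_coef = np.arange(1, p + 1, dtype=float)  # coefficients 1.0 .. 5.0
    y = X @ true_coef + rng.normal(scale=0.1, size=n)
    return X, y

def solve_lstsq(X, y):
    """Reference answer: least-squares coefficients via np.linalg.lstsq."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def solve_normal_equations(X, y):
    """Independent check: solve (X^T X) b = X^T y directly."""
    return np.linalg.solve(X.T @ X, X.T @ y)

X, y = make_problem()
answer = solve_lstsq(X, y)
check = solve_normal_equations(X, y)

# The answer counts as verified only if both methods agree numerically.
verified = bool(np.allclose(answer, check, atol=1e-8))
print("coefficients:", np.round(answer, 3))
print("verified:", verified)
```

Fixing the random seed keeps the dataset reproducible, and agreement between two independent solution paths gives some evidence that the documented answer is correct.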
---
**Required Locations:** Global - Any Location
**Required English Level:** Fluent
---
## Other Qualifications & Requirements (for screening)
* 5+ years of hands-on machine learning experience with demonstrated business impact
* Expert Python for data science (NumPy, Pandas, SciPy, scikit-learn; bonus: statsmodels)
* Strong ability to design original computational problems (STEM/ML) with clear solution paths
* Proven experience verifying/validating solutions using Python (reproducible code + correct outputs)
* Expert statistical analysis and strong understanding of ML algorithms and practical trade-offs
* Strong SQL skills (joins, aggregations, window functions) and experience manipulating data in relational databases
* Experience with GenAI tools/approaches (LLMs, RAG, prompt engineering, vector databases)
* Familiarity with MLOps and deployment workflows (e.g., packaging, reproducibility, monitoring basics)
* Experience with at least one modern framework (TensorFlow, PyTorch; bonus: LangChain)
* Written English proficiency at C1+ level (or equivalent), comfortable writing clear documentation
* Availability to contribute ~10–20 hours/week during active project phases (project-based; not permanent)