AI Code QA Developer — Python, JS/TS, Rust, or Ruby
When applying, state which primary language you are applying for (Python, JavaScript/TypeScript, Rust, or Ruby). Contract role: review model-generated code, annotate results, and write evaluation prompts; 20+ hrs/week at $15/hr.
Coding & Software
Compensation: $15/hr
Eligibility: Worldwide
Experience: Intermediate
Posted: Aug 1, 2025
About OpenTrain
OpenTrain is the #1 platform for finding and building careers in AI training and data labeling. We connect people who want flexible, remote work with projects that teach and refine today’s AI systems — discover projects, build a profile, and apply in minutes.
About AI training and why it matters
AI training (also called data labeling or human feedback) is the human side of building intelligent systems: people annotate, review, and evaluate examples that models learn from. This work is accessible, often part-time and remote, and puts you on the front lines of how state-of-the-art models behave.
The role
You’ll be paired with one primary language—Python, JavaScript/TypeScript, Rust, or Ruby—and tasked with reviewing code generated by large language models, creating evaluation prompts, and providing structured, actionable feedback to improve model output quality.
- Commitment: 20+ hours per week (contract, part-time).
- Pay: USD 15 per hour.
- Open worldwide; work as a contractor.
What you’ll do
- Review and evaluate model-generated code for correctness, style, security issues, and testability.
- Annotate outputs with structured feedback and classifications that guide model improvements.
- Write and refine evaluation prompts and test cases to probe model behavior.
- Identify bugs, edge cases, and maintainability concerns, and explain them clearly in writing.
- Follow detailed guidelines and switch between tasks rapidly while maintaining quality.
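To make the review tasks above concrete, here is a minimal sketch of probing a piece of model-generated code with reviewer-written test cases. Everything in it is hypothetical: the `average` function stands in for model output, the `probe` helper and the `CASES` table are illustrative names, and the expectation that an empty list yields `0.0` is an assumed spec, not a project rule.

```python
# Hypothetical model-generated function under review (not from a real project).
def average(values):
    """Return the arithmetic mean of a list of numbers."""
    return sum(values) / len(values)  # reviewer note: no guard for an empty list


def probe(func, cases):
    """Run each (argument, expected) case and record a short finding per case."""
    findings = {}
    for name, (arg, expected) in cases.items():
        try:
            actual = func(arg)
        except Exception as exc:
            # Record the exception type so the written feedback can name it.
            findings[name] = f"raised {type(exc).__name__}"
        else:
            if actual == expected:
                findings[name] = "pass"
            else:
                findings[name] = f"expected {expected!r}, got {actual!r}"
    return findings


# Reviewer-written cases: one typical input, one boundary, one edge case.
CASES = {
    "typical": ([1, 2, 3], 2.0),
    "single": ([5], 5.0),
    "empty": ([], 0.0),  # assumed spec: empty input should yield 0.0
}

findings = probe(average, CASES)
for name, outcome in findings.items():
    print(f"{name}: {outcome}")
```

The findings from a run like this would feed the written explanation of the bug; the actual tooling and case format are defined by the project guidelines.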
Requirements — core criteria
- Begin your application by stating which primary language you are applying for: Python, JavaScript/TypeScript, Rust, or Ruby.
- Bachelor’s degree in Computer Science or Software Engineering (or equivalent experience).
- At least 3 years of professional development experience in the stated language.
- Demonstrated code-review experience and familiarity with CI/CD workflows (e.g., GitHub Actions, Jenkins).
- Advanced written English with the ability to produce clear, structured feedback.
- Comfortable working with detailed guidelines and rapid task-switching.
Language-specific must-haves
- Python: Fluency with pytest or unittest and automation in CI pipelines; experience critiquing AI-generated Python is a plus.
- JavaScript / TypeScript: Hands-on experience with Jest, Mocha, Cypress, or Playwright; proficiency in browser automation and bug tracking.
- Rust: Strong experience with cargo test, Clippy, rustfmt, and module/crate management; familiarity with property-based testing (quickcheck) is valued.
- Ruby: Experience with RSpec, Minitest, Capybara, and automation pipelines; prior AI/LLM-assisted QA in Ruby projects is a plus.
Bonus signals
- Participation in hackathons or competitive-coding events.
- Previous work on AI training, LLM fine-tuning, or code-evaluation projects.
How it works — day-to-day and tools
You will receive tasks through the project’s workflow and use provided interfaces or tooling to inspect code, run quick checks, and submit annotations. Deliverables are structured feedback records and labeled examples that help engineers and model teams iterate.
- Work remotely and submit annotations and feedback through the project platform or tooling.
- Expect clear guidelines, examples, and quality checks to align your judgments with project standards.
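As an illustration of what a structured feedback record might look like, the sketch below models one as a Python dataclass and serializes it to JSON. The field names, the `ALLOWED_VERDICTS` label set, and the `FeedbackRecord` class are all assumptions made for this example; the real schema is whatever the project platform defines.

```python
import json
from dataclasses import asdict, dataclass, field

# Hypothetical label set; a real project would define its own classifications.
ALLOWED_VERDICTS = {"accept", "revise", "reject"}


@dataclass
class FeedbackRecord:
    """One annotation for one piece of model-generated code (illustrative schema)."""
    task_id: str
    language: str
    verdict: str
    issues: list = field(default_factory=list)
    comment: str = ""

    def __post_init__(self):
        # Reject labels outside the agreed set so records stay consistent.
        if self.verdict not in ALLOWED_VERDICTS:
            raise ValueError(f"unknown verdict: {self.verdict!r}")


record = FeedbackRecord(
    task_id="task-001",
    language="python",
    verdict="revise",
    issues=["correctness: divides by zero on empty input"],
    comment="Guard against an empty list before dividing.",
)

# Serialize to JSON, a common interchange format for labeled examples.
payload = json.dumps(asdict(record), indent=2)
print(payload)
```

Validating the verdict at construction time is one way to keep annotations aligned with a fixed label set, which is the kind of consistency the quality checks mentioned above aim for.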
Application & interviewer checklist
When applying or interviewing, be prepared to confirm your language focus and walk through relevant code-review examples. The interviewer will verify core criteria and ask about your CI/testing experience and prior AI work where applicable.
- Confirm language focus: Python, JavaScript/TypeScript, Rust, or Ruby.
- Interviewer will check: degree or equivalent experience, at least 3 years of professional experience in the stated language, CI/testing familiarity, written English, and ability to follow complex guidelines.