Senior Java Code Reviewer — Remote Contract
Audit annotator evaluations of AI-generated Java code: compile and run snippets in sandboxes, verify correctness, security, and performance, correct ratings, and give concise feedback. Requires 7+ years of Java experience, familiarity with JUnit and Testcontainers, and availability of 20+ hours per week. Pays $25/hr; fully remote.
Coding & Software
Compensation: $25/hr
Eligibility: Worldwide
Experience: Intermediate
Posted: Jul 8, 2025
About OpenTrain
OpenTrain is the #1 platform for finding and building careers in AI training and data labeling. We connect skilled contributors with projects that prepare real-world examples and feedback to teach next-generation AI systems.
Working with OpenTrain means joining a fast-growing industry where your technical judgment directly influences how AI models behave. Many contributors work remotely, part-time, and build long-term skills in QA, annotation, and evaluation.
About AI training and code review work
AI training (data labeling/annotation) is the human side of building modern models: people create, verify, and rate examples that models learn from. For code-focused projects, experienced engineers validate generated code for correctness, security, style, and performance.
This role sits at the intersection of software engineering and AI evaluation: your hands-on Java expertise ensures training data is accurate and trustworthy, which directly improves model outputs.
The role
You will audit annotator evaluations of AI-generated Java code. For each task, you will compile and run code in an isolated environment, confirm it satisfies the prompt, verify functionality, check for secure-coding issues and performance concerns, and ensure the annotator’s score aligns with our rubric.
This is a remote, part-time contractor role requiring 20+ hours per week, paid hourly at USD 25. The work is open worldwide and asynchronous, but strong written English is required for giving concise feedback within a structured QA workflow.
- Job type: Contractor, Part-time (20+ hours/week)
- Pay: USD 25 per hour
- Work location: Remote, worldwide
- Data type: AI-generated computer code (Java)
What you'll do
Your day-to-day work focuses on verifying annotator judgments and improving label quality. You will run small Java programs safely, reproduce issues, and document findings concisely using our ticketing and QA tools.
- Compile and run Java snippets in sandboxed/container environments to validate correctness and prompt compliance
- Verify functional behavior, and run unit and integration tests where provided
- Assess security risks (OWASP issues, deserialization, injection, race conditions) and flag critical vulnerabilities
- Evaluate JVM performance characteristics and note regressions or tuning issues
- Correct mis-ratings and apply rubric-based scores consistently
- Provide concise, constructive feedback to annotators and record findings in the QA system
Requirements
You must meet every substantive technical requirement listed below. We rely on accurate, reproducible reviews, so practical experience compiling and running Java code in containers and following a structured QA rubric is essential.
- Experience: 7+ years in professional Java development, QA, or dedicated code-review roles
- Java expertise: Deep knowledge of modern Java (11–21), core APIs, and JVM internals
- Performance: Experience with GC, JIT behavior, profiling, and JVM performance tuning
- Testing & debugging: Advanced use of JUnit 5, TestNG, Testcontainers, Mockito, and coverage tools (e.g., JaCoCo)
- Concurrency: Proven ability to analyze and debug thread-safety issues, locking, CompletableFuture pipelines, and reactive streams (Project Reactor, RxJava)
- Secure coding: Skilled at spotting OWASP Top 10 issues, deserialization attacks, injection, race conditions, and permission flaws
- Build & toolchain: Proficient with Maven/Gradle, Dockerized builds, CI/CD (GitHub Actions, GitLab CI), and code-review platforms
- Proof-of-work validation: Comfortable compiling and running code in sandbox/container environments to verify functionality
- Structured QA: Familiar with rubric-based scoring, checklist-driven reviews, and ticketing/annotation tools (Jira, Asana)
- Communication: Excellent written English (B2+ CEFR) for concise feedback and mentoring
Nice to have
The following is optional but helpful, and may make you a stronger candidate for AI-focused evaluation projects.
- Experience with LLM evaluation, RLHF pipelines, or prior AI/ML data-labeling projects
Who should apply
Apply if you enjoy hands-on code validation, can reproduce and explain failures clearly in writing, and want to help shape how AI systems learn to write and reason about code. This role suits senior engineers, QA leads, and experienced code reviewers who prefer flexible, remote work.
- You like concise written communication and mentoring annotators through targeted feedback
- You have a reliable remote work setup and can run containerized builds locally or in sandboxed environments
- You prefer part-time, contract work and can commit to 20+ hours weekly
How it works
After onboarding, you'll receive tasks through our annotation and QA platform with full rubrics and checklists. Each assignment includes the AI-generated code, the prompt, the annotator’s evaluation, and any test cases. You will reproduce results, correct scores when necessary, and submit concise review notes.
We use standard ticketing and collaboration tools; familiarity with GitHub pull request and GitLab merge request workflows and with Jira/Asana is expected. Reviews must be precise and aligned with the provided quality rubric. Consistency is critical because your evaluations train models.
- Onboarding covers the rubric, sandboxed execution policies, and the submission format
- Tasks are assigned via the platform; you work asynchronously and submit reviews according to the checklist
- Keep feedback concise, actionable, and linked to concrete reproductions or test failures