Python Backend Developer For AI Model Testing

Join a remote, part-time contract to build and test backend services while running focused AI model testing bursts. The commitment is under 20 hours/week, and pay ranges from $30 to $90/hr. Help shape next-generation coding assistants with hands-on feedback and incident reports.

Coding Software

100% Remote · Hourly

Compensation: $30–$90/hr
Eligibility: Worldwide
Experience: Intermediate
Posted: Jun 28, 2026

About OpenTrain

OpenTrain is the #1 platform for finding and building careers in AI training and data labeling. We help people start and grow freelance careers teaching AI by consolidating projects, tracking work history, and building a portfolio that contributors control.

We connect skilled contributors to real projects that train and evaluate modern AI systems, enabling flexible remote work that fits around life and other commitments.

About AI Training Work

AI training (data labeling, annotation, and human feedback) is the human side of building AI: people prepare, evaluate, and improve examples that models learn from. Work in this space ranges from annotating images and transcribing audio to rating model responses and testing developer tools.

This role focuses on developer-facing model evaluation: you will exercise code-focused AI models, report on their behavior, and help improve the developer experience for next-generation coding assistants.

The Role

We are recruiting an experienced Python backend developer to join a project that tests and improves AI coding models. Your work mixes hands-on backend development with structured model testing and reporting.

This is a remote, part-time contractor role that asks for deep backend knowledge plus the ability to run intensive, short testing bursts and communicate findings to a research team.

What You’ll Do

You will develop and maintain backend services while actively evaluating AI models in real coding workflows. The role balances implementation work with disciplined testing, reporting, and collaboration with researchers.

Design, develop, and optimize REST and GraphQL endpoints for scalable APIs.
Drive data validation, error handling, and security best practices in backend services.
Plan and execute database migrations, performance tuning, and schema changes.
Actively test new AI-powered models in Cursor and similar tools, performing 4-day intensive testing bursts.
Produce incident reports, bug traces, screenshots, and detailed post-burst surveys on model performance.
Engage in a dedicated Slack channel with the research team to discuss findings and propose improvements.

Requirements

Candidates must meet the role’s core technical and communication expectations. Every requirement below is essential to succeed in this project.

3+ years of professional experience as a backend developer with strong expertise in Python.
Proficiency in building and maintaining RESTful and GraphQL APIs.
Advanced knowledge of backend data validation, error handling, and API security.
Hands-on experience with database migrations, performance tuning, and schema design.
Extensive use of AI coding tools; familiarity with Cursor is highly desirable.
Outstanding written and verbal communication skills, including clear incident and bug reporting.
Ability to thrive in fast-paced, confidential, and collaborative remote environments.

Helpful Background

The following are not strictly required but will strengthen your application and impact on the project.

Visible open-source contributions (GitHub history, stars, or notable projects).
Experience designing or evaluating experimental tooling and developer workflows.
Demonstrated enthusiasm for AI advancements in software development.

Compensation & Logistics

This is a contract, part-time position with flexible hours and remote work allowed worldwide. Time commitment is under 20 hours per week and includes occasional concentrated 4-day testing periods.

Pay is hourly, with a listed range of $30–$90/hr (USD). The project language is English.

Employment type: Contractor, Part-time.
Time requirement: Less than 20 hours/week, including intermittent intensive testing bursts.
Data focus: Computer code / programming evaluation and rating tasks.

How To Apply

If this aligns with your skills and availability, apply through OpenTrain and include your resume, links to relevant GitHub or project work, and a brief note about your experience using AI coding tools (Cursor familiarity is a plus).

Shortlisted candidates will be contacted to discuss availability, confidentiality expectations, and next steps for the testing bursts and onboarding.