AI Infrastructure Automation Engineer

Join a remote, contract role evaluating and improving LLM-driven infrastructure automation: create prompts, rate and refine runbooks, and test reliability under load for complex DevOps workflows. Part-time (20+ hrs/week), $15–$45/hr; applicants need 2+ years DevOps/infrastructure experience and Engl

Generative AI & RLHF

100% Remote · Hourly

Compensation: $15–$45/hr
Eligibility: Worldwide
Experience: Intermediate
Posted: Mar 29, 2026

About OpenTrain

OpenTrain is the #1 platform for people who build careers in AI training and data labeling. We connect contributors with hands-on projects that teach AI systems how to behave — discover roles, build a profile, and apply in minutes.

Why AI Training Work Matters

AI training (data labeling and human feedback) is the human side of building intelligent systems: people create prompts, rate outputs, and review examples that modern models learn from.

This role puts you on the cutting edge of how infrastructure and automation behavior are taught to AI — flexible, remote work that fits around other commitments while directly shaping real-world systems.

The Role

We are seeking an experienced engineer to help train and evaluate AI systems focused on infrastructure design and scalable automation for high-volume workflows.

You will generate and refine prompts about deployment and scaling strategies, define evaluation criteria for uptime/monitoring/fault tolerance, and review or improve AI-generated DevOps runbooks and troubleshooting steps.

Employment type: Contractor, Part-time (20+ hours/week)
Pay: Hourly, USD $15–$45 per hour
Data type: Text
Label type: Evaluation rating

What You’ll Do

Work directly with AI outputs that propose deployment, scaling, and failover strategies and judge their correctness and applicability to real-world environments.

Evaluate self-hosted automation platforms and assess solution reliability and performance under load, producing rubric-based ratings and feedback that guide model improvement.

Generate and refine prompts that exercise deployment and scaling scenarios
Define and apply evaluation rubrics for uptime, monitoring, and fault tolerance
Review, rate, and edit AI-written runbooks and troubleshooting instructions
Perform scenario-based assessments of reliability and performance for self-hosted systems

Requirements

You must meet the specific experience and documentation requirements below — we will verify details from your CV.

This role expects hands-on annotation/evaluation experience and domain expertise in infrastructure and DevOps.

2+ years of DevOps, infrastructure, or backend systems experience
Proven skill deploying and evaluating self-hosted environments
Experience with monitoring, uptime, and fault tolerance practices
Hands-on text annotation, evaluation, or rubric-based QA experience (emphasized repeatedly in the project requirements)
Experience evaluating LLM outputs for technical reasoning quality in infrastructure and DevOps contexts
Demonstrated English proficiency: B2 or higher
CV must be in English, state your English level, and include your email address and phone number

Who Should Apply

This project is best for intermediate engineers who combine DevOps experience with prior annotation or rubric-driven QA work and an interest in AI-guided automation.

If you enjoy translating operational knowledge into clear evaluation criteria and improving model-generated runbooks, you’ll be a strong fit.

Location & Eligibility

This is a remote role open to many countries, but the location restrictions listed below apply. Please confirm you are not in a restricted location before applying.

Restricted locations (applicants from these places cannot be accepted): Iran, Cuba, North Korea, Syria, Sudan, Venezuela, Myanmar, Russia, Belarus, Palestine
Also restricted: Switzerland; China, Taiwan; Kenya; and the following U.S. states: Alaska, Arkansas, California, Connecticut, Delaware, Georgia, Hawaii, Illinois, Indiana, Kansas, Louisiana, Maine, Maryland, Massachusetts, Nebraska, Nevada, New Hampshire, New Jersey, New Mexico, Ohio, Oregon, Tennes
Also excluded: Antarctica, Aruba, Åland Islands, Saint Barthélemy, Bonaire/Sint Eustatius and Saba, Bouvet Island, Cocos (Keeling) Islands, Democratic Republic of the Congo, Cook Islands, Christmas Island, Western Sahara, Falkland Islands (Malvinas), French Guiana, Guadeloupe, South Georgia and the

How To Apply

To apply, submit a CV in English that states your English proficiency level and includes an email address and phone number. Include clear notes on your DevOps/self-hosted deployment experience and any annotation or rubric-based QA work.

Applications will be evaluated for domain fit and annotation experience; selected candidates may receive small qualification tasks to verify rating consistency before project work.

Prepare a CV in English including your contact email, phone number, and English level (B2+ required)
Be ready to complete qualification checks that assess your annotation and evaluation skills