SRE / Incident Ops Evaluator for AI Agent Training

Use your SRE or incident-ops experience to label B2B SaaS incident scenarios for AI-agent training — remote, contract, $40/hr, ~20 hrs/week for ~2.5 months. Strong English writing and consistent rubric application required.

Generative AI & RLHF

Remote · Hourly · $40/hr

Compensation: $40/hr
Eligibility: 5 countries
Experience: Intermediate
Posted: May 18, 2026

Open to applicants in: United States, Canada, United Kingdom, Ireland, Australia

About OpenTrain

OpenTrain is the #1 platform for building careers in AI training and data labeling. Contributors find projects, build profiles, and apply quickly to work that helps shape how modern AI behaves.

We focus on real human work that trains and evaluates AI systems. This opportunity connects experienced SRE and incident-ops professionals with a remote, flexible contract project where your judgment directly improves AI agents used in enterprise environments.

About AI training work

AI training (data labeling or human feedback work) is the human side of building smarter models: people create, evaluate, and label the examples AI learns from. Projects can involve reading logs, evaluating post-incident artifacts, rating responses, and applying structured rubrics to ambiguous scenarios.

These roles are often remote and flexible, accessible to practitioners with domain experience rather than deep coding skills, and give you a direct say in how AI systems learn to act in production situations.

The role

We are hiring experienced SRE and incident operations professionals to review structured B2B SaaS workplace scenarios and label whether specific information is appropriate for an AI agent to use and who should see it.

This is a remote, hourly contract role: $40 USD per hour, roughly 20 hours per week, for an estimated 2.5 months. The schedule is flexible, but you must meet agreed turnaround targets and apply rubrics consistently under time pressure.

Hourly pay: $40 USD per hour
Estimated workload: ~20 hours/week
Estimated duration: approximately 2.5 months
Employment type: Contractor, Part-time
Eligible countries: US, CA, GB, IE, AU; English required

What you'll do

Review structured workplace incident scenarios containing logs, system symptoms, service ownership, on-call decisions, and post-incident artifacts. For each item you will mark whether it is appropriate for an AI agent to use and whether it should be visible to different actors in the scenario.

Apply a provided rubric to produce labels (yes / no / undecidable) and write a short, defensible justification for each judgment. Work involves careful reasoning, consistent application of least-privilege principles, and clear written communication.

Read scenario context (logs, alerts, runbooks, postmortems) and judge information appropriateness
Label items with classification and evaluation ratings, and answer short QA-style prompts
Provide concise justifications for yes/no/undecidable labels
Apply least-privilege judgment about visibility to roles in the scenario
Follow structured rubrics and maintain consistency across ambiguous, time-pressured items

Requirements

Do not apply unless you meet the core experience and skill requirements below. You do not need to be an expert coder, but you must be comfortable reading basic technical context and making defensible judgments.

We will verify that candidates can write clearly in English and reason through incident scenarios using structured criteria.

Education: Bachelor's degree or equivalent on-the-job SRE/ops experience
Experience: 1+ year in SRE, production operations, incident management, or technical ops in a B2B SaaS environment
Domain expertise: incident response, escalation paths, postmortems, runbooks, access provisioning under time pressure
Tools/skills: familiarity with paging/alerting systems, runbooks, observability/logs; comfort reading logs and system symptoms
Communication: strong English writing and clear, concise reasoning
Role skills: consistent rubric application, least-privilege judgments, defensible decisions in ambiguous contexts

Who should apply

This role suits experienced on-call engineers, SREs, production support staff, incident commanders, and technical ops professionals who enjoy applied judgment work and clear writing.

This is ideal if you want flexible, remote contract work that uses your operational knowledge to teach AI systems how to make safer, more practical decisions in real enterprise situations.

You prefer contract, part-time remote work with flexible hours
You can produce consistent labels and short justifications under time constraints
You want to help shape AI behavior in B2B operational contexts

How it works

If selected, you will receive a training rubric and example tasks, then complete labeling work on the project's labeling platform. Tasks are text-based and include classification, evaluation ratings, and short question-answer prompts.

Create an OpenTrain profile and submit your application. Successful applicants will complete a brief qualification test to demonstrate rubric understanding and written reasoning before starting paid work.

Data type: text-only scenarios
Label types: classification, evaluation_rating, question_answering
Platform: project-specific
Application steps: Create OpenTrain profile, apply, complete qualification test, start contract work