Hebrew-English LLM Safety Evaluator

Remote contractor role evaluating and red-teaming large language models in Hebrew and English (20+ hrs/week). Earn $26–$38/hr (typical $32/hr) reviewing, scoring, and documenting safety failures to improve model behavior for a global AI data services team.

Generative AI & RLHF

100% Remote Hourly · $26–$38/hr

Compensation: $26–$38/hr
Eligibility: Worldwide
Experience: Intermediate
Posted: Apr 3, 2026

About OpenTrain

OpenTrain is the #1 platform for finding and building careers in AI training and data labeling. We connect experienced and new contributors with projects that help shape how modern AI systems behave, and we support flexible, remote work opportunities across the industry.

About AI training and safety work

AI training (also called data labeling or human feedback work) is the human side of building AI: people curate examples, rate outputs, and test model behavior so systems learn to be accurate and safe. Safety and trust work focuses on detecting, documenting, and fixing behaviors that could harm users or violate policy.

This role places you on the front lines of that effort: your evaluations, adversarial examples, and policy-minded feedback will directly influence how large language models are trained and made safer.

The role

We’re hiring a Hebrew & English LLM Safety Evaluator to work as a remote, hourly contractor (part-time, 20+ hours/week). You will review AI-generated responses, create safety-focused evaluation data in both languages, and stress-test models to uncover policy gaps.

Expect hands-on red teaming, consistent application of written safety policies, and regular documentation of safety failures. Some tasks will require reviewing explicit, toxic, violent, sexual, or otherwise disturbing content.

Employment: Contractor, Part-time
Hours: 20+ hours/week
Pay: USD $26–$38/hr (posted hourly rate: $32/hr)
Data type: Text
Label types: Evaluation rating, Text generation, RLHF

What you’ll do

Day-to-day work focuses on evaluating model outputs, curating adversarial examples, and producing clear, reproducible feedback that informs model training and policies.

Review and score model responses in Hebrew and English for safety, accuracy, and policy compliance.
Curate, label, and document adversarial or safety-sensitive training examples.
Red-team models by probing boundaries and stress-testing for policy gaps and adversarial patterns.
Document safety failures with clear explanations and reproducible steps.
Collaborate with data teams to refine instructions, labels, and evaluation rubrics.

Requirements

Candidates must meet the language, education, and experience requirements below, except where an item is marked as preferred, and be able to perform work that may involve sensitive or explicit content.

Near-native or native Hebrew proficiency (reading and writing).
Minimum C1 English proficiency (reading and writing).
Bachelor’s degree or higher in Communications, Linguistics, Psychology, Law/Policy, Security Studies, or equivalent professional experience.
Proven experience in Trust & Safety, content moderation, policy enforcement, risk operations, investigations, or safety evaluation.
Hands-on LLM red-teaming experience (required), including probing safety boundaries and documenting adversarial patterns.
Strong knowledge of safety categories (hate, harassment, sexual content, self-harm, violence, bias, illegal goods/services, malicious activities/code, misinformation).
Ability to apply written safety policies consistently and explain decisions clearly in ambiguous cases.
Comfortable reviewing explicit, toxic, violent, sexual, or psychologically disturbing content.
Practical experience with AI tools such as Perplexity, Gemini, ChatGPT, or similar systems.
Prior experience with AI data training, annotation, or evaluation workflows is preferred.

Who should apply

This opportunity is a strong fit for intermediate-level professionals with trust & safety, moderation, or policy backgrounds who can work independently and communicate decisions clearly in both Hebrew and English.

You’ll do best if you enjoy analytical red-teaming work, are detail-oriented, and want to have a direct impact on improving the safety of large language models.

How it works

If selected, you’ll work remotely as a contractor on hourly pay. Assignments include evaluation tasks, red-team exercises, and documentation deliverables. Work is distributed via the project’s platform and may require following project-specific guidelines and rubrics.

OpenTrain supports people building careers in AI training and data labeling; creating an account is free and helps you manage applications, profile skills, and find similar projects.

Worldwide applicants accepted; projects are remote.
You will be paid hourly within the posted rate range; the exact rate may vary by project.
Tasks may involve sensitive content—applicants must be prepared for regular exposure.