Hebrew-English LLM Safety Evaluator
Remote contractor role evaluating and red-teaming large language models in Hebrew and English (20+ hrs/week). Earn $26–$38/hr (typical $32/hr) reviewing, scoring, and documenting safety failures to improve model behavior for a global AI data services team.
Generative AI & RLHF
Compensation: $26–$38/hr
Eligibility: Worldwide
Experience: Intermediate
Posted: Apr 3, 2026
About OpenTrain
OpenTrain is the #1 platform for finding and building careers in AI training and data labeling. We connect experienced and new contributors with projects that help shape how modern AI systems behave, and we support flexible, remote work opportunities across the industry.
About AI training and safety work
AI training (also called data labeling or human feedback work) is the human side of building AI: people curate examples, rate outputs, and test model behavior so systems learn to be accurate and safe. Safety and trust work focuses on detecting, documenting, and fixing behaviors that could harm users or violate policy.
This role places you on the front lines of that effort: your evaluations, adversarial examples, and policy-minded feedback will directly influence how large language models are trained and made safer.
The role
We’re hiring a Hebrew & English LLM Safety Evaluator to work as a remote, hourly contractor (part-time, 20+ hours/week). You will review AI-generated responses, create safety-focused evaluation data in both languages, and stress-test models to uncover policy gaps.
Expect hands-on red teaming, consistent application of written safety policies, and regular documentation of safety failures. Some tasks will require reviewing explicit, toxic, violent, sexual, or otherwise disturbing content.
- Employment: Contractor, Part-time
- Hours: 20+ hours/week
- Pay: USD $26–$38/hr (posted hourly rate: $32/hr)
- Data type: Text; Label types: Evaluation rating, Text generation, RLHF
What you’ll do
Day-to-day work focuses on evaluating model outputs, curating adversarial examples, and producing clear, reproducible feedback that informs model training and policies.
- Review and score model responses in Hebrew and English for safety, accuracy, and policy compliance.
- Curate, label, and document adversarial or safety-sensitive training examples.
- Red-team models by probing boundaries and stress-testing for policy gaps and adversarial patterns.
- Document safety failures with clear explanations and reproducible steps.
- Collaborate with data teams to refine instructions, labels, and evaluation rubrics.
Requirements
Candidates must meet all listed language, education, and experience requirements and be able to perform work that may involve sensitive or explicit content.
- Near-native or native Hebrew proficiency (reading and writing).
- Minimum C1 English proficiency (reading and writing).
- Bachelor’s degree or higher in Communications, Linguistics, Psychology, Law/Policy, Security Studies, or equivalent professional experience.
- Proven experience in Trust & Safety, content moderation, policy enforcement, risk operations, investigations, or safety evaluation.
- Hands-on LLM red-teaming experience (required), including probing safety boundaries and documenting adversarial patterns.
- Strong knowledge of safety categories (hate, harassment, sexual content, self-harm, violence, bias, illegal goods/services, malicious activities/code, misinformation).
- Ability to apply written safety policies consistently and explain decisions clearly in ambiguous cases.
- Comfortable reviewing explicit, toxic, violent, sexual, or psychologically disturbing content.
- Practical experience with AI tools such as Perplexity, Gemini, ChatGPT, or similar systems.
- Prior experience with AI data training, annotation, or evaluation workflows is preferred.
Who should apply
This opportunity is a strong fit for intermediate-level professionals with trust & safety, moderation, or policy backgrounds who can work independently and communicate decisions clearly in both Hebrew and English.
You’ll do best if you enjoy analytical red-teaming work, are detail-oriented, and want to have a direct impact on improving the safety of large language models.
How it works
If selected, you’ll work remotely as an hourly contractor. Assignments include evaluation tasks, red-team exercises, and documentation deliverables. Work is distributed via the project's platform and may require following project-specific guidelines and rubrics.
OpenTrain supports people building careers in AI training and data labeling. Creating an account is free and lets you manage applications, list your skills on your profile, and find similar projects.
- Worldwide applicants accepted; projects are remote.
- You will be paid hourly within the posted rate range; the exact rate may vary by project.
- Tasks may involve sensitive content; applicants must be prepared for regular exposure.