Bilingual AI Safety Data Evaluator (English/Spanish)
Remote hourly contractor role evaluating AI safety and reasoning in English and Spanish; rate $14–$24/hr (typical $20/hr). Use your Trust & Safety, moderation, or red-teaming experience to label, review, and improve LLM safety across multilingual content.
Generative AI & RLHF
Compensation: $14–$24/hr
Eligibility: Open worldwide
Experience: Intermediate
Posted: Apr 3, 2026
About OpenTrain
OpenTrain is the #1 platform for finding and building careers in AI training and data labeling. We connect experienced contributors with projects that shape how modern AI systems behave — from moderation and safety to RLHF and multilingual evaluation.
Working via OpenTrain gives you direct access to paid, remote projects where your annotations and feedback improve real AI models. Creating an OpenTrain account is free.
About AI Training and Safety Work
AI training (also called data labeling or human feedback work) is the human layer behind modern machine learning. Teams rely on expert reviewers to judge model outputs for safety, accuracy, and fairness — especially in multilingual and cross-cultural contexts.
This role sits at the intersection of content policy, moderation, and adversarial testing: you will evaluate model responses, identify risky or subtle failure modes, and provide clear, reproducible rationales that inform model improvements.
The Role — What This Job Is
We are hiring a bilingual (Spanish and English) AI Safety Data Evaluator on a remote, hourly-paid contractor basis. You will review AI-generated text, rate outputs for safety and reasoning, perform red-teaming to surface edge cases, and apply nuanced policy judgments across both languages.
Label types include evaluation ratings, RLHF-style feedback, and text-generation review. This work may include exposure to explicit, violent, or otherwise disturbing content; emotional resilience and consistent judgment are essential.
- Employment type: Contractor (remote, worldwide applicants welcome)
- Data type: Text
- Label types: EVALUATION_RATING, RLHF, TEXT_GENERATION
- Pay: Hourly, $14–$24 USD per hour (typical $20/hr)
Key Responsibilities
- Evaluate AI-generated outputs in English and Spanish for safety, factuality, logic, and clarity.
- Apply policy guidelines to label and quality-check safety data across multiple domains (hate, harassment, sexual content, self-harm, violence, illegal activity, misinformation, etc.).
- Perform adversarial testing / red-teaming to discover edge cases and recommend mitigations.
- Write clear, reproducible rationales for each moderation or safety decision to guide model improvements.
- Quality-check annotations, spot inconsistencies, and escalate ambiguous or high-risk content appropriately.
- Preserve meaning, severity, and intent when assessing content across Spanish and English (localization-aware judgment).
Minimum Requirements
- Near-native or native Spanish proficiency in reading and writing.
- Minimum C1 English proficiency in reading and writing.
- Bachelor’s degree or higher in a relevant field (Communications, Linguistics, Psychology, Law/Policy, Security Studies) or equivalent professional experience.
- 5+ years of professional experience in Trust & Safety, content moderation, policy operations, risk/compliance, investigations, or related safety work.
- Proven LLM red-teaming or adversarial testing experience, including identifying edge cases and recommending mitigations.
- Strong working knowledge of safety domains: hate/harassment, sexual content, self-harm, violence, bias, illegal goods/services, malicious activities/code, and misinformation.
- Experience applying policy guidelines consistently across multilingual or cross-cultural content, especially Spanish and English.
- Strong analytical writing skills and the ability to provide clear, reproducible rationales for moderation decisions.
- Comfortable reviewing explicit, toxic, violent, sexual, or psychologically disturbing content as part of daily work.
Preferred But Not Required
- Localization or translation experience, especially work that preserves nuance, intent, and severity across languages.
Who Should Apply
Apply if you have substantial Trust & Safety or moderation experience, are bilingual Spanish/English at an advanced level, and have hands-on experience testing or evaluating LLM outputs. This role is a strong fit for policy operators, content moderators, safety analysts, and localization specialists who want to directly shape how AI handles sensitive multilingual content.
Compensation, Logistics, and Next Steps
This is a remote contractor role open to worldwide applicants. Compensation is hourly in USD with a range of $14–$24/hr and a typical rate around $20/hr. Projects usually set schedules and expected throughput; hours and exact project length will vary by engagement.
To apply, create a free OpenTrain account, complete your profile highlighting bilingual moderation and red-teaming experience, and submit examples or notes about relevant Trust & Safety work. The OpenTrain platform connects you directly to projects where your evaluations impact real AI models.