AI Safety LLM Trainer (Korean and C1+ English Required)
Remote contractor role evaluating AI-generated Korean and English text to improve model safety and policy compliance; $28–$38/hr, 20+ hours/week. Ideal for senior Trust & Safety professionals with LLM red‑teaming experience and near‑native Korean plus C1 English.
Generative AI & RLHF
Compensation: $28–$38/hr
Eligibility: Worldwide
Experience: Intermediate
Posted: Apr 3, 2026
About OpenTrain
OpenTrain is the #1 platform for finding and building careers in AI training and data labeling. We connect people to projects where they teach and shape AI systems: you can discover opportunities, build a profile, and apply quickly. Creating an OpenTrain account is free.
About AI training and this role
AI training (data labeling / human feedback) is the human side of building AI: people annotate, evaluate, and critique model outputs so systems behave safely and usefully. This role focuses on safety and policy evaluation for large language models (LLMs) across Korean and English content.
You will help major AI teams improve model safety by reviewing outputs, applying policy consistently across languages, and documenting clear rationales for moderation and mitigation recommendations.
The role
Title: AI Safety Data Reviewer / LLM Trainer (contract, part‑time). This is remote, hourly work with exposure to sensitive content. You will evaluate AI‑generated text for safety, policy compliance, factual accuracy, and reasoning quality across Korean and English.
- Employment type: Contractor, Part‑time
- Schedule: 20+ hours per week (flexible)
- Pay: USD $28–$38 per hour (typical rate shown: $32/hr)
- Data type: Text; label types include EVALUATION_RATING, QUESTION_ANSWERING, TEXT_GENERATION, and RLHF
What you'll do
You will review and label model outputs, rate multiple responses, assess alignment with safety policies, and write clear, reproducible rationales for moderation decisions. Work includes identifying methodological or conceptual errors and recommending mitigations based on adversarial findings.
- Evaluate AI‑generated text for safety, policy alignment, reasoning quality, and factual accuracy
- Rate and compare multiple model outputs and provide structured feedback for RLHF
- Spot edge cases, adversarial attacks, and cultural nuance across Korean and English
- Supervise or support content moderation decisions and document reproducible rationales
Requirements
You must meet the language, education, and experience qualifications below. The work involves handling explicit, toxic, violent, sexual, or psychologically disturbing content in a secure remote environment.
- Near‑native or native Korean proficiency in reading and writing
- Minimum C1 English proficiency in reading and writing
- Bachelor’s degree or higher in Communications, Linguistics, Psychology, Law/Policy, Security Studies, or equivalent professional experience
- Senior‑level experience in Trust & Safety, content moderation, policy operations, risk/compliance, investigations, or related safety functions
- Proven LLM red‑teaming or adversarial testing experience, including identifying edge cases and recommending mitigations
- Strong knowledge of safety domains: hate/harassment, sexual content, self‑harm, violence, bias, illegal goods/services, malicious activities/code, and misinformation
- Experience applying policy consistently across Korean and English, including cultural nuance, slang, and coded language
- Localization or translation experience preferred (preserving meaning, severity, and intent across languages)
- Strong analytical writing skills with clear, reproducible rationales
Who should apply
Apply if you are a senior Trust & Safety or policy professional with bilingual Korean/English skills and hands‑on LLM testing experience. This role fits people who want flexible, remote, impactful work improving how AI handles sensitive content.
- Experienced moderators, policy reviewers, risk/compliance analysts, or safety researchers with LLM red‑teaming experience
- Bilingual professionals comfortable judging nuance across Korean and English
- People who can write clear, reproducible moderation rationales and work independently
How the work is delivered
This is hourly, remote contractor work delivered through the project’s annotation platform. You will follow project policies and quality guidelines, submit ratings and written justifications, and may participate in calibration or training sessions.
- Hourly pay, paid in USD; typical displayed rate $32/hr within a $28–$38 range
- Work is remote and worldwide-eligible; you must be able to work securely and handle sensitive content
- Labeling tasks include evaluation ratings, question-answering checks, text generation reviews, and RLHF feedback