AI Safety LLM Trainer (Korean and C1+ English Required)

Remote contractor role evaluating AI-generated Korean and English text to improve model safety and policy compliance; $28–$38/hr, 20+ hours/week. Ideal for senior Trust & Safety professionals with LLM red‑teaming experience and near‑native Korean plus C1 English.

Generative AI & RLHF

100% Remote · Hourly
Compensation: $28–$38/hr
Eligibility: Worldwide
Experience: Intermediate
Posted: Apr 3, 2026

About OpenTrain

OpenTrain is the #1 platform for finding and building careers in AI training and data labeling. We connect people to projects where they teach and shape AI systems—discover opportunities, build a profile, and apply quickly. Creating an OpenTrain account is free.

About AI training and this role

AI training (data labeling / human feedback) is the human side of building AI: people annotate, evaluate, and critique model outputs so systems behave safely and usefully. This role focuses on safety and policy evaluation for large language models (LLMs) across Korean and English content.

You will help major AI teams improve model safety by reviewing outputs, applying policy consistently across languages, and documenting clear rationales for moderation and mitigation recommendations.

The role

Title: AI Safety Data Reviewer / LLM Trainer (contract, part‑time). This is remote, hourly work with exposure to sensitive content. You will evaluate AI‑generated text for safety, policy compliance, factual accuracy, and reasoning quality across Korean and English.

Employment type: Contractor, Part‑time
Schedule: 20+ hours per week (flexible)
Pay: USD $28–$38 per hour (typical rate shown: $32/hr)
Data type: Text; label types include EVALUATION_RATING, QUESTION_ANSWERING, TEXT_GENERATION, and RLHF

What you'll do

You will review and label model outputs, rate multiple responses, assess alignment with safety policies, and write clear, reproducible rationales for moderation decisions. Work includes identifying methodological or conceptual errors and recommending mitigations based on adversarial findings.

Evaluate AI‑generated text for safety, policy alignment, reasoning quality, and factual accuracy
Rate and compare multiple model outputs and provide structured feedback for RLHF
Spot edge cases, adversarial attacks, and cultural nuance across Korean and English
Supervise or support content moderation decisions and document reproducible rationales

Requirements

You must meet the language, education, and experience qualifications below. The work involves handling explicit, toxic, violent, sexual, or psychologically disturbing content in a secure remote environment.

Near‑native or native Korean proficiency in reading and writing
Minimum C1 English proficiency in reading and writing
Bachelor’s degree or higher in Communications, Linguistics, Psychology, Law/Policy, Security Studies, or equivalent professional experience
Senior‑level experience in Trust & Safety, content moderation, policy operations, risk/compliance, investigations, or related safety functions
Proven LLM red‑teaming or adversarial testing experience, including identifying edge cases and recommending mitigations
Strong knowledge of safety domains: hate/harassment, sexual content, self‑harm, violence, bias, illegal goods/services, malicious activities/code, and misinformation
Experience applying policy consistently across Korean and English, including cultural nuance, slang, and coded language
Localization or translation experience preferred (preserving meaning, severity, and intent across languages)
Strong analytical writing skills with clear, reproducible rationales

Who should apply

Apply if you are a senior Trust & Safety or policy professional with bilingual Korean/English skills and hands‑on LLM testing experience. This role fits people who want flexible, remote, impactful work improving how AI handles sensitive content.

Experienced moderators, policy reviewers, risk/compliance analysts, or safety researchers with LLM red‑teaming experience
Bilingual professionals comfortable judging nuance across Korean and English
People who can write clear, reproducible moderation rationales and work independently

How the work is delivered

This is hourly, remote contractor work delivered through the project’s annotation platform. You will follow project policies and quality guidelines, submit ratings and written justifications, and may participate in calibration or training sessions.

Hourly pay, paid in USD; typical displayed rate $32/hr within a $28–$38 range
Work is remote and worldwide-eligible; you must be able to work securely and handle sensitive content
Labeling tasks include evaluation ratings, question answering checks, text generation reviews, and RLHF feedback