AI Red Teaming and Prompt-Injection Security Expert

Work remotely as an expert in LLM red teaming and prompt-injection security on a 20+ hr/week contract paying $50–$90/hr. Lead adversarial testing, build regression suites, and help improve model safety for next-generation AI systems.

Generative AI · RLHF

100% Remote · Hourly

Compensation: $50–$90/hr
Eligibility: Worldwide
Experience: Expert
Posted: Jun 30, 2026

About OpenTrain

OpenTrain is the #1 platform for people who build careers in AI training and data labeling. We help AI specialists find and manage projects, consolidate work history and portfolios, and grow durable freelance careers in the human side of AI development.

We connect skilled contributors with meaningful safety and evaluation work across the industry while supporting fair pay, transparent project terms, and a collaborative community of specialists.

About AI training and red teaming

AI training (data labeling, annotation, and human-feedback work) is how modern models learn from real human examples and judgments. Red teaming and adversarial evaluation are essential specialties: contributors craft adversarial prompts, jailbreaks, and attack scenarios that reveal weaknesses and guide safer model behavior.

This work is remote-friendly and flexible, and your tests and reports directly influence model robustness and product risk mitigation in real-world deployments.

The role

OpenTrain is recruiting AI Red Teaming and Prompt-Injection Security Experts for a remote, part-time contract. The engagement focuses on improving AI safety and robustness through rigorous adversarial evaluation and test-suite development.

Commitment: 20+ hours per week. Employment type: part-time contractor. Pay: $50–$90 USD per hour. Work type: text-focused evaluation and red teaming, with evaluation ratings and red-teaming labels. Tooling: custom labeling software may be used. Language required: English. Location: open worldwide.

Work remotely, 20+ hours per week as a contractor.
Pay range: $50–$90 USD per hour (final rate depends on experience and scope).
Primary data: text-based evaluation and adversarial prompt testing.

What you'll do day-to-day

You will design, implement, and maintain adversarial evaluation programs that find and reproduce jailbreaks, prompt injections, and tool-abuse patterns. Deliverables include test suites, evaluation frameworks, and clear written reports that translate findings into actionable safety improvements.

Design and implement methodologies for LLM red teaming, prompt injection, ethical jailbreaks, and tool-use abuse scenarios.
Create cross-domain elicitation strategies to uncover multi-turn and complex adversarial bypasses.
Develop and maintain regression test suites to track jailbreak susceptibility and prompt-injection vulnerabilities.
Build evaluation frameworks to stress-test models against real-world adversarial threats.
Collaborate with technical stakeholders to translate findings into mitigations and product improvements.
Document methodologies, results, and recommendations in clear reports for technical and non-technical audiences.

Requirements

To be considered, you must have demonstrable expertise in adversarial machine learning, LLM red teaming, AI safety evaluation, or a closely related security domain, plus proven practical experience uncovering vulnerabilities such as ethical jailbreaks or prompt injection.

Expert-level experience in adversarial ML, LLM red teaming, or AI safety evaluation (required).
Proven track record researching, testing, or uncovering vulnerabilities related to jailbreaks, prompt injection, tool-use abuse, or adversarial attacks.
Strong written and verbal communication, with the ability to produce clear documentation and collaborate across teams.
Comfort working with text data, custom labeling tools, and structured test/regression suites.

Helpful background (preferred)

The following are advantageous but not strictly required: advanced academic credentials in CS, ML, or security; recognized contributions to the AI security community; or prior cross-disciplinary AI-safety project experience.

MS/PhD in computer science, cybersecurity, machine learning, or equivalent operational experience.
Published research, open-source tools, or conference talks in adversarial ML or prompt security.
Familiarity with contemporary LLM architectures, prompt engineering, and security assessment tools.
Experience contributing to multi-disciplinary or cross-functional AI safety initiatives.

How it works and how to apply

Apply through OpenTrain by creating a profile and submitting your experience, samples, or links that demonstrate your red teaming work. Candidates may be asked to complete a screening task or provide past reports that verify expertise.

If selected, you will be engaged as a contractor on a part-time schedule. Compensation is hourly within the stated range and tied to experience and deliverables.

Create an OpenTrain profile and list relevant red teaming or security work samples.
Be prepared for a short technical screening and to commit to 20+ hours/week.
Compensation will be negotiated within the $50–$90/hr range based on experience and scope.