Video Moment Annotation For Food Clips

Annotate cooking and food videos by marking precise start/end times for actions, classifying action types and objects, and writing visually grounded descriptions for CLIP training. This is remote contract work, open worldwide, done in Label Studio and paid at $0.05 per labeled moment.

Category: Image & Video Annotation
Location: 100% Remote
Compensation: Per task, $0.05/label
Eligibility: Open worldwide
Experience: Entry
Posted: Nov 19, 2025


About OpenTrain AI

OpenTrain AI is the hiring and contracting organization for this role and the #1 platform for building careers in AI training and data labeling. Creating an OpenTrain account is free and gives you access to training projects, skill-building, and new contract opportunities in the rapidly growing AI training industry.

Why This Work Matters

Modern AI models learn from human-created examples. Your annotations help teach models to find and describe moments in cooking and food videos—improving search, recommendation, and multimodal image-text systems.

This work is flexible and remote, ideal for contributors who want part-time or contract work that directly shapes how AI understands everyday visual activities in the kitchen.

The Role

You will identify and mark temporal video segments that match natural language queries within food and cooking videos, add domain metadata, and write clear visual proxy descriptions for CLIP-style training.

Employment type: Contractor (OpenTrain AI contracts directly).
Data type: Video.
Labeling software: Label Studio.
Label types: Action recognition and classification.
Pay model: Pay-per-label at $0.05 per labeled moment.
Worldwide applicants accepted.

What You’ll Do Day to Day

Follow the project query, watch video clips, and mark every segment that matches the query using Label Studio. For each segment you label, set precise start and end times, choose the correct action type and objects, and write a concise visual proxy description.

Read and understand the natural language query before watching the clip.
Mark precise start/end times for each matching moment (boundary accuracy required).
Classify the action (e.g., chopping, sautéing, plating) and list visible objects.
Write a CLIP-friendly visual description focused on what you see, not inferred context.
Assign a confidence level (High/Medium/Low) based on boundary certainty.
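
As a rough illustration of what one labeled moment contains, the fields above can be sketched as a small Python record. This is only a sketch under stated assumptions: the class name `MomentAnnotation`, its field names, and the validation rules are hypothetical and are not the project's Label Studio schema or export format.

```python
from dataclasses import dataclass

# Confidence levels named in the task description.
CONFIDENCE_LEVELS = ("High", "Medium", "Low")


@dataclass
class MomentAnnotation:
    """One labeled moment (hypothetical structure, for illustration only)."""

    query: str          # natural language query the segment matches
    start_s: float      # segment start, in seconds from the clip start
    end_s: float        # segment end, in seconds from the clip start
    action: str         # e.g. "chopping", "sautéing", "plating"
    objects: list       # visible objects, e.g. ["knife", "tomato"]
    description: str    # visual description of what is on screen
    confidence: str = "Medium"

    def __post_init__(self):
        # A segment must start at or after 0 and end after it starts.
        if self.start_s < 0 or self.end_s <= self.start_s:
            raise ValueError("end_s must be greater than start_s, and start_s >= 0")
        if self.confidence not in CONFIDENCE_LEVELS:
            raise ValueError(f"confidence must be one of {CONFIDENCE_LEVELS}")
        if not self.description.strip():
            raise ValueError("description must not be empty")

    @property
    def duration_s(self) -> float:
        """Length of the labeled segment in seconds."""
        return self.end_s - self.start_s


moment = MomentAnnotation(
    query="person slicing tomatoes",
    start_s=12.4,
    end_s=18.9,
    action="chopping",
    objects=["knife", "tomato", "cutting board"],
    description="person in white chef coat slicing red tomatoes",
    confidence="High",
)
```

A clip with several matching segments would simply produce several such records, one per moment.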

Quality Guidelines You Must Follow

High-quality, consistent annotations are critical. Follow the project rules for boundary accuracy, coverage, and descriptive text exactly as provided in the task guide.

Boundary accuracy: for High confidence, mark the start and end times within 0.5 seconds of where the moment actually begins and ends.
Complete coverage: label all segments that match the query, not only the first.
Visual descriptions must be grounded and specific (e.g., "person in white chef coat slicing red tomatoes").
Avoid vague, abstract, or evaluative language (do not write "delicious" or assume off-screen facts).
Common mistakes: marking segments too short or too long, vague descriptions, and missing additional matching segments.
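
To make two of these rules concrete, here is a minimal self-check sketch in Python. It assumes you have reference boundaries to compare against (for example, from a reviewed sample), which the posting does not describe. The function names and the word list are hypothetical; only "delicious" comes from the guidelines above, and the other words are illustrative assumptions.

```python
import re

# Tolerance stated in the guidelines for High confidence.
TOLERANCE_S = 0.5

# Illustrative list of evaluative words; only "delicious" is named in the
# guidelines, the rest are assumptions added for the example.
EVALUATIVE_WORDS = {"delicious", "tasty", "yummy", "amazing"}


def boundaries_within_tolerance(start_s, end_s, ref_start_s, ref_end_s,
                                tol_s=TOLERANCE_S):
    """Return True if both boundaries are within tol_s of the reference."""
    return (abs(start_s - ref_start_s) <= tol_s
            and abs(end_s - ref_end_s) <= tol_s)


def evaluative_words_in(description):
    """Return the evaluative words found in a description, sorted."""
    words = re.findall(r"[a-z']+", description.lower())
    return sorted(set(words) & EVALUATIVE_WORDS)


# Start is 0.4 s off and end is 0.3 s off: both inside the 0.5 s tolerance.
ok = boundaries_within_tolerance(12.4, 18.9, 12.0, 19.2)

# Start is 0.9 s off: outside the tolerance.
too_loose = boundaries_within_tolerance(12.4, 18.9, 11.5, 19.0)

flagged = evaluative_words_in("delicious red tomatoes on a plate")
clean = evaluative_words_in("person in white chef coat slicing red tomatoes")
```

A word list like this only catches obvious cases; the task guide remains the authority on what counts as grounded, specific language.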

Requirements

The role lists data labeling experience and a graduate degree, together with an understanding of food videos, as its stated requirements. Follow all task instructions and use Label Studio for annotations.

Education: Graduate degree with familiarity with food/cooking videos (explicit requirement).
Experience: Some prior data labeling or annotation experience preferred.
Attention to detail: ability to mark temporal boundaries precisely and write clear visual descriptions.
Equipment: Internet access and a device capable of running Label Studio (specific hardware not prescribed).

How It Works — Apply and Start

Create an OpenTrain account (free), complete any required qualification tasks, and apply to this project through your OpenTrain profile. Once accepted, you'll receive access to Label Studio and the project's task queue and instructions.

Work is paid per labeled moment at $0.05; follow payout instructions in your OpenTrain account.
You choose your hours and workload within the project's availability; this is contract, remote work.
Maintain quality: repeated low-quality annotations may affect your access to future tasks.
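
Because pay is per labeled moment, earnings depend on how many moments you label. The arithmetic can be sketched as below. The function names are hypothetical, and the sketch assumes every submitted label is paid at the flat $0.05 rate, ignoring any fees, rejections, or payout thresholds that your OpenTrain account may apply.

```python
from decimal import Decimal, ROUND_CEILING

# Flat rate stated in the posting.
RATE_PER_LABEL = Decimal("0.05")


def earnings_usd(labels):
    """Gross earnings in USD for a number of labeled moments."""
    if labels < 0:
        raise ValueError("labels must be non-negative")
    return RATE_PER_LABEL * labels


def labels_needed(target_usd):
    """Smallest number of labeled moments that reaches target_usd."""
    needed = Decimal(str(target_usd)) / RATE_PER_LABEL
    return int(needed.to_integral_value(rounding=ROUND_CEILING))


total = earnings_usd(200)        # Decimal("10.00")
count = labels_needed(10)        # 200
count_rounded = labels_needed(1.02)  # 21, since 20 labels pay only $1.00
```

`Decimal` is used instead of floats so that amounts like 0.05 stay exact.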
