OpenTrain AI

Vibecode Specialist (Web Scraping & Data Extraction)

OpenTrain AI · Remote · Worldwide · Posted Jun 9, 2026

Hourly · $20/hr

**Dataset Description:** Structured data from complex websites

**Data Type:** Text

**Subject Matter/Industry:** Web data extraction and automation

**Pre-labeled Data:** No

**Labeling Software:** Other

**Label Types:**

* Data Collection
* Computer Programming/Coding
* Fine-tuning
* Evaluation/Rating

**Labeling Overview:**
You should have hands-on experience with Python-based web scraping and data extraction from complex sites, including dynamic/JavaScript-rendered pages. You’ll be comfortable troubleshooting scraping failures, validating outputs, and delivering clean structured data. Upper-intermediate English (B2) or higher is required.

In this role, you’ll own end-to-end scraping workflows: extracting data across multi-level site structures, using a mix of internal tools (Apify, OpenRouter) and your own scripts/workflows. You’ll validate and normalize data, enforce formatting requirements, and deliver accurate structured datasets (e.g., CSV/JSON/Sheets). You’ll collaborate in a hybrid AI + human setup where AI agents handle repetitive steps and you provide quality control and critical thinking.
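To give a sense of the validate-and-normalize step described above, here is a minimal Python sketch using only the standard library. The schema (`name`, `url`, `price`), the sample records, and the helper names are hypothetical and chosen purely for illustration; actual projects will define their own fields and formatting rules.

```python
import csv
import io
import json

# Hypothetical schema: real projects define their own required fields.
REQUIRED_FIELDS = ("name", "url", "price")


def normalize(record):
    """Trim whitespace and parse the price into a float (None if unparseable)."""
    clean = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
    price = clean.get("price")
    if isinstance(price, str):
        digits = price.replace("$", "").replace(",", "")
        try:
            clean["price"] = float(digits)
        except ValueError:
            clean["price"] = None
    return clean


def is_valid(record):
    """A record is valid when every required field is present and non-empty."""
    return all(record.get(f) not in (None, "") for f in REQUIRED_FIELDS)


def to_csv(records):
    """Serialize records to CSV text with a fixed column order."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=REQUIRED_FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


# Made-up sample input, standing in for scraped rows.
raw = [
    {"name": "  Widget A ", "url": "https://example.com/a", "price": "$1,299.50"},
    {"name": "Widget B", "url": "", "price": "n/a"},
]
cleaned = [normalize(r) for r in raw]
valid = [r for r in cleaned if is_valid(r)]
rejected = [r for r in cleaned if not is_valid(r)]  # kept for edge-case notes

csv_text = to_csv(valid)
json_text = json.dumps(valid, indent=2)
```

Keeping rejected rows in a separate list, rather than silently dropping them, makes it easier to document edge cases alongside the delivered dataset.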

**Required Locations:** Global - Any Location

**Required English Level:** Fluent

**Other Qualifications & Requirements:**

* 1+ year of experience in at least one of the following: web scraping, data engineering, software development, automation, or data analysis
* Strong Python web scraping skills (e.g., BeautifulSoup + Selenium/Playwright or equivalents)
* Proven experience scraping dynamic/JS-heavy sites (infinite scroll, AJAX, JS-rendered content)
* Experience extracting from multi-level/hierarchical site structures (e.g., category → entity → details)
* Ability to handle changing site structures and implement resilient scraping strategies (selectors, fallbacks, retries)
* Ability to clean/normalize/validate scraped data and deliver in structured formats (CSV, JSON, Google Sheets)
* Experience with batching/parallelization for scaling large scraping jobs (or equivalent performance approaches)
* Familiarity with using LLMs/AI tools to accelerate workflows (prompting, automation, extraction assistance)
* English level B2+ with ability to follow detailed specs and document edge cases clearly
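
As an illustration of the "selectors, fallbacks, retries" requirement above, the sketch below shows one possible shape for a resilient extraction step in plain Python. The regex patterns, the `with_retries` and `extract_title` helpers, and the sample page are all hypothetical; in practice you would more likely express the fallbacks as BeautifulSoup or Playwright selectors, and the retry policy would depend on the target site.

```python
import re
import time


def with_retries(fetch, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fetch() until it succeeds, doubling the delay after each failure."""
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


# Hypothetical patterns, ordered from preferred to last resort.
TITLE_PATTERNS = [
    re.compile(r'<h1[^>]*class="product-title"[^>]*>(.*?)</h1>', re.S),
    re.compile(r'<meta property="og:title" content="([^"]*)"'),
    re.compile(r"<title>(.*?)</title>", re.S),
]


def extract_title(html):
    """Return the first pattern match, or None if every fallback misses."""
    for pattern in TITLE_PATTERNS:
        match = pattern.search(html)
        if match:
            return match.group(1).strip()
    return None


# Simulated flaky fetch: fails twice, then returns a page on which only
# the last-resort <title> pattern matches.
calls = {"n": 0}
delays = []


def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated timeout")
    return "<html><head><title> Widget A </title></head></html>"


html = with_retries(flaky_fetch, sleep=delays.append)  # record delays, no real sleep
title = extract_title(html)
```

Passing `sleep` as a parameter keeps the retry helper easy to test without real waiting, and ordering the patterns from most to least specific means a site redesign degrades extraction gradually instead of breaking it outright.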