Document Data Extraction Specialist (Law & Finance)
Work remotely validating AI-generated drafts that extract structured data from legal, financial, and compliance PDF/DOCX files into schema-compliant JSON. Part-time contractor role (under 20 hrs/week), pay $8.00–$11.20 USD/hr; intermediate experience and near-native English required.
Legal & Finance
Compensation: $8.00–$11.20/hr
Eligibility: Worldwide
Experience: Intermediate
Posted: May 14, 2026
About OpenTrain
OpenTrain is the #1 platform for finding and building careers in AI training and data labeling. Creating an OpenTrain account is free — it connects you to real projects where people prepare and refine the examples that modern AI models learn from.
We focus on helping contributors start and grow careers teaching AI: find projects, build a profile, and apply in minutes. Many contributors work remotely and part time, shaping how state-of-the-art systems behave.
About AI training and this work
AI training (often called data labeling, annotation, or human feedback work) is the human side of building AI. This project combines careful reading of documents with structured-data skills to make model inputs and evaluations accurate.
You will validate AI-generated draft outputs, correct fields to match source documents, and produce final JSON that downstream systems and teams can rely on — work that directly impacts model quality and real-world behavior.
The role
You will extract structured information from PDF and DOCX files across a mix of document types: legal contracts, patents, financial records, compliance filings, and similar materials. Each task includes three AI-generated draft outputs; your job is to select the best draft, validate and correct every field, and produce a fully accurate JSON-compliant output.
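As a rough illustration of comparing the three drafts, the sketch below flags top-level fields where the drafts disagree, so those fields can be checked against the source document first. The function name `disagreements` and the sample invoice fields are hypothetical and not part of the project's tooling; agreement between drafts does not prove a value is correct, so every field still needs validation.

```python
import json
from collections import Counter


def disagreements(drafts: list[dict]) -> dict:
    """Map each top-level field to its distinct values when the drafts do not all agree."""
    fields = sorted({key for draft in drafts for key in draft})
    flagged = {}
    for field in fields:
        # Serialize values so lists and nested objects can be compared and counted.
        # Note: a field absent from a draft is treated the same as an explicit null.
        values = Counter(json.dumps(d.get(field), sort_keys=True) for d in drafts)
        if len(values) > 1:
            flagged[field] = dict(values)
    return flagged


# Hypothetical drafts for a single invoice (illustrative data only).
drafts = [
    {"invoice_number": "INV-001", "total": 1250.0, "currency": "USD"},
    {"invoice_number": "INV-001", "total": 1250.0},
    {"invoice_number": "INV-001", "total": 1520.0, "currency": "USD"},
]

flagged = disagreements(drafts)
for field, values in flagged.items():
    print(field, values)
```

Here `total` and `currency` would be flagged for review, while `invoice_number` would not, because all three drafts agree on it.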
This is a contract, part-time project suited to someone with intermediate annotation or data-analysis experience and domain familiarity in law, finance, accounting, or compliance. Work is remote and available worldwide.
What you'll do day to day
- Read source PDF/DOCX documents thoroughly before extracting information to ensure nothing is missed.
- Understand and follow JSON schemas precisely, including required vs optional fields, types, arrays, objects, and nested structures.
- Review three AI-generated draft outputs, select the best, and edit it rather than starting from scratch.
- Validate every field against the source document and correct errors or omissions with high accuracy.
- Capture repeated or dynamic entries (line items, multiple clauses, parties, dates) and preserve array/object structure.
- Set required but truly missing fields to null in accordance with schema instructions.
- Write concise, original summaries in your own words where the schema requests a summary or description.
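The schema-related steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the `SCHEMA` table, the `normalize` helper, and the contract fields are hypothetical, and real project schemas will be richer (nested objects, formats, enums) and come with their own instructions.

```python
import json

# Hypothetical, simplified schema: field name -> (expected Python type, required?)
SCHEMA = {
    "contract_title": (str, True),
    "effective_date": (str, True),
    "parties": (list, True),
    "governing_law": (str, False),
}


def normalize(draft: dict) -> tuple[dict, list[str]]:
    """Return a cleaned copy of the draft plus a list of issues to review by hand."""
    cleaned, issues = {}, []
    for name, (expected_type, required) in SCHEMA.items():
        if name not in draft:
            if required:
                # Required but missing: emit null rather than omitting the key.
                cleaned[name] = None
                issues.append(f"{name}: required field missing, set to null")
            continue
        value = draft[name]
        if value is not None and not isinstance(value, expected_type):
            issues.append(
                f"{name}: expected {expected_type.__name__}, got {type(value).__name__}"
            )
        cleaned[name] = value
    for name in draft:
        if name not in SCHEMA:
            issues.append(f"{name}: not in schema, dropped")
    return cleaned, issues


# Hypothetical AI draft with a missing date, a flattened array, and an extra key.
draft = {
    "contract_title": "Master Services Agreement",
    "parties": "Acme Corp; Beta LLC",
    "notes": "n/a",
}

cleaned, issues = normalize(draft)
print(json.dumps(cleaned, indent=2))
for issue in issues:
    print("REVIEW:", issue)
```

The helper only reports the type mismatch on `parties`; restoring the array structure, and confirming that `effective_date` is truly absent from the source, remain manual steps.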
Requirements
You must meet every requirement listed here; we cannot accept substitutions or vague equivalents.
- Background in at least one of: legal, finance, accounting, compliance, or data analytics.
- Prior RLHF/annotation, data analysis, or accuracy-heavy data-entry experience.
- Solid JSON schema literacy and comfort working with nested structures and data types.
- Strong attention to detail when working with long, dense documents.
- Near-native English comprehension (C1/C2).
- Experience level: Intermediate.
- Time commitment: Less than 20 hours per week; contract, part-time engagement.
- Worldwide applicants accepted.
Compensation & logistics
Pay is hourly in USD, ranging from $8.00 to $11.20 per hour. This is a contractor role billed hourly.
Labeling setup: you'll work in a specialized annotation interface, and the tasks are document-centric (PDF/DOCX). Label types include RLHF-style draft validation, data collection, and working with structured code-like outputs.
- Payment type: Pay per hour (USD).
- Typical weekly hours: under 20 hours/week, flexible scheduling.
- Employment types: Contractor, Part-time.
How to apply and next steps
Create a free OpenTrain account, complete your profile with relevant domain experience, and apply to this project. Your application should highlight any legal/finance/accounting experience, previous annotation or RLHF work, and JSON schema examples if available.
If selected, you will receive project onboarding, sample tasks, and access to the annotation platform. Expect an initial accuracy review and feedback so you can reach the project's quality expectations quickly.