PDF Structuring & Annotation Specialist (C1/C2 English)

Annotate and structure PDF documents: draw bounding boxes, label document elements, transcribe text exactly, describe images, and convert tables into JSON. Entry-level remote contract work paying $7/hr, with flexible hours (under 20 per week), for detail-oriented candidates with C1/C2 English and basic HTML/JSON knowledge.

OpenTrain AI

General Annotation

  • Work arrangement: 100% remote, hourly
  • Compensation: $7/hr
  • Eligibility: Worldwide
  • Experience: Entry
  • Posted: Mar 31, 2025

About OpenTrain

OpenTrain is the #1 platform for finding and building careers in AI training and data labeling. We connect people with projects that teach AI systems how to read, see, and understand the world, helping contributors start and grow careers in this fast-growing industry.

About AI Training Work

AI training (also called data labeling or annotation) is the human work behind modern AI: people create and verify examples that models learn from. These projects are often fully remote, flexible, accessible to beginners, and let you work on concrete tasks — from transcribing text to marking up document layouts — that directly shape how AI behaves.

The Role

We are seeking detail-oriented freelancers to annotate PDF documents and produce clean, structured representations of their content and layout. This is an entry-level, contract, part-time role (less than 20 hours/week) open worldwide. You will use bounding boxes, classification labels, exact transcription, and JSON/HTML-style structuring to capture each document’s hierarchy and content.

  • Employment type: Contractor, Part-time
  • Time requirement: Less than 20 hours/week
  • Pay: USD 7 per hour
  • Data type: DOCUMENT; Label types: BOUNDING_BOX, TEXT_GENERATION
  • Labeling tool and project guidelines: details provided after selection

What You’ll Do

Deliver accurate, structured annotations for PDFs so downstream systems can reconstruct layout and content. Work focuses on identifying elements, marking their positions, transcribing visible text, describing images, and converting tables to JSON.

  • Draw bounding boxes around elements such as headings, paragraphs, figures, charts, tables, and images
  • Assign element categories and document hierarchy positions (section, subsection, etc.)
  • Transcribe all visible English text exactly as shown
  • Describe images and figures with concise, factual descriptions
  • Convert tables into well-formatted JSON arrays including row/column structure and cell content
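As a rough illustration of what a single annotated element might look like once it has been boxed, labeled, and transcribed, here is a minimal Python sketch. The field names (`category`, `bbox`, `level`, `text`), the top-left coordinate convention, and the sample values are assumptions made for illustration only; the project's actual schema and labeling tool are shared after selection.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Element:
    """One annotated document element (hypothetical schema)."""
    category: str   # e.g. "heading", "paragraph", "figure", "table"
    bbox: tuple     # (x0, y0, x1, y1), assumed origin at the top-left of the page
    level: int      # position in the hierarchy: 1 = section, 2 = subsection, ...
    text: str       # exact transcription, or a factual description for images

    def __post_init__(self):
        x0, y0, x1, y1 = self.bbox
        # A usable box needs positive width and height.
        if not (x0 < x1 and y0 < y1):
            raise ValueError(f"degenerate bounding box: {self.bbox}")


# Made-up sample values, not taken from any real document.
heading = Element("heading", (72, 90, 540, 120), 1, "1. Introduction")
paragraph = Element("paragraph", (72, 130, 540, 260), 1, "This report covers ...")

# Serialize the page's elements in reading order.
page_json = json.dumps([asdict(e) for e in (heading, paragraph)], indent=2)
print(page_json)
```

Validating the box when the record is created catches reversed corners early, which is a common slip when drawing boxes by hand.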

Requirements

Applicants must meet all of the following qualifications.

  • C1/C2 level English proficiency (required) — you will need to provide an official certificate after applying and before starting
  • Experience with document annotation or transcription
  • Familiarity with document structures (sections, subsections, headers) and layout hierarchy
  • Ability to accurately draw bounding boxes and label elements
  • Basic knowledge of HTML or JSON formatting
  • Ability to transcribe text exactly as it appears
  • Skill in identifying and describing figures, charts, and images
  • Experience converting tables into structured JSON arrays

Who Should Apply

This project is ideal for careful, visually minded contributors who have advanced English fluency and are comfortable with basic structured data formats. Prior annotation, transcription, or editorial experience will help you get up to speed quickly.

  • Entry-level applicants are welcome, provided they meet the listed requirements
  • Worldwide applicants accepted

Interview & Test Task

During the interview, you will be given a short practical test to complete in JSON format. You must submit your JSON response to the test task before completing the live chat interview.

  • Test Task: Table Annotation (JSON Format)
  • Task description: Convert the sample PDF table into a structured JSON object with rowCount, columnCount, and a cells array.
  • Required cell fields: kind ("columnHeader" or "bodyCell"), rowIndex (0 for header), columnIndex (0-based), content (text).
  • Sample table to convert (Table 1: AI Adoption Rates by Industry):
      Industry | Adoption Rate
      Healthcare | 76%
      Finance | 65%
      Retail | 48%
  • Deliverable: a clean JSON object matching the exact structure described.
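The structure described above can be sketched in Python as follows. This is one possible reading of the task, not an official answer key: the helper name `table_to_json` is hypothetical, and the sketch assumes that `rowCount` includes the header row (since the header is given `rowIndex` 0) and that the table title is not part of the cells.

```python
import json


def table_to_json(rows):
    """Convert a table (list of rows, header row first) into the described object.

    Assumption: rowCount counts the header row as well as the body rows.
    """
    column_count = len(rows[0])
    cells = []
    for row_index, row in enumerate(rows):
        if len(row) != column_count:
            raise ValueError(f"row {row_index} has {len(row)} cells, expected {column_count}")
        for column_index, content in enumerate(row):
            cells.append({
                "kind": "columnHeader" if row_index == 0 else "bodyCell",
                "rowIndex": row_index,        # 0 for the header row
                "columnIndex": column_index,  # 0-based
                "content": content,           # text exactly as shown
            })
    return {"rowCount": len(rows), "columnCount": column_count, "cells": cells}


# The sample table from the posting, with the header as the first row.
sample_rows = [
    ["Industry", "Adoption Rate"],
    ["Healthcare", "76%"],
    ["Finance", "65%"],
    ["Retail", "48%"],
]

result = table_to_json(sample_rows)
print(json.dumps(result, indent=2))
```

Keeping cell content as strings (for example "76%" rather than the number 76) matches the exact-transcription requirement; whether the real guidelines expect anything different is not stated in the posting.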

How To Apply

Apply through the OpenTrain platform and include a brief summary of your annotation experience, along with any sample work or links you have available. After you apply, you will be asked to provide proof of C1/C2 English proficiency before starting the project. Selected candidates will receive details about the labeling tool and project-specific guidelines.

  • Include any past annotation/transcription projects in your application
  • Be prepared to complete the JSON test during the interview
  • You will receive project documentation and examples once selected