LLM Trainer — Outlier AI, Remote
Performed LLM training and evaluation by building Python/C++ tooling and writing unit tests to validate model-related algorithms. Conducted side-by-side evaluations of model responses and applied rubric frameworks to assess LLM code comprehension in large repositories.
• Authored unit tests with Pytest and Catch2 to validate algorithms
• Ranked LLM responses by code quality, correctness, and scalability
• Developed question and rubric frameworks for evaluation, using Git
• Supported LLM training workflows through structured assessment and comparison