Thomas H.

AI expert - LLM Evaluation and Benchmarking

Redwood City, USA

Key Skills

Software

Snorkel AI
Other
Internal/Proprietary Tooling
Don't disclose

Top Subject Matter

LLMs Domain Expertise
Prompt Engineering
Evaluation Domain Expertise

Top Data Types

Text
Document
Image

Top Task Types

Classification
Question Answering
Computer Programming/Coding
Evaluation/Rating
Data Collection
Fine-tuning
Text Summarization
Text Generation
Object Detection
RLHF
Function Calling
Transcription
Red Teaming

Freelancer Overview

AI DevOps engineer focused on LLM evaluation and benchmarking. Brings 7+ years of professional experience across complex professional workflows, research, and quality-focused execution. Core tooling includes Snorkel AI and other internal or proprietary tools. Education includes a Master of Science from WorldQuant University (2027) and a Bachelor of Arts from Denison University (2025). AI-training focus includes text data and labeling workflows such as evaluation, rating, and classification.

Labeling Experience

Snorkel AI

AI DevOps - LLM Evaluation and Benchmarking

Snorkel AI, Text

Designed and specified multi-step agentic benchmark tasks to evaluate Large Language Model (LLM) performance. Wrote precise task specifications and scoring criteria to ensure comprehensive assessment of models. Analyzed model failure modes and developed solutions for improved evaluation quality.
• Developed new benchmarks for LLM assessment
• Authored clear rubrics and evaluation frameworks
• Utilized prompt engineering to guide AI responses
• Collaborated with DevOps to automate test suites and validate outcomes

2025 - Present

Research Intern - Disaster Data Labeling

Other, Text, Classification

Created and implemented BERT-based and LDA models to analyze a large dataset of social media posts during Hurricane Harvey. Facilitated the categorization and labeling of over 400,000 tweets to improve disaster impact analysis. Applied data annotation practices integral to Natural Language Processing (NLP) workflows.
• Developed and applied topic modeling for classification
• Enhanced dataset quality for AI/ML research
• Leveraged NLP techniques for emotion and topic labeling
• Contributed annotated data to research publications

2024

Education

WorldQuant University

Master of Science, Financial Engineering

2025 - 2027

Denison University

Bachelor of Arts, Computer Science and Applied Mathematics

2021 - 2025

Work History

Snorkel AI

AI DevOps Engineer

Redwood City
2025 - Present

Climate Change Health Intelligence Lab

Research Intern

Louisville
2024