Katie H.

AI Model Evaluation Analyst - Atlas AI Services

New York, USA

Key Skills

Software

Data Annotation Tech

Top Subject Matter

LLM response evaluation for safety and user-intent alignment

Quality Domain Expertise

Top Data Types

Text

Document

Top Task Types

Text Generation

Object Detection

Text Summarization

Bounding Box

Evaluation/Rating

Data Collection

Freelancer Overview

AI Model Evaluation Analyst at Atlas AI Services with 9+ years of professional experience across complex workflows, research, and quality-focused execution. Core strengths include internal proprietary tooling, Python, and SQL. Holds a Master of Science from the New York Institute of Technology (2018). AI-training focus covers text data and evaluation/rating workflows.

Labeling Experience

AI Model Evaluation Analyst - Atlas AI Services

Text

Evaluated and scored AI-generated text responses across multiple customer-facing and knowledge domains using structured rubrics. Assessed correctness, completeness, tone, safety, and user-intent alignment while identifying failure modes such as hallucinations and ambiguous answers. Provided written feedback that was used in reinforcement learning and supervised fine-tuning cycles to improve production model reliability.

• Reviewed AI-generated navigation, customer support, general knowledge, and task-execution responses.
• Applied rubric-based scoring to rate quality and alignment criteria for each response.
• Flagged edge cases impacting user trust, including hallucinations and ambiguous output.
• Authored detailed feedback for RL/SFT training iterations.

2022 - Present

Machine Learning Data Quality Specialist - Metropolis Data Labs

Text

Audited labeled datasets used to train and validate NLP and recommendation models, focusing on label quality and error patterns. Performed error analysis to detect mislabeled, duplicated, or low-signal data and worked with engineering and product teams to refine annotation guidelines. Used Python and SQL to analyze datasets and generate quality reports that standardized evaluation workflows across teams.

• Audited labeled datasets supporting NLP and recommendation model training and validation.
• Performed error analysis to identify mislabeled, duplicated, or low-signal records.
• Collaborated with engineers and product teams to improve annotation guidelines.
• Standardized evaluation workflows and reported dataset quality findings.

2020 - 2022

Education

New York Institute of Technology

Master of Science, Data Science and Artificial Intelligence

2018 - 2018

Work History

Atlas AI Services

AI Model Evaluation Analyst

New York

2022 - Present

Urban Digital Solutions

Systems & Data Analyst

New York

2018 - 2020