AI Model Evaluation Analyst - Atlas AI Services
Evaluated and scored AI-generated text responses across multiple customer-facing and knowledge domains using structured rubrics. Assessed correctness, completeness, tone, safety, and user-intent alignment while identifying failure modes such as hallucinations and ambiguous answers. Provided written feedback that was used in reinforcement learning and supervised fine-tuning cycles to improve production model reliability.
• Reviewed AI-generated navigation, customer support, general knowledge, and task-execution responses.
• Applied rubric-based scoring to rate quality and alignment criteria for each response.
• Flagged edge cases impacting user trust, including hallucinations and ambiguous output.
• Authored detailed feedback for RL/SFT training iterations.