LLM Dialogue Designer & Evaluator (AI Projects)
As an LLM Dialogue Designer & Evaluator, I created and annotated multilingual datasets that model reasoning, intent, and safety alignment. I evaluated AI responses for factual accuracy, risk mitigation, and reasoning quality in Korean and English tasks. I also performed systematic bias detection, hallucination review, and step-by-step solution validation as part of model safety improvement efforts.
• Achieved consistent annotation accuracy of 99.54% and was selected for quality control leadership.
• Focused on AI alignment for regulatory readiness, operational trust, and risk mitigation within the aviation domain.
• Collaborated with global engineering teams to deliver structured feedback for model enhancement.
• Used internal/proprietary evaluation platforms tailored for LLM assessment and QA.