Ahmed Hmada

Experienced data evaluator with strong expertise in LLM evaluation and rubr

Alexandria, Egypt
$15.00/hr · Expert · Labelbox · Scale AI

Key Skills

Software

Labelbox
Scale AI

Top Subject Matter

No subject matter listed

Top Data Types

Audio
Image
Text

Top Task Types

Audio Recording
Evaluation Rating
Prompt Response Writing SFT
Text Generation
Text Summarization

Freelancer Overview

I’m a professional data labeling and AI training specialist with hands-on experience in LLM evaluation and rubric-based analysis. Through my work on various evaluation and annotation projects, I’ve developed a deep understanding of how to assess AI-generated responses across multiple dimensions (such as instruction following, truthfulness, tone, and structure) using clear and consistent rubrics. I’ve worked on several platforms that allowed me to refine my analytical and linguistic skills, especially in evaluating Arabic content, both Modern Standard and Egyptian dialect. I also contributed to Arabic voice and speech projects that helped enhance natural language understanding for AI systems. My goal is always to deliver precise, high-quality annotations that truly support the development of smarter, more reliable AI models.

English: Expert

Labeling Experience

Scale AI

Cypher Evals – LLM Evaluation & Rubric-Based Text Assessment

Scale AI · Text · Classification · Text Generation
Participated in the Cypher Evals project focused on evaluating large language model (LLM) outputs using rubric-based frameworks. My tasks involved analyzing prompts and comparing AI-generated responses across multiple dimensions, including instruction following, truthfulness, response length, structure, tone, and harmlessness. Each task required identifying key points within the rubric, applying consistent scoring, and writing clear justifications supported by textual evidence. The project enhanced my expertise in linguistic analysis for both Modern Standard Arabic and Egyptian dialects, as well as my ability to deliver accurate, high-quality evaluations. Worked with hundreds of prompt–response pairs, ensuring reliable assessment and detailed reporting to improve model alignment and fine-tuning accuracy.

2025
Scale AI

Grassland – Arabic Speech Quality Review & Audio Annotation

Scale AI · Audio · Classification · Audio Recording
Worked as a Senior Reviewer on the Zilphon Grass project, which focused on improving Arabic speech recognition datasets. My main responsibilities included reviewing and validating generated audio samples to determine whether they were usable (valid) or flagged as SBQ (Should Be Questioned). I supervised the quality of submissions from annotators (Attemptors), provided detailed feedback, and ensured all data met the project’s acoustic and linguistic standards. This role required strong attention to detail, familiarity with Arabic phonetics and dialects, and the ability to make consistent, high-accuracy judgments under standardized guidelines. The project enhanced my skills in audio evaluation, data validation, and quality assurance for AI-based voice recognition systems.

2024 - 2025

Education

No Education added yet

Ahmed H. hasn’t added any Education History to their OpenTrain profile yet.

Work History

Scale AI

Voice Over Artist

N/A
2022 - Present

Alignerr

Creative Writer

N/A
2021 - Present