Abud Jembe - Senior Software Engineer - AI and Backend Systems

Key Skills

Software

Appen

SuperAnnotate

Labelbox

Scale AI

Top Subject Matter

No subject matter listed

Top Data Types

Image

Text

Document

Top Task Types

Text Summarization

Evaluation/Rating

Data Collection

Prompt + Response Writing (SFT)

Computer Programming/Coding

RLHF

Red Teaming

Classification

Freelancer Overview

I am a senior software engineer with extensive hands-on experience in building and optimizing AI-driven systems, with a strong focus on data processing, annotation, and training data pipelines. My background includes developing retrieval-augmented generation (RAG) systems for improved information precision, designing computer vision and OCR pipelines to automate document processing, and engineering supervised learning models for predictive analytics. I have worked with a variety of data types—including text, images, and structured datasets—and have led the integration of ML and NLP modules in production environments with high reliability and uptime. My technical skills span Python, PyTorch, Tesseract, Pandas, NumPy, Apache Spark, and cloud-based MLOps tools (GCP, AWS, Docker), allowing me to streamline data labeling workflows, enhance data quality, and accelerate AI model deployment. I am passionate about building robust data pipelines that power accurate, scalable AI solutions across domains like business intelligence, automation, and document analysis.

Expert | Swahili | English

Labeling Experience

AI Training

SuperAnnotate | Text | Computer Programming/Coding | Data Collection

SuperAnnotate AI Training: Project Description

Computer Vision Annotation Specialist - Autonomous Vehicle Dataset
SuperAnnotate | Remote | March 2023 – November 2024

Project Overview: Contributed to developing training datasets for autonomous vehicle perception systems by annotating high-resolution images and video frames with precise object detection, semantic segmentation, and classification labels. The project involved creating accurate bounding boxes, polygons, and keypoint annotations for various road objects, pedestrians, vehicles, and traffic infrastructure to train computer vision models for self-driving technology.

Core Responsibilities:

Object Detection & Annotation:
- Annotated 300-500 images daily with bounding boxes for vehicles, pedestrians, cyclists, and road objects
- Created precise polygon annotations for irregular objects including road signs, traffic lights, and barriers
- Applied semantic se

2022 - 2025

Intent Classification & RLHF Labeling for Conversational AI

Labelbox | Text | RLHF | Evaluation/Rating

As Senior Software Engineer in AI/ML at Xyonix, led hands-on oversight of intent classification and response quality labeling for conversational AI systems. Developed and evaluated AI-generated outputs by curating, annotating, and validating large-scale textual datasets for production retrieval-augmented generation (RAG) pipelines. Collaborated with QA and product teams to ensure high annotation quality and robust RLHF workflows for enterprise-grade AI deployments.
• Managed RLHF workflows and evaluation of generated responses.
• Labeled user intents for conversational AI and automated assistant platforms.
• Built and refined high-quality NLP training datasets for LLMs.
• Validated and rated data to optimize model grounding and real-world reliability.

2024 - 2024

OCR & Document Data Labeling Lead

Scale AI | Document | Computer Programming/Coding | Data Collection

Led labeling, extraction, and annotation of document images and OCR pipelines at AlanAI, dramatically reducing manual review burden. Designed data labeling workflows for large-scale document datasets, supporting downstream multimodal AI training and model benchmarking. Oversaw quality assurance and accuracy validation for OCR-labeled datasets and their integration into production ML inference.
• Built and labeled training data for OCR and document extraction.
• Annotated scanned documents and unstructured images for automated review.
• Validated annotated data to ensure accuracy in document classification.
• Developed guidelines and process templates for the document labeling pipeline.

2022 - 2024

NLP Dataset Labeling & Evaluation Lead

Appen | Text | Classification

At AlanAI, directed the labeling and evaluation of NLP training sets for voice automation modules. Facilitated rigorous annotation of intent detection and text classification datasets, directly supporting gains in AI accuracy and model evaluation. Oversaw dataset curation strategies to standardize quality and reproducibility for production ML workflows.
• Performed rigorous text annotation and intent labeling for supervised and lead-scoring models.
• Curated and evaluated labeled data to improve intent recognition by 22%.
• Implemented annotation guidelines for fast, high-quality dataset production.
• Coordinated dataset QA cycles for robust model evaluation and training.

2022 - 2024

Data Annotation

Appen | Image | Text Summarization | Evaluation/Rating

Project Overview: Contributed to improving search engine quality and relevance by evaluating search results against user intent and query expectations. The project involved assessing the quality, relevance, and usefulness of web search results, advertisements, and featured content to train and refine search algorithms for a major global search engine.

2021 - 2024

Education

Northeastern University

Master of Science, Computer Science

2017 - 2018

Boston College

Bachelor of Technology, Computer Science

2012 - 2016

Work History

Xyonix

Senior Software Engineer

Remote

2024 - 2025

AlanAI

Senior Software Engineer

Remote

2022 - 2024