// Case Study / 2025

Gesture Recognition ML Engine

A real-time gesture recognition system that combines MediaPipe tracking (hands, face mesh, pose) with a custom-trained KNN model. It features incremental learning, data augmentation, and temporal dynamics analysis.

Python · MediaPipe · OpenCV · KNN Classifier · scikit-learn · Tkinter / ttkbootstrap · NumPy · Pickle
Gesture Recognition Engine — Desktop App

Real-time Hand + Body Gesture Recognizer

Tracking points: Hands + Face + Pose
ML Model: KNN (k=3)
Learning: Incremental
Status: Completed

01 — The Problem

Why I Built This

Approximately 250,000 people in Kurdistan and Iraq live with hearing or speech difficulties. Most communication tools for this community require expensive hardware or specialized gloves.

This project explores a software-only approach — using a standard webcam to recognize sign language gestures in real time, making accessibility tools available without any special hardware.

02 — Architecture

System Design

Tracking: MediaPipe Hands (21 landmarks per hand, two hands), Face Mesh (468 points), and Pose (33 landmarks), combined into one unified feature vector per frame.
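As a rough illustration of the per-frame vector described above, here is a minimal sketch. It assumes each landmark is an (x, y, z) triple and that a body part the tracker did not detect is zero-filled. Both are assumptions for illustration, as are the names `flatten` and `build_feature_vector`; only the landmark counts come from the design above.

```python
from __future__ import annotations

# Landmark counts quoted in the design above.
HAND_POINTS, FACE_POINTS, POSE_POINTS = 21, 468, 33
COORDS = 3  # assumption: x, y, z per landmark


def flatten(points, expected: int) -> list[float]:
    """Flatten one landmark group; zero-fill when nothing was detected."""
    if points is None:
        return [0.0] * (expected * COORDS)
    if len(points) != expected:
        raise ValueError(f"expected {expected} landmarks, got {len(points)}")
    return [float(c) for p in points for c in p[:COORDS]]


def build_feature_vector(left_hand, right_hand, face, pose) -> list[float]:
    """Concatenate both hands, the face mesh, and the pose into one vector."""
    return (
        flatten(left_hand, HAND_POINTS)
        + flatten(right_hand, HAND_POINTS)
        + flatten(face, FACE_POINTS)
        + flatten(pose, POSE_POINTS)
    )


# (2 * 21 + 468 + 33) landmarks * 3 coordinates = 1629 values per frame.
VECTOR_SIZE = (2 * HAND_POINTS + FACE_POINTS + POSE_POINTS) * COORDS
```

Keeping the vector a fixed length regardless of what was detected means the classifier always sees the same feature layout.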

Temporal Engine: A 15-frame sliding window analyzes the velocity, acceleration, trajectory, and periodicity of hand and nose movements to detect dynamic gestures.
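One way a sliding window like this could be kept is sketched below for a single tracked point. The class name `MotionWindow`, the 2D `(t, x, y)` sample format, and the finite-difference formulas are illustrative assumptions, not the project's actual code.

```python
import math
from collections import deque

WINDOW = 15  # frames, as stated above


class MotionWindow:
    """Keeps the last WINDOW (t, x, y) samples of one tracked point."""

    def __init__(self, size=WINDOW):
        self.samples = deque(maxlen=size)  # old frames fall off automatically

    def push(self, t, x, y):
        self.samples.append((t, x, y))

    def velocities(self):
        """Finite-difference velocity per step, stamped at the step midpoint."""
        s = list(self.samples)
        out = []
        for (t0, x0, y0), (t1, x1, y1) in zip(s, s[1:]):
            dt = t1 - t0
            if dt > 0:  # skip duplicate or out-of-order timestamps
                out.append(((t0 + t1) / 2, (x1 - x0) / dt, (y1 - y0) / dt))
        return out

    def mean_speed(self):
        v = self.velocities()
        return sum(math.hypot(vx, vy) for _, vx, vy in v) / len(v) if v else 0.0

    def mean_acceleration(self):
        """Mean magnitude of the velocity change per unit time."""
        v = self.velocities()
        acc = [
            math.hypot(vx1 - vx0, vy1 - vy0) / (t1 - t0)
            for (t0, vx0, vy0), (t1, vx1, vy1) in zip(v, v[1:])
            if t1 > t0
        ]
        return sum(acc) / len(acc) if acc else 0.0

    def path_length(self):
        """Total distance travelled inside the window (a simple trajectory measure)."""
        s = list(self.samples)
        return sum(
            math.hypot(x1 - x0, y1 - y0)
            for (_, x0, y0), (_, x1, y1) in zip(s, s[1:])
        )
```

Dividing by the real timestamp gap rather than assuming a fixed frame rate keeps the velocities comparable when the webcam drops frames.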

ML Model: KNN (k=3, distance-weighted) with StandardScaler — trained incrementally with 3× augmentation (noise, drift, occlusion simulation).
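To make the classifier concrete, the sketch below is a dependency-free stand-in for that setup: z-score scaling followed by a 3-nearest-neighbour vote weighted by inverse distance. With scikit-learn, the equivalent pieces would be `StandardScaler` and `KNeighborsClassifier(n_neighbors=3, weights="distance")`. The class name `ScaledKNN`, the Euclidean metric, and the exact-match shortcut are assumptions made for illustration.

```python
from __future__ import annotations

import math
from collections import defaultdict


class ScaledKNN:
    """k-nearest neighbours on z-scored features with 1/distance voting."""

    def __init__(self, k: int = 3) -> None:
        self.k = k

    def fit(self, X: list[list[float]], y: list[str]) -> "ScaledKNN":
        n, dims = len(X), len(X[0])
        self.mean = [sum(row[j] for row in X) / n for j in range(dims)]
        # Population standard deviation; constant columns fall back to 1.0.
        self.std = [
            math.sqrt(sum((row[j] - self.mean[j]) ** 2 for row in X) / n) or 1.0
            for j in range(dims)
        ]
        self.X = [self._scale(row) for row in X]
        self.y = list(y)
        return self

    def _scale(self, row):
        return [(v - m) / s for v, m, s in zip(row, self.mean, self.std)]

    def predict(self, row: list[float]) -> str:
        q = self._scale(row)
        nearest = sorted(
            (math.dist(q, x), label) for x, label in zip(self.X, self.y)
        )[: self.k]
        # Simplification: an exact match would get infinite weight, so it wins.
        for d, label in nearest:
            if d == 0.0:
                return label
        votes = defaultdict(float)
        for d, label in nearest:
            votes[label] += 1.0 / d  # closer neighbours count for more
        return max(votes, key=votes.get)
```

Distance weighting matters with a small k: one very close neighbour can outvote two distant ones, which a plain majority vote would not allow.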

03 — Engineering Highlights

Technical Deep Dive

Incremental Learning

The model learns newly recorded gestures without retraining from scratch. Quality scoring and outlier filtering keep bad samples from degrading accuracy.
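Because KNN stores its training samples rather than fitting parameters, incremental learning can be as simple as appending vetted samples to the stored set. The sketch below shows one possible quality gate: a new sample is rejected when its distance to the class centroid is an outlier relative to the samples already stored. The `GestureStore` name, the z-score rule, and both thresholds are hypothetical; the scoring the project actually uses is not described here.

```python
import math
import statistics


class GestureStore:
    """Grows a labelled sample set, rejecting samples far from their class."""

    def __init__(self, z_limit=3.0, min_samples=5):
        self.z_limit = z_limit          # hypothetical rejection threshold
        self.min_samples = min_samples  # below this count, accept everything
        self.samples = {}               # label -> list of feature vectors

    @staticmethod
    def _centroid(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    def quality(self, label, row):
        """z-score of the sample's distance to the class centroid (lower is better)."""
        rows = self.samples.get(label, [])
        if len(rows) < self.min_samples:
            return 0.0  # not enough history to judge
        centre = self._centroid(rows)
        dists = [math.dist(r, centre) for r in rows]
        spread = statistics.pstdev(dists) or 1e-9  # avoid dividing by zero
        return (math.dist(row, centre) - statistics.fmean(dists)) / spread

    def add(self, label, row):
        """Store the sample if it passes the gate; report whether it was kept."""
        if self.quality(label, row) > self.z_limit:
            return False
        self.samples.setdefault(label, []).append(list(row))
        return True
```

If features are standardized before classification, the scaler statistics would also need refreshing as samples accumulate. This sketch leaves that step out.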

Confusion Analysis

After every training session, the engine reports the gesture pairs it most often confuses. Cross-validation scores and a confusion matrix show which gestures need more samples.
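Finding the most confused pairs takes only a few lines once true and predicted labels are available, for example from held-out cross-validation folds. In this minimal sketch the function name and the gesture labels are made up for illustration. In a scikit-learn workflow, `confusion_matrix` and `cross_val_score` would supply the full matrix and the scores.

```python
from collections import Counter


def confused_pairs(y_true, y_pred, top=3):
    """Most frequent unordered (label_a, label_b) mix-ups, with counts."""
    pairs = Counter(
        tuple(sorted((t, p)))  # treat A->B and B->A as the same pair
        for t, p in zip(y_true, y_pred)
        if t != p
    )
    return pairs.most_common(top)


# Hypothetical labels: "hello" and "thanks" are mixed up in both directions.
y_true = ["hello", "hello", "thanks", "thanks", "yes", "yes"]
y_pred = ["hello", "thanks", "hello", "thanks", "yes", "hello"]
print(confused_pairs(y_true, y_pred))
```

Pairs at the top of this list are the natural candidates for recording extra samples.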

Multi-Strategy Augmentation

Each gesture sample is augmented 3× using noise injection, temporal drift simulation, and occlusion masking, which expands small datasets without recording more data.
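The three strategies could look roughly like the sketch below, which treats a sample as a list of frames, each frame being a list of feature values. That layout, the function names, and every magnitude (`sigma`, `max_shift`, `fraction`) are assumptions chosen for illustration.

```python
import random


def add_noise(sample, rng, sigma=0.01):
    """Jitter every value with small Gaussian noise."""
    return [[v + rng.gauss(0.0, sigma) for v in frame] for frame in sample]


def add_drift(sample, rng, max_shift=0.05):
    """Add an offset that grows linearly from 0 at the first frame."""
    shift = rng.uniform(-max_shift, max_shift)
    last = max(len(sample) - 1, 1)
    return [
        [v + shift * i / last for v in frame]
        for i, frame in enumerate(sample)
    ]


def mask_occlusion(sample, rng, fraction=0.2):
    """Zero one contiguous block of features in every frame."""
    width = len(sample[0])
    span = max(1, int(width * fraction))
    start = rng.randrange(width - span + 1)
    return [
        [0.0 if start <= j < start + span else v for j, v in enumerate(frame)]
        for frame in sample
    ]


def augment(sample, seed=None):
    """Return three new variants of one sample; the original is left untouched."""
    rng = random.Random(seed)
    return [
        add_noise(sample, rng),
        add_drift(sample, rng),
        mask_occlusion(sample, rng),
    ]
```

Passing a seed makes the augmented set reproducible, which helps when comparing accuracy between training runs.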

04 — Feature Vector

Temporal Descriptor

Velocity L/R
Acceleration L/R
Nose Velocity
Trajectory L/R
Periodicity
Δt Stats
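Two of the descriptor entries above, periodicity and the Δt statistics, can be sketched as follows. Autocorrelation is one common way to score periodicity and is used here as an assumption, since the method the engine actually uses is not stated. The function names and the returned keys are hypothetical too.

```python
import statistics


def periodicity(signal, min_lag=2):
    """Highest normalised autocorrelation over lags >= min_lag, as (score, lag)."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [v - mean for v in signal]
    energy = sum(c * c for c in centred)
    if energy == 0.0:
        return 0.0, 0  # a constant signal has no rhythm
    best_score, best_lag = 0.0, 0
    for lag in range(min_lag, n // 2 + 1):
        r = sum(centred[i] * centred[i + lag] for i in range(n - lag)) / energy
        if r > best_score:
            best_score, best_lag = r, lag
    return best_score, best_lag


def dt_stats(timestamps):
    """Summary of the gaps between consecutive frame timestamps."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return {
        "mean": statistics.fmean(gaps),
        "std": statistics.pstdev(gaps),
        "max": max(gaps),
    }
```

A waving hand should produce a high score at a lag near its period, while a one-off swipe should score low. The Δt statistics flag windows where dropped frames make the velocity estimates less trustworthy.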

Explore Related Projects

Face Track Pro uses the same CV/AI stack for automated attendance — real-time face recognition across full classrooms.