Gesture Recognition
ML Engine
A real-time gesture recognition system using MediaPipe body tracking (hands, face mesh, pose) combined with a custom-trained KNN model. Features incremental learning, data augmentation, and temporal dynamics analysis.
Real-time Hand + Body Gesture Recognizer
Tracking points: Hands + Face + Pose
ML Model: KNN (k=3)
Learning: Incremental
Status: Completed
Why I Built This
Approximately 250,000 people in Kurdistan and Iraq live with hearing or speech difficulties. Most communication tools for this community require expensive hardware or specialized gloves.
This project explores a software-only approach — using a standard webcam to recognize sign language gestures in real time, making accessibility tools available without any special hardware.
System Design
Tracking: MediaPipe Hands (21 landmarks × 2), Face Mesh (468 points), and Pose estimation (33 joints) — combined into a unified feature vector per frame.
Temporal Engine: A 15-frame sliding window tracks the velocity, acceleration, trajectory, and periodicity of hand and nose movements to detect dynamic gestures.
ML Model: KNN (k=3, distance-weighted) with StandardScaler — trained incrementally with 3× augmentation (noise, drift, occlusion simulation).
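As a rough illustration of the tracking and model stages above, the sketch below flattens the landmarks into one vector per frame and classifies it with a standardized, distance-weighted KNN. It is written in plain Python so it runs without dependencies; the real project presumably relies on scikit-learn's `StandardScaler` and `KNeighborsClassifier(n_neighbors=3, weights="distance")`. The names `build_feature_vector` and `ScaledKNN`, the use of three coordinates (x, y, z) per landmark, and the zero-filling of undetected parts are assumptions made for this example, not details taken from the project.

```python
import math
from collections import defaultdict

# Landmark counts come from the text; 3 coordinates per landmark is an assumption.
HAND_POINTS, FACE_POINTS, POSE_POINTS = 21 * 2, 468, 33
FEATURE_DIM = (HAND_POINTS + FACE_POINTS + POSE_POINTS) * 3


def build_feature_vector(hands, face, pose):
    """Flatten (x, y, z) landmarks into one vector; missing points become zeros."""
    vector = []
    parts = ((hands, HAND_POINTS), (face, FACE_POINTS), (pose, POSE_POINTS))
    for points, expected in parts:
        points = list(points)[:expected]
        points += [(0.0, 0.0, 0.0)] * (expected - len(points))
        for x, y, z in points:
            vector.extend((x, y, z))
    return vector


class ScaledKNN:
    """Distance-weighted k-nearest-neighbours on standardized features."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        n, dims = len(X), len(X[0])
        self.mean = [sum(row[d] for row in X) / n for d in range(dims)]
        self.std = []
        for d in range(dims):
            var = sum((row[d] - self.mean[d]) ** 2 for row in X) / n
            self.std.append(math.sqrt(var) or 1.0)  # constant features keep scale 1
        self.X = [self._scale(row) for row in X]
        self.y = list(y)
        return self

    def _scale(self, row):
        return [(v - m) / s for v, m, s in zip(row, self.mean, self.std)]

    def predict(self, row):
        query = self._scale(row)
        neighbours = sorted(
            (math.dist(query, x), label) for x, label in zip(self.X, self.y)
        )
        votes = defaultdict(float)
        for dist, label in neighbours[: self.k]:
            votes[label] += 1.0 / (dist + 1e-9)  # closer neighbours count more
        return max(votes, key=votes.get)
```

In use, each frame's vector would come from `build_feature_vector(...)`, and `ScaledKNN(k=3).fit(vectors, labels).predict(new_vector)` would return the winning gesture label. Distance weighting lets one very close neighbour outvote two distant ones, which helps when classes have few samples.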
Technical Deep Dive
Incremental Learning
The model learns newly recorded gestures without retraining from scratch. Each new sample is quality-scored and outliers are filtered out, so bad recordings are less likely to degrade accuracy.
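One way incremental updates with outlier filtering could work for a KNN model is sketched below: because KNN simply stores its training samples, "training" on new gestures amounts to appending vetted samples to the store. The class name `IncrementalGestureStore`, the z-score rule, and the thresholds are hypothetical choices for illustration; the project's actual quality score is not described in the text.

```python
import math
from statistics import mean, pstdev


class IncrementalGestureStore:
    """Grows a KNN training set batch by batch, skipping likely outliers."""

    def __init__(self, z_threshold=2.5, min_samples=5):
        self.samples = {}  # label -> list of feature vectors
        self.z_threshold = z_threshold
        self.min_samples = min_samples

    def _centroid(self, label):
        return [mean(column) for column in zip(*self.samples[label])]

    def is_outlier(self, label, vector):
        rows = self.samples.get(label, [])
        if len(rows) < self.min_samples:
            return False  # too little data to judge, so accept the sample
        centroid = self._centroid(label)
        dists = [math.dist(row, centroid) for row in rows]
        spread = pstdev(dists) or 1e-9  # avoid dividing by zero
        z = (math.dist(vector, centroid) - mean(dists)) / spread
        return z > self.z_threshold

    def add_batch(self, label, vectors):
        """Append new recordings for one gesture; return how many were kept."""
        kept = 0
        for vector in vectors:
            if not self.is_outlier(label, vector):
                self.samples.setdefault(label, []).append(list(vector))
                kept += 1
        return kept
```

A sample is rejected only when it lies much farther from its class centroid than the existing samples do, so a new gesture class can still be bootstrapped from its first few recordings.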
Confusion Analysis
After every training session, the system tracks which gesture pairs are most often confused with each other. Cross-validation scores and a confusion matrix show which gestures need more samples.
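A minimal sketch of this kind of confusion tracking, assuming the true and predicted labels come from held-out cross-validation folds. The function names and the example gesture labels are invented for illustration.

```python
from collections import Counter


def confusion_counts(true_labels, predicted_labels):
    """Count (true, predicted) pairs; entries where the two differ are mistakes."""
    return Counter(zip(true_labels, predicted_labels))


def most_confused_pairs(true_labels, predicted_labels, top=3):
    """Rank unordered gesture pairs by how often one is mistaken for the other."""
    pairs = Counter()
    counts = confusion_counts(true_labels, predicted_labels)
    for (truth, predicted), n in counts.items():
        if truth != predicted:
            pairs[tuple(sorted((truth, predicted)))] += n
    return pairs.most_common(top)


# Hypothetical labels: "hello" and "thanks" are mixed up in both directions.
truth = ["hello", "hello", "thanks", "thanks", "yes"]
guess = ["hello", "thanks", "hello", "thanks", "yes"]
print(most_confused_pairs(truth, guess))
```

Pairs are treated as unordered here so that "hello mistaken for thanks" and "thanks mistaken for hello" add up to one score; the gestures in the top pairs are the ones to record more samples for.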
Multi-Strategy Augmentation
Each gesture sample is augmented 3× using noise injection, temporal drift simulation, and occlusion masking, which expands small datasets without additional data collection.
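The three strategies might look roughly like the sketch below, which treats a sample as a list of per-frame feature vectors and returns one augmented copy per strategy. The function names, the noise level, the drift range, the masked fraction, and the reading of "3×" as three extra copies are all assumptions for this example.

```python
import random


def add_noise(frames, rng, sigma=0.01):
    """Jitter every value with small Gaussian noise (sensor noise)."""
    return [[v + rng.gauss(0.0, sigma) for v in frame] for frame in frames]


def add_drift(frames, rng, max_shift=0.05):
    """Add an offset that grows linearly over the sequence (slow drift)."""
    shift = rng.uniform(-max_shift, max_shift)
    steps = max(len(frames) - 1, 1)
    return [
        [v + shift * i / steps for v in frame]
        for i, frame in enumerate(frames)
    ]


def mask_occlusion(frames, rng, fraction=0.2):
    """Zero out one contiguous block of features in every frame."""
    dims = len(frames[0])
    width = max(1, int(dims * fraction))
    start = rng.randrange(dims - width + 1)
    return [
        [0.0 if start <= d < start + width else v for d, v in enumerate(frame)]
        for frame in frames
    ]


def augment(frames, seed=None):
    """Return three augmented copies of one gesture sample."""
    rng = random.Random(seed)
    return [
        add_noise(frames, rng),
        add_drift(frames, rng),
        mask_occlusion(frames, rng),
    ]
```

Passing a `seed` makes the augmented copies reproducible, which helps when comparing cross-validation scores between training sessions.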
Temporal Descriptor
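As a rough sketch of the descriptor outlined under System Design (velocity, acceleration, trajectory, and periodicity over a 15-frame window), the example below keeps the recent positions of one tracked point and summarizes its motion. The class name `TemporalWindow`, the choice of features, and the use of direction reversals as a periodicity cue are assumptions; the project's actual descriptor may differ.

```python
import math
from collections import deque

WINDOW = 15  # frames, as stated in the System Design section


class TemporalWindow:
    """Keeps the last WINDOW positions of one tracked point (e.g. wrist or nose)."""

    def __init__(self, size=WINDOW):
        self.points = deque(maxlen=size)

    def push(self, x, y):
        self.points.append((x, y))

    def describe(self):
        pts = list(self.points)
        if len(pts) < 3:
            return None  # acceleration needs at least three frames
        vel = [(b[0] - a[0], b[1] - a[1]) for a, b in zip(pts, pts[1:])]
        acc = [(b[0] - a[0], b[1] - a[1]) for a, b in zip(vel, vel[1:])]
        path = sum(math.hypot(*v) for v in vel)   # total distance travelled
        net = math.dist(pts[0], pts[-1])          # start-to-end displacement
        # Sign flips of the x velocity: a crude periodicity cue (assumption).
        reversals = sum(1 for a, b in zip(vel, vel[1:]) if a[0] * b[0] < 0)
        return {
            "mean_speed": path / len(vel),
            "mean_accel": sum(math.hypot(*a) for a in acc) / len(acc),
            "straightness": net / path if path else 1.0,
            "x_reversals": reversals,
        }
```

Under these definitions, a steady swipe gives a straightness near 1 with no reversals, while a waving hand gives a low straightness and many reversals, which is the kind of contrast a dynamic-gesture detector can key on.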
Explore Related Projects
Face Track Pro uses the same CV/AI stack for automated attendance — real-time face recognition across full classrooms.