ESP32-CAM RC Car — Hand-Gesture Controlled (Team Project)
5/15/2025 · 1 min read
With a teammate, we built one gesture-recognition app that could control two different devices: my wireless RC car and his bionic hand. I focused on the car: live video from an ESP32-CAM, servo steering + motor control, and a Python app that translated MediaPipe hand gestures into HTTP commands. It was fast, surprisingly reliable, and a nice proof that one software pipeline can drive multiple hardware targets.
Team & Goal
Team: 2 people
Goal: Use one gesture-recognition app to control two separate hardware systems (an RC car and a 10-servo bionic hand), showing the approach scales across devices.
My Role
Built the car:
ESP32-CAM firmware (video streaming + motor/servo control endpoints)
Python control app & UI, networking and test tooling
Worked with my teammate to keep the gesture → action schema consistent for both devices
How it works
Webcam → Python app (MediaPipe Hands) counts visible fingers
Gesture map: e.g., Fist = stop, One finger = forward, Two = reverse, Three/Four = left/right, Open palm = center/neutral (sketched in code after this list)
App sends HTTP commands to the ESP32-CAM (/control?dir=w/a/s/d/x/c)
Tkinter UI shows two live panels: the ESP32 video stream and the gesture overlay
Small “freeze window” after each command to avoid rapid, repeated inputs
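Here's a minimal sketch of the detection side. It assumes the common fingertip-above-PIP-joint heuristic for counting fingers and a WASD-style letter mapping for /control?dir= (w/s/a/d = forward/reverse/left/right, x/c = stop/center); the exact logic in Car_software.py may differ.

```python
import cv2
import mediapipe as mp

# Finger count -> command letter for /control?dir=...
# The letter assignment is an assumption based on the gesture map above.
GESTURES = {0: "x", 1: "w", 2: "s", 3: "a", 4: "d", 5: "c"}

def count_fingers(hand):
    """Count extended fingers from MediaPipe's 21 hand landmarks."""
    lm = hand.landmark
    count = 0
    for tip in (8, 12, 16, 20):        # index, middle, ring, pinky tips
        if lm[tip].y < lm[tip - 2].y:  # tip above its PIP joint = extended
            count += 1
    if abs(lm[4].x - lm[2].x) > 0.05:  # crude thumb-spread check
        count += 1
    return count

hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.7)
cap = cv2.VideoCapture(0)  # operator's webcam
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        cmd = GESTURES.get(count_fingers(results.multi_hand_landmarks[0]))
        # hand cmd off to the debounced sender (sketched further below)
```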
Car_software.py used MediaPipe, OpenCV, Tkinter, and requests; it rotated/flipped frames for correct orientation and debounced commands with a short cooldown for stability (sketched below).
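A sketch of that sender side, with the cooldown behavior from the list above. The car's address, the 0.6 s cooldown, and the rotation direction are placeholders; those depend on the network setup and how the camera was mounted.

```python
import time
import cv2
import requests

CAR_URL = "http://192.168.4.1/control"  # hypothetical car address
COOLDOWN_S = 0.6                        # the "freeze window" after each command

last_cmd, last_sent = None, 0.0

def orient(frame):
    # Rotate/flip so the displayed stream matches the car's real orientation.
    return cv2.flip(cv2.rotate(frame, cv2.ROTATE_90_CLOCKWISE), 1)

def send_command(cmd):
    """Send only distinct commands, at most one per cooldown window."""
    global last_cmd, last_sent
    now = time.time()
    if cmd is None or cmd == last_cmd or now - last_sent < COOLDOWN_S:
        return
    try:
        requests.get(CAR_URL, params={"dir": cmd}, timeout=0.5)
        last_cmd, last_sent = cmd, now
    except requests.RequestException:
        pass  # drop this command; the next gesture will retry
```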
Hardware & Stack
ESP32-CAM (camera + Wi-Fi), motor driver, micro-servo for steering
Chassis: simple, robust build (3D-printed mounts, TT motors)
Firmware: tiny HTTP handler for drive/steer commands + MJPEG stream (see the test sketch after this list)
Control app: Python (OpenCV, MediaPipe, Tkinter UI)
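As part of the test tooling, a quick smoke test can exercise every command letter and confirm the stream is live. The address and the /stream path are assumptions here; the post only documents /control.

```python
import requests

ESP32 = "http://192.168.4.1"  # hypothetical address on the shared Wi-Fi

# Hit every drive/steer command the firmware handles.
for d in "wasdxc":
    r = requests.get(f"{ESP32}/control", params={"dir": d}, timeout=1)
    print(f"dir={d}: HTTP {r.status_code}")

# Confirm the MJPEG stream answers by reading one chunk.
with requests.get(f"{ESP32}/stream", stream=True, timeout=2) as r:
    print("stream:", r.status_code, len(next(r.iter_content(1024))), "bytes")
```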
Challenges → Solutions
Noisy gesture detection: Added debounce + cooldown and mapped only distinct finger-count changes to commands.
Network hiccups: On failed frames, the UI kept the last good image, so the operator wasn’t “flying blind” (pattern sketched after this list).
Human factors: Tuned steering angles and motor PWM so it felt smooth, not twitchy.
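The “last good image” fallback is a small pattern, roughly like this (stream URL and names are illustrative):

```python
import cv2

STREAM_URL = "http://192.168.4.1/stream"  # hypothetical MJPEG stream URL
cap = cv2.VideoCapture(STREAM_URL)
last_good = None                          # most recent decoded frame

def next_display_frame():
    """Return the newest frame, falling back to the last good one."""
    global last_good
    ok, frame = cap.read()
    if ok and frame is not None:
        last_good = frame
    return last_good  # stays on the last image during network hiccups
```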
Teamwork Wins
We agreed on a shared gesture schema and kept a simple API contract so the same app could drive both devices.
Paired testing: while I tuned car control, my teammate matched servo motions on the hand. Great practice in versioning, quick feedback, and clear commit messages.
Outcome
Reliable demo: smooth live video + responsive gesture control
Reusability: the same app controlled two hardware systems with zero code forks
What I’d improve: tracked-gesture “trails” for speed control, obstacle sensors, and a safer “arm/disarm” state machine