DepthAI YuNet: Fast Face Detection on OAK Cameras
Hands-on review of DepthAI YuNet, an open-source project for running the lightweight YuNet face detection model on Luxonis OAK cameras via DepthAI.
Running a neural network on the edge can be a pain — especially with face detection, where you need both speed and accuracy. DepthAI YuNet solves that by combining the lightweight YuNet model with Luxonis' DepthAI hardware (OAK‑1, OAK‑D). This open‑source Python project runs real‑time face detection directly on the camera's Myriad X VPU, offloading processing from your main computer. Because inference stays on‑device, video data never leaves the camera — a privacy win for sensitive applications.
What is DepthAI YuNet?
DepthAI YuNet is a wrapper that runs the YuNet face detection model on DepthAI cameras. YuNet is a compact, efficient model from OpenCV that reaches an average precision of 0.834 on the WIDER Face validation set (Easy subset). The project provides pre‑compiled models (blobs) for several resolutions, along with scripts to generate custom ones. It supports two modes: Host mode (postprocessing on the host CPU) and Edge mode (postprocessing on the device), giving developers the flexibility to trade off CPU usage against latency.
How It Works: Host vs. Edge Mode
The architecture is straightforward:
- Two models: the YuNet ONNX model, converted to OpenVINO IR and then to a Myriad X blob, plus a separate postprocessing model used in Edge mode that performs score filtering and non‑maximum suppression (NMS).
- Input resolution: blobs are fixed‑size (e.g., 180×320, 270×480). You choose a resolution that balances speed and accuracy.
| Resolution | FPS (OAK‑D, Host mode) | FPS (OAK‑D, Edge mode) |
|---|---|---|
| 180×320 | ~30 | ~35 |
| 270×480 | ~20 | ~25 |
| 360×640 | ~15 | ~18 |
- Host mode: DepthAI sends raw outputs via USB; the host CPU runs NMS. Simple to debug but uses CPU cycles.
- Edge mode: Postprocessing runs on the same VPU, reducing USB data transfer. Requires the second model blob. Best for latency‑sensitive applications.
- Model generation: a Docker container with PINTO's openvino2tensorflow tools converts ONNX to blob. The repo includes scripts to regenerate postprocessing models with custom thresholds.
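The postprocessing that Host mode runs on the CPU (and Edge mode moves into the second blob) boils down to score filtering followed by NMS. Here is a minimal NumPy sketch of that step — an illustration of the general technique, not the project's actual code:

```python
import numpy as np

def filter_detections(boxes, scores, score_thresh=0.6, iou_thresh=0.3):
    """Keep boxes above score_thresh, then greedily suppress overlaps (NMS).

    boxes: (N, 4) array of [x1, y1, x2, y2]; scores: (N,) confidences.
    Returns indices (into the original arrays) of the surviving boxes.
    """
    idxs = np.flatnonzero(scores >= score_thresh)
    # Process remaining candidates in order of descending confidence.
    idxs = idxs[np.argsort(-scores[idxs])]
    survivors = []
    while idxs.size > 0:
        best = idxs[0]
        survivors.append(best)
        rest = idxs[1:]
        # Intersection-over-union of the best box against the rest.
        x1 = np.maximum(boxes[best, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[best, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[best, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[best, 3], boxes[rest, 3])
        inter = np.maximum(0, x2 - x1) * np.maximum(0, y2 - y1)
        area_best = (boxes[best, 2] - boxes[best, 0]) * (boxes[best, 3] - boxes[best, 1])
        area_rest = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_best + area_rest - inter)
        # Drop any box that overlaps the winner too much.
        idxs = rest[iou < iou_thresh]
    return survivors
```

Edge mode's advantage is simply that this filtering happens before the USB link, so only a handful of surviving boxes — rather than every raw candidate — crosses to the host.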
Quick Start and Usage
First, install dependencies and clone the repo:
git clone https://github.com/geaxgx/depthai_yunet.git
cd depthai_yunet
python3 -m pip install -r requirements.txt
To run face detection using the internal camera in Host mode:
python3 demo.py
To use a video file instead:
python3 demo.py -i path/to/video.mp4
For Edge mode with lower latency:
python3 demo.py -e
Press keys in the OpenCV window to toggle bounding boxes, landmarks, scores, and FPS overlay.
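Because the blobs run at a fixed resolution, detections come back in model‑input coordinates and must be scaled to the display frame before drawing. The demo handles this for you; a sketch of the mapping, for anyone building on the raw outputs (illustrative, not taken from the repo):

```python
def scale_box(box, model_res, frame_res):
    """Map an (x, y, w, h) box from model-input pixels to frame pixels.

    model_res and frame_res are (height, width) tuples,
    e.g. a 180x320 blob feeding a 720x1280 preview frame.
    """
    mh, mw = model_res
    fh, fw = frame_res
    x, y, w, h = box
    # Scale horizontal coordinates by width ratio, vertical by height ratio.
    return (x * fw / mw, y * fh / mh, w * fw / mw, h * fh / mh)
```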
Real‑world example
Detect faces in a recorded video and save the output:
python3 demo.py -i input_video.mp4 -o output_video.avi
For higher accuracy, switch to a larger model resolution:
python3 demo.py -i input_video.mp4 -mr 270x480 -o output.avi
For a real‑time application, use the internal camera and tweak --internal_fps and --model_resolution to balance FPS and detection quality. The default model (180×320) runs at ~30 FPS on OAK‑D.
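A fixed‑size blob also means arbitrary frames have to be fitted to the model input somehow. A common approach is letterboxing: scale to fit while preserving aspect ratio, then pad the remainder. The arithmetic below is a sketch of that general technique — an assumption about the preprocessing, not code lifted from the repo:

```python
def letterbox_params(frame_hw, model_hw):
    """Compute scale and symmetric padding to fit a frame into the model input.

    frame_hw and model_hw are (height, width) tuples.
    Returns (scale, (new_w, new_h), (pad_x, pad_y)).
    """
    fh, fw = frame_hw
    mh, mw = model_hw
    # Pick the factor that fits both dimensions without distortion.
    scale = min(mh / fh, mw / fw)
    new_h, new_w = round(fh * scale), round(fw * scale)
    pad_x = (mw - new_w) // 2   # left/right padding in pixels
    pad_y = (mh - new_h) // 2   # top/bottom padding in pixels
    return scale, (new_w, new_h), (pad_x, pad_y)
```

For example, a 480×640 webcam frame fed to the 180×320 blob is scaled by 0.375 to 180×240 and padded by 40 pixels on each side.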
Pros, Cons, and Alternatives
Pros
- Fast inference on the VPU, freeing up host CPU.
- Edge mode eliminates USB bandwidth bottlenecks.
- Pre‑built models for common resolutions; easy to generate custom ones.
- Clean Python API with OpenCV integration.
- Solid documentation and CLI help.
Cons
- Only works with DepthAI‑compatible hardware (OAK series).
- Fixed input resolutions; no dynamic sizing.
- Small community; maintenance appears inactive (last commit January 2022).
- Postprocessing model generation requires extra tooling (Docker, PyTorch).
Alternatives
- OpenCV DNN with YuNet: Runs on CPU/GPU, no DepthAI needed. Slower on edge but works anywhere OpenCV runs.
- MediaPipe Face Detection: Google's cross‑platform solution, optimized for mobile/edge. No DepthAI dependency, but not VPU‑accelerated.
- Intel Distribution of OpenVINO: Run any model on Intel hardware (CPU, GPU, VPU). More flexible but steeper learning curve.
Verdict: Should You Use It?
If you own an OAK camera and need reliable, real‑time face detection without hogging your host CPU, DepthAI YuNet is a no‑brainer. It's a polished, minimal project that just works. Skip it if you don't have DepthAI hardware, need dynamic input sizes, or require active development — consider OpenCV or MediaPipe instead. For its target niche, it's a solid choice.