What is IoU?
IoU stands for Intersection over Union. It is a simple numerical score that tells you how much two shapes (usually rectangles or masks) overlap with each other. The score ranges from 0 (no overlap) to 1 (perfect overlap).
Let's break it down
Start with the two shapes you want to compare, for example a box predicted by a model and the true box drawn by a human. Then work out three quantities:
- Intersection: Find the area where the two shapes cover the same space.
- Union: Find the total area covered by at least one of the shapes. Equivalently, add the two areas and subtract the intersection, so the overlapping part is counted only once.
- IoU: Divide the intersection area by the union area (IoU = Intersection ÷ Union). The result is a single number that summarizes the overlap.
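The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes axis-aligned boxes written as (x_min, y_min, x_max, y_max), and the function name `box_iou` and the sample coordinates are made up for the example.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    # Intersection: width and height of the overlap, clamped at zero
    # so that boxes that do not touch contribute no area.
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    # Union: both areas added, with the overlap counted only once.
    union = area_a + area_b - intersection

    # Two zero-area boxes have no defined overlap; this sketch returns 0.0.
    return intersection / union if union > 0 else 0.0


predicted = (2, 2, 6, 6)      # hypothetical model output
ground_truth = (4, 4, 8, 8)   # hypothetical human annotation
print(box_iou(predicted, ground_truth))
```

Here the two 4×4 boxes share a 2×2 corner, so the intersection is 4, the union is 16 + 16 − 4 = 28, and the score is 4 ÷ 28, roughly 0.14.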
Why does it matter?
IoU gives a quick, objective way to judge how accurate a computer‑vision model is at locating objects. If the IoU is high, the model’s prediction is close to the real object. Researchers and engineers use it to compare different models, set performance targets, and decide whether a detection is “good enough” for a particular application.
Where is it used?
- Object detection (e.g., detecting cars, people, or animals in photos)
- Image segmentation (pixel‑level labeling of objects)
- Video tracking (checking if a tracked box stays on the same object)
- Robotics and autonomous driving (verifying that perceived obstacles match reality)
- Any system that needs to measure spatial agreement between predicted and true regions.
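For segmentation, the same ratio is computed over pixels rather than box areas. The sketch below assumes binary masks stored as equally sized nested lists of 0 and 1; the name `mask_iou` and the tiny example masks are hypothetical, and real pipelines would normally use array libraries instead of plain lists.

```python
def mask_iou(mask_a, mask_b):
    """IoU of two binary masks given as equally sized nested lists of 0/1."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:      # pixel belongs to both masks
                intersection += 1
            if a or b:       # pixel belongs to at least one mask
                union += 1
    # Two empty masks have no defined overlap; this sketch returns 0.0.
    return intersection / union if union > 0 else 0.0


predicted_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
true_mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
print(mask_iou(predicted_mask, true_mask))
```

The masks share 2 pixels and together cover 6, so the score is 2 ÷ 6, about 0.33.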
Good things about it
- Simple: Only a few arithmetic steps are needed.
- Intuitive: A higher number directly means better overlap.
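- Scale‑independent: Enlarging or shrinking both shapes by the same factor leaves the score unchanged, so the same thresholds can be applied to small and large objects.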
- Widely adopted: Most benchmark datasets and papers use IoU, making results easy to compare.
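- Differentiable: IoU can serve as a training loss for deep‑learning models. Plain IoU gives no useful gradient when the shapes do not overlap at all, so training usually relies on variants such as GIoU, DIoU, or CIoU.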
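The scale-independence point can be checked numerically. The sketch below is illustrative only: `box_iou` and `scale` are hypothetical helpers, boxes are assumed to be (x_min, y_min, x_max, y_max), and the scaling factor of 8 is arbitrary.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - intersection
    return intersection / union if union > 0 else 0.0


def scale(box, factor):
    """Multiply every coordinate of a box by the same factor."""
    return tuple(coordinate * factor for coordinate in box)


small_pred, small_true = (0, 0, 10, 10), (2, 0, 12, 10)
factor = 8  # arbitrary enlargement applied to both boxes

iou_small = box_iou(small_pred, small_true)
iou_large = box_iou(scale(small_pred, factor), scale(small_true, factor))
print(iou_small, iou_large)
```

Both calls give 2/3: the intersection and the union each grow by the square of the factor, so their ratio does not change.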
Not-so-good things
- Sensitive to size: A misalignment of a few pixels lowers the IoU of a small object far more than that of a large one, so models are penalized harshly on small objects.
- Ignores shape details: Two predictions with the same IoU can be wrong in very different ways, for example one shifted sideways and another correctly centered but too large.
- Threshold choice: Deciding what IoU value counts as a “correct” detection (e.g., 0.5 vs 0.75) can be arbitrary and affect reported performance.
- Not perfect for non‑rectangular shapes: For complex masks, boundary pixels are often a small share of the total area, so a rough or slightly shifted outline may barely change the IoU.
- Can be misleading: IoU measures location only. A high IoU does not guarantee that the prediction is semantically correct (e.g., a box that tightly covers an object but carries the wrong class label).
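The "Sensitive to size" and "Threshold choice" points can be seen together in a small experiment. This is a sketch under stated assumptions: square boxes in (x_min, y_min, x_max, y_max) form, a prediction shifted 3 pixels to the right, and the 0.5 and 0.75 thresholds mentioned above. The helper names `box_iou` and `shifted_iou` are hypothetical.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - intersection
    return intersection / union if union > 0 else 0.0


def shifted_iou(side, shift):
    """IoU of a square box with the given side and a copy moved right by `shift`."""
    true_box = (0, 0, side, side)
    predicted_box = (shift, 0, side + shift, side)
    return box_iou(true_box, predicted_box)


for side in (10, 100):
    iou = shifted_iou(side, shift=3)
    # Columns: box side, IoU, passes at 0.5, passes at 0.75
    print(side, round(iou, 3), iou >= 0.5, iou >= 0.75)
```

With the same 3-pixel error, the 10-pixel box scores about 0.54 and the 100-pixel box about 0.94. The small box counts as correct at a 0.5 threshold but as a miss at 0.75, which is how the threshold choice changes reported performance.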