Object Detection and Segmentation Pipeline
Built a complete computer vision pipeline for detecting and segmenting household products using YOLOv8, Faster R-CNN, and YOLOv8-Seg, including custom dataset creation, model training, and ONNX deployment.
Duration: Spring 2025
Role: Developer
Institution: NTNU
Status: Completed
Technologies Used
Overview
This project, Module 3 of AIS4002 Intelligent Machines, developed an end-to-end computer vision pipeline for detecting and segmenting seven custom household objects: toothpaste, toothbrush, Nivea cream, L'Oréal Men shower gel, mug, Milo, and Nescafé. The work covered four stages: building a 143-image custom dataset with polygon annotations, training one-stage (YOLOv8) and two-stage (Faster R-CNN) detectors, training YOLOv8-Seg for instance segmentation, and interpreting the models with Grad-CAM visualizations.
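Because the same polygon annotations feed both the detectors and the segmentation model, each polygon has to be turned into two kinds of label. The sketch below shows one way this conversion could look, assuming the Ultralytics YOLO text format (normalized `class cx cy w h` for detection, normalized `class x1 y1 x2 y2 ...` for segmentation). The class order in `CLASS_NAMES` and the function names are hypothetical; the project's actual export path from the annotation tool is not described here.

```python
from typing import Sequence, Tuple

# Hypothetical class order: the objects are listed in the overview,
# but their numeric label IDs are an assumption made for this sketch.
CLASS_NAMES = [
    "toothpaste", "toothbrush", "nivea_cream",
    "loreal_shower_gel", "mug", "milo", "nescafe",
]

Point = Tuple[float, float]


def polygon_to_bbox(polygon: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Return the tight axis-aligned box (x_min, y_min, x_max, y_max) in pixels."""
    if len(polygon) < 3:
        raise ValueError("a polygon needs at least three points")
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def detection_label(class_id: int, polygon: Sequence[Point],
                    img_w: int, img_h: int) -> str:
    """Build a YOLO detection line: class, box centre and size, all normalized."""
    x_min, y_min, x_max, y_max = polygon_to_bbox(polygon)
    cx = (x_min + x_max) / 2 / img_w
    cy = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


def segmentation_label(class_id: int, polygon: Sequence[Point],
                       img_w: int, img_h: int) -> str:
    """Build a YOLO segmentation line: class followed by normalized x, y pairs."""
    coords = []
    for x, y in polygon:
        coords.append(f"{x / img_w:.6f}")
        coords.append(f"{y / img_h:.6f}")
    return f"{class_id} " + " ".join(coords)


# Illustrative polygon (a rectangle) on a 640x480 image; not taken from the dataset.
example_polygon = [(100, 200), (300, 200), (300, 400), (100, 400)]
mug_id = CLASS_NAMES.index("mug")
print(detection_label(mug_id, example_polygon, 640, 480))
print(segmentation_label(mug_id, example_polygon, 640, 480))
```

Deriving the box from the polygon keeps the detection and segmentation labels consistent with each other, so one annotation pass can serve all three models.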
Problem Statement
Develop a custom object detection and segmentation system for objects not present in standard benchmark datasets (COCO, LVIS). Compare one-stage and two-stage detection architectures and deploy the best model in a hardware-agnostic format (ONNX).
Challenges & Solutions
| Challenge | Solution | Outcome |
|---|---|---|
| Small Custom Dataset | Used transfer learning from pre-trained models and careful data augmentation | Achieved good detection performance with only 143 images |
| Speed vs. Accuracy Trade-off | Compared YOLOv8 (one-stage) against Faster R-CNN (two-stage) | YOLOv8 achieved 6× faster inference with a minor precision trade-off |
| Annotation Quality | Used CVAT for precise polygon masks with multiple verification modes | High-quality annotations for both detection and segmentation |
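
A speed comparison like the one in the table can be made with a small timing harness. The following is a minimal sketch, not the project's actual benchmark: `mean_latency_ms` and `speedup` are hypothetical helpers, the `infer` callable stands in for a single forward pass of either model, and the numbers in the usage lines are illustrative rather than measured results.

```python
import time
from statistics import mean
from typing import Callable


def mean_latency_ms(infer: Callable[[], object],
                    warmup: int = 3, runs: int = 20) -> float:
    """Average wall-clock time of one call to `infer`, in milliseconds.

    Warm-up calls are discarded so that one-off costs (lazy
    initialization, cache fills) do not distort the average.
    """
    for _ in range(warmup):
        infer()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        timings.append((time.perf_counter() - start) * 1000.0)
    return mean(timings)


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """How many times faster the candidate is than the baseline."""
    if candidate_ms <= 0:
        raise ValueError("candidate latency must be positive")
    return baseline_ms / candidate_ms


# Illustrative latencies only: a 60 ms baseline and a 10 ms candidate
# correspond to a 6x speedup.
print(speedup(60.0, 10.0))
```

In practice each `infer` callable would wrap one model's prediction on the same input image and the same hardware, so that the ratio reflects the architectures and not the test conditions.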