AI/ML Completed

Object Detection and Segmentation Pipeline

Built a complete computer vision pipeline for detecting and segmenting household products using YOLOv8, Faster R-CNN, and YOLOv8-Seg, including custom dataset creation, model training, and ONNX deployment.

Duration: Spring 2025
Role: Developer
Institution: NTNU
Status: Completed

Technologies Used

Python, PyTorch, YOLOv8, Faster R-CNN, CVAT, Roboflow, ONNX, Grad-CAM, OpenCV

Overview

This AIS4002 Intelligent Machines Module 3 project developed an end-to-end computer vision pipeline for detecting and segmenting seven custom household objects (toothpaste, toothbrush, Nivea cream, L'Oréal Men shower gel, mug, Milo, Nescafé). The work covered creating a 143-image custom dataset with polygon annotations, training both one-stage (YOLOv8) and two-stage (Faster R-CNN) detectors, performing instance segmentation with YOLOv8-Seg, and interpreting the models with Grad-CAM visualizations.
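
Because the dataset is annotated with polygons while the detectors train on boxes, one natural preprocessing step is to derive a bounding box from each polygon. The sketch below shows that conversion into the normalized `class x_center y_center width height` line used by YOLO-style label files. The function names, the pixel-coordinate polygon input, and the class id `0` are illustrative assumptions, not taken from the project.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in pixels


def polygon_to_yolo_box(
    polygon: List[Point], img_w: int, img_h: int
) -> Tuple[float, float, float, float]:
    """Return the tight box around a polygon as normalized (xc, yc, w, h)."""
    if len(polygon) < 3:
        raise ValueError("a polygon needs at least three vertices")
    # Clamp vertices to the image so labels never fall outside [0, 1].
    xs = [min(max(x, 0.0), img_w) for x, _ in polygon]
    ys = [min(max(y, 0.0), img_h) for _, y in polygon]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    return (
        (x_min + x_max) / 2 / img_w,
        (y_min + y_max) / 2 / img_h,
        (x_max - x_min) / img_w,
        (y_max - y_min) / img_h,
    )


def yolo_label_line(class_id: int, polygon: List[Point], img_w: int, img_h: int) -> str:
    """Format one object as a YOLO detection label line."""
    xc, yc, w, h = polygon_to_yolo_box(polygon, img_w, img_h)
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"


if __name__ == "__main__":
    # Hypothetical rectangle-shaped polygon in a 400x500 image.
    poly = [(100, 50), (300, 50), (300, 250), (100, 250)]
    print(yolo_label_line(0, poly, img_w=400, img_h=500))
```

Deriving boxes from the polygons keeps the detection and segmentation labels consistent with each other, since both come from the same annotation.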

Problem Statement

Develop a custom object detection and segmentation system for objects not present in standard benchmark datasets (COCO, LVIS). Compare different detection architectures and deploy the best model in a hardware-agnostic format (ONNX).
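
Comparing detection architectures needs a common way to score predictions against ground truth. The following is a minimal sketch of one such scheme: intersection over union (IoU) with greedy matching at a single threshold, for one class in one image. The project's actual evaluation metric is not stated here, so the 0.5 threshold and the function names are assumptions, and this is not a full mAP implementation.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Tuple[Box, float]], truths: List[Box], iou_thr: float = 0.5
) -> Tuple[float, float]:
    """Greedily match predictions (highest score first) to unused ground truth."""
    matched = set()
    true_pos = 0
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_j, best_iou = -1, iou_thr
        for j, gt in enumerate(truths):
            if j in matched:
                continue  # each ground-truth box may be matched only once
            value = iou(box, gt)
            if value >= best_iou:
                best_j, best_iou = j, value
        if best_j >= 0:
            matched.add(best_j)
            true_pos += 1
    precision = true_pos / len(preds) if preds else 0.0
    recall = true_pos / len(truths) if truths else 0.0
    return precision, recall
```

Running the same scoring function over the outputs of both detectors gives numbers that can be compared directly, independent of each framework's built-in evaluator.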

Challenges & Solutions

Challenge | Solution | Outcome
Small Custom Dataset | Used transfer learning from pre-trained models and careful data augmentation | Achieved good detection performance with only 143 images
Speed vs Accuracy Trade-off | Compared YOLOv8 (one-stage) vs Faster R-CNN (two-stage) | YOLOv8 achieved 6× faster inference with minor precision trade-off
Annotation Quality | Used CVAT for precise polygon masks with multiple verification modes | High-quality annotations for both detection and segmentation
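
A speed comparison like the YOLOv8 vs Faster R-CNN one above can be measured with a small timing harness. The sketch below is one way to do it, assuming each model is wrapped as a plain Python callable; the names `median_latency_ms` and `speedup` are hypothetical, and the project's actual benchmarking procedure is not described here.

```python
import time
from statistics import median
from typing import Callable, Sequence


def median_latency_ms(
    infer: Callable[[object], object], inputs: Sequence[object], warmup: int = 3
) -> float:
    """Median per-input latency of `infer` in milliseconds."""
    if not inputs:
        raise ValueError("need at least one input to time")
    # Warm-up runs are not timed (lazy initialisation, caches, and so on).
    for x in inputs[:warmup]:
        infer(x)
    timings = []
    for x in inputs:
        start = time.perf_counter()
        infer(x)
        timings.append((time.perf_counter() - start) * 1000.0)
    return median(timings)


def speedup(slow_ms: float, fast_ms: float) -> float:
    """How many times faster the fast model is than the slow one."""
    if fast_ms <= 0:
        raise ValueError("fast_ms must be positive")
    return slow_ms / fast_ms


if __name__ == "__main__":
    # Stand-in callables; real code would wrap each model's predict call.
    images = list(range(20))
    fast = median_latency_ms(lambda x: time.sleep(0.001), images)
    slow = median_latency_ms(lambda x: time.sleep(0.006), images)
    print(f"speedup: {speedup(slow, fast):.1f}x")
```

The median is used rather than the mean so that a few slow outlier runs do not dominate the result. For GPU inference, the callable would also need to wait for the device to finish before returning, otherwise the timer only measures kernel launch time.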

Progress

Dataset collection (143 images)
Bounding box annotation with LabelImg
Polygon annotation with CVAT
YOLOv8 detection training and evaluation
Faster R-CNN training and evaluation
YOLOv8-Seg instance segmentation
ONNX model conversion for deployment
Grad-CAM visualization and interpretation
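
The Grad-CAM step in the list above can be illustrated with the core computation: average each channel's gradients to get a weight, take the weighted sum of the activation maps, apply ReLU, and scale to [0, 1]. This is a minimal pure-Python sketch on nested lists, assuming the activations and gradients of the chosen layer have already been extracted from the model. It is not the project's code, which would more likely operate on PyTorch tensors.

```python
from typing import List

Map2D = List[List[float]]


def grad_cam(activations: List[Map2D], gradients: List[Map2D]) -> Map2D:
    """Combine per-channel activations and gradients into a Grad-CAM heatmap.

    Both arguments are lists of K channel maps, each of size H x W.
    """
    if not activations or len(activations) != len(gradients):
        raise ValueError("activations and gradients must have the same channels")
    h, w = len(activations[0]), len(activations[0][0])
    # Channel weight = global average of that channel's gradient map.
    weights = [sum(sum(row) for row in g) / (h * w) for g in gradients]
    # Weighted sum over channels, then ReLU to keep positive evidence only.
    cam = [
        [
            max(0.0, sum(wk * a[i][j] for wk, a in zip(weights, activations)))
            for j in range(w)
        ]
        for i in range(h)
    ]
    # Scale to [0, 1] so the map can be overlaid on the image as a heatmap.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


if __name__ == "__main__":
    # Toy 2-channel, 2x2 example with hand-picked values.
    acts = [[[1, 2], [3, 4]], [[4, 3], [2, 1]]]
    grads = [[[1, 1], [1, 1]], [[-1, -1], [-1, -1]]]
    for row in grad_cam(acts, grads):
        print(row)
```

In practice the resulting map is upsampled to the input image size and blended over the image to show which regions drove a given prediction.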