AI/ML Completed

Object Detection and Segmentation Pipeline

Built a complete computer vision pipeline for detecting and segmenting household products using YOLOv8, Faster R-CNN, and YOLOv8-Seg, including custom dataset creation, model training, and ONNX deployment.

Duration

Spring 2025

Role

Developer

Institution

NTNU

Status

Completed

Technologies Used

Python, PyTorch, YOLOv8, Faster R-CNN, CVAT, Roboflow, ONNX, Grad-CAM, OpenCV

Overview

This project for Module 3 of AIS4002 Intelligent Machines developed an end-to-end computer vision pipeline for detecting and segmenting custom household objects (toothpaste, toothbrush, Nivea cream, L'Oréal Men shower gel, mug, Milo, Nescafé). The work covered creating a 143-image custom dataset with polygon annotations, training both one-stage (YOLOv8) and two-stage (Faster R-CNN) detectors, performing instance segmentation with YOLOv8-Seg, and interpreting the models with Grad-CAM visualizations.
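To make the annotation step concrete, the sketch below shows how one polygon annotation could be turned into the two plain-text label formats that YOLO-style detection and segmentation training read: a normalized bounding box line and a normalized polygon line. It is an illustration only, not necessarily how the labels in this project were produced (an export tool may have done the conversion). The function names, the class ID, and the image size are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in pixels


def polygon_to_yolo_bbox(polygon: List[Point], img_w: int, img_h: int) -> Tuple[float, float, float, float]:
    """Return the polygon's tight box as (x_center, y_center, width, height), normalized to [0, 1]."""
    if len(polygon) < 3:
        raise ValueError("a polygon needs at least three points")
    # Clamp points to the image so the box never leaves the frame.
    xs = [min(max(x, 0.0), float(img_w)) for x, _ in polygon]
    ys = [min(max(y, 0.0), float(img_h)) for _, y in polygon]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    return (
        (x_min + x_max) / 2 / img_w,
        (y_min + y_max) / 2 / img_h,
        (x_max - x_min) / img_w,
        (y_max - y_min) / img_h,
    )


def yolo_detection_line(class_id: int, polygon: List[Point], img_w: int, img_h: int) -> str:
    """Detection label line: 'class x_center y_center width height'."""
    box = polygon_to_yolo_bbox(polygon, img_w, img_h)
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in box)


def yolo_segmentation_line(class_id: int, polygon: List[Point], img_w: int, img_h: int) -> str:
    """Segmentation label line: 'class x1 y1 x2 y2 ...' with normalized vertices."""
    coords = []
    for x, y in polygon:
        coords.append(f"{x / img_w:.6f}")
        coords.append(f"{y / img_h:.6f}")
    return f"{class_id} " + " ".join(coords)


# Hypothetical example: class 2 outlined by four points in a 400x500 image.
example_polygon = [(100, 50), (300, 50), (300, 250), (100, 250)]
detection_label = yolo_detection_line(2, example_polygon, 400, 500)
segmentation_label = yolo_segmentation_line(2, example_polygon, 400, 500)
```

Deriving the box from the polygon keeps the detection and segmentation labels consistent with each other, which matters when the same images feed both a detector and a segmentation model.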

Problem Statement

Develop a custom object detection and segmentation system for objects not present in standard benchmark datasets (COCO, LVIS). Compare different detection architectures and deploy the best model in a hardware-agnostic format (ONNX).

Challenges & Solutions

Challenge	Solution	Outcome
Small Custom Dataset	Used transfer learning from pre-trained models and careful data augmentation	Achieved good detection performance with only 143 images
Speed vs Accuracy Trade-off	Compared YOLOv8 (one-stage) with Faster R-CNN (two-stage)	YOLOv8 ran inference 6× faster at a minor cost in precision
Annotation Quality	Used CVAT for precise polygon masks with multiple verification modes	High-quality annotations for both detection and segmentation
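A speed comparison like the one in the table depends on how latency is measured. The following is a minimal sketch of one reasonable approach, not the benchmark actually used in this project: run a few warm-up calls, time repeated inferences, and compare medians. The names `median_latency_ms` and `speedup`, and the run counts, are assumptions made for illustration.

```python
import statistics
import time
from typing import Callable


def median_latency_ms(
    infer: Callable[[], object],
    runs: int = 50,
    warmup: int = 5,
    timer: Callable[[], float] = time.perf_counter,
) -> float:
    """Median wall-clock time of one `infer()` call, in milliseconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    # Warm-up calls absorb one-time costs (lazy initialization, caches).
    for _ in range(warmup):
        infer()
    samples = []
    for _ in range(runs):
        start = timer()
        infer()
        samples.append((timer() - start) * 1000.0)
    # The median is less sensitive to occasional slow runs than the mean.
    return statistics.median(samples)


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """How many times faster the candidate is than the baseline."""
    if candidate_ms <= 0:
        raise ValueError("candidate latency must be positive")
    return baseline_ms / candidate_ms


# Usage sketch (the two callables are hypothetical wrappers around each model,
# both fed the same image on the same hardware):
#   frcnn_ms = median_latency_ms(lambda: run_faster_rcnn(image))
#   yolo_ms = median_latency_ms(lambda: run_yolov8(image))
#   print(f"YOLOv8 is {speedup(frcnn_ms, yolo_ms):.1f}x faster")
```

Both models should be timed on the same input size and device, with any GPU work synchronized inside the callable; otherwise the ratio reflects the setup rather than the architectures.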

Progress

✓ Dataset collection (143 images)

✓ Bounding box annotation with LabelImg

✓ Polygon annotation with CVAT

✓ YOLOv8 detection training and evaluation

✓ Faster R-CNN training and evaluation

✓ YOLOv8-Seg instance segmentation

✓ ONNX model conversion for deployment

✓ Grad-CAM visualization and interpretation
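For readers unfamiliar with Grad-CAM, the sketch below shows its core combination step using plain Python lists: each feature map is weighted by the spatial average of its gradient, the weighted maps are summed, negative values are clipped, and the result is scaled to [0, 1] for display. In a real pipeline the activations and gradients would be captured from a convolutional layer (for example with PyTorch hooks); here they are hand-written toy values, and the function name `grad_cam` is an assumption for illustration rather than this project's code.

```python
from typing import List

Map2D = List[List[float]]  # one H x W feature map


def grad_cam(activations: List[Map2D], gradients: List[Map2D]) -> Map2D:
    """Combine K feature maps and their gradients into one normalized heatmap."""
    if not activations or len(activations) != len(gradients):
        raise ValueError("need the same non-zero number of activation and gradient maps")
    height, width = len(activations[0]), len(activations[0][0])

    # Channel weight = global average of that channel's gradient.
    weights = [sum(sum(row) for row in grad) / (height * width) for grad in gradients]

    # Weighted sum over channels, then ReLU to keep only positive evidence.
    cam = [
        [
            max(0.0, sum(w * act[i][j] for w, act in zip(weights, activations)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Scale to [0, 1] so the map can be overlaid on the image.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


# Toy example: two 2x2 channels, one with positive and one with negative gradient.
toy_activations = [[[1.0, 2.0], [3.0, 4.0]], [[4.0, 3.0], [2.0, 1.0]]]
toy_gradients = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
toy_heatmap = grad_cam(toy_activations, toy_gradients)
```

The resulting low-resolution map is normally upsampled to the input image size and blended over it, which is where a library such as OpenCV is typically used.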
