6D object pose estimation: literature review and model-free mask generation pipeline

Master Thesis
Author
Vaggelis, Orestis
Βαγγέλης, Ορέστης
Date
2025-09
Abstract
This thesis presents a three-part investigation into 6D object pose estimation for novel objects. The first two parts consist of a comprehensive literature review and a unified evaluation of state-of-the-art methods on benchmark datasets. This analysis identifies a critical performance bottleneck for model-free approaches: the lack of robust and accurate initial object segmentation. Motivated by this finding, the third and principal contribution of this work is the development of DiPose, a novel pipeline focused specifically on generating high-quality segmentation masks for model-free pose estimation. DiPose models a novel object by first performing a Structure-from-Motion (SfM) reconstruction from a brief onboarding video. The resulting point cloud is then used to learn a high-fidelity implicit representation via Fast Dipole Sums (FDS). This implicit model acts as a virtual CAD model, enabling the generation of synthetic 2D views that drive a foundation model-based framework to produce precise segmentation masks for test images. The proposed pipeline is validated on the HOPE dataset, where it outperforms a strong model-free baseline by 8% in average precision.