Generation and Optimization of Local Shape Descriptors for Point Matching in 3-D Surfaces

Summary


The goal of object recognition is to identify and localize objects of interest in an image. We formulate Local Shape Descriptor selection for model-based object recognition in range data as an optimization problem and offer a platform that facilitates a solution. Recognition is often performed in three phases: point matching, where correspondences are established between points on the 3-D surfaces of the models and the range image; hypothesis generation, where rough alignments are found between the image and the visible models; and pose refinement, where the accuracy of the initial alignments is improved. The overall efficiency and reliability of a recognition system is highly influenced by the effectiveness of the point matching phase. Local Shape Descriptors establish point correspondences by encapsulating local shape, such that similarity between two descriptors indicates geometric similarity between their respective neighbourhoods. We present a generalized platform for constructing local shape descriptors that subsumes a large class of existing methods and allows for tuning descriptors to the geometry of specific models and to sensor characteristics. Our descriptors, termed Variable-Dimensional Local Shape Descriptors, are constructed as multivariate observations of several local properties and are represented as histograms. The optimal set of properties, which maximizes the performance of a recognition system, depends on the geometry of the objects of interest and the noise characteristics of the range image acquisition devices, and is selected by pre-processing the models and sample training images. Experimental analysis confirms the superiority of optimized descriptors over generic ones in recognition tasks in LIDAR and dense stereo range images.

General Overview

Model-based object recognition in range data involves the detection and localization of 3-D models in range images. Given a set of 3-D models and a range image, detection is defined as identifying the visible models, and localization is defined as finding the 3-D rigid transformations that align the visible models with the image. A 3-D rigid transformation has three positional and three rotational degrees of freedom, and exhaustive search through this 6-D pose space is infeasible. A large class of techniques aims to solve this problem efficiently, without exhaustive search, by following a three-phase scheme consisting of point matching, hypothesis generation, and pose refinement.

In the first phase, tentative matches are established between several points on the image and their corresponding points on the models by comparing the local shapes of various regions of the two data sets. Since the output of the first phase often contains some incorrect matches (i.e. outliers), a statistically robust algorithm, such as RANSAC or the Generalized Hough Transform, is then utilized in the second phase to generate and verify rigid transformations that align visible models with the image. Finally, in the third phase, the accuracy of the recovered alignments is improved using a refinement algorithm.
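
As a rough illustration of this pipeline, the sketch below shows a minimal RANSAC-style hypothesis generation stage in Matlab. This is a hypothetical sketch, not the implementation from the publications: ransacPose and rigidFit are illustrative names, the tentative matches from phase one are assumed to be given as paired columns of P and Q, and the least squares fit is the standard Kabsch/Horn closed-form solution.

% Minimal sketch of phase 2 (hypothesis generation); save as ransacPose.m.
% P, Q: 3xN matrices of tentatively matched model and scene points
% (column i of P is matched to column i of Q; some matches are outliers).
function [bestR, bestT] = ransacPose(P, Q, nIter, tol)
    N = size(P, 2);
    bestCount = 0; bestR = eye(3); bestT = zeros(3, 1);
    for k = 1:nIter
        idx = randperm(N);                       % minimal sample:
        s = idx(1:3);                            % 3 matches fix a rigid pose
        [R, t] = rigidFit(P(:, s), Q(:, s));     % candidate hypothesis
        d = sqrt(sum((R*P + repmat(t, 1, N) - Q).^2, 1));
        inliers = d < tol;                       % consensus set
        if nnz(inliers) > bestCount              % keep the best hypothesis
            bestCount = nnz(inliers);            % and refit on all its inliers
            [bestR, bestT] = rigidFit(P(:, inliers), Q(:, inliers));
        end
    end
end

function [R, t] = rigidFit(P, Q)
% Least squares rigid transform (Kabsch/Horn) mapping P onto Q.
    cp = mean(P, 2); cq = mean(Q, 2);
    Pc = P - repmat(cp, 1, size(P, 2));
    Qc = Q - repmat(cq, 1, size(Q, 2));
    [U, ~, V] = svd(Pc * Qc');
    R = V * diag([1, 1, sign(det(V * U'))]) * U'; % guard against reflection
    t = cq - R * cp;
end

Phase three would then refine (bestR, bestT), for example with ICP against the full scene.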


Results

Data Set


Some useful Matlab utilities for working with 3D point clouds:

Click3DPoint: for selecting a 3-D point from a point cloud by clicking on it. This function is useful, for example, for generating ground truth alignments between models and scenes. You could use it to find the indices of corresponding points in two point clouds and then use a few (3+) point matches to compute a least squares estimate of the rigid transformation that aligns the two point clouds (see the sketch after this list).

ICP: a Matlab implementation of the Iterative Closest Point algorithm by A. Mian. This could be used to refine the initial alignment found using least squares and manually matched points.
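
To illustrate how these utilities fit together, here is a hypothetical snippet that turns a few manually matched points into an initial alignment. The index values are made-up placeholders, M and S are assumed to be already-loaded N x 3 point arrays (a loading sketch follows the file format list below), and rigidFit is the least squares helper sketched in the overview above.

% Hypothetical usage; the indices below are placeholders that would
% come from clicking matching points with Click3DPoint.
mi = [12 345 678 910];    % indices of clicked points in the model M
si = [23 456 789 101];    % indices of the matching points in the scene S
[R, t] = rigidFit(M(mi, :)', S(si, :)');           % least squares fit
alignedM = (R * M' + repmat(t, 1, size(M, 1)))';   % model in scene frame
% This rough alignment can then be refined with the ICP implementation.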


Lidar Point Clouds: 3-D point clouds of the five models in various placements, taken with a VIVID 3-D scanner. The first line of a PTS file gives the number of points in the scene and each subsequent line contains the coordinates of one surface point ([x y z]). The model-scene alignment is provided for most model-scene pairs. An alignment file (e.g. im1_Angel.txt) contains a 4x4 homogeneous transformation matrix that aligns a model with the scene.
Stereo Point Clouds: 3-D point clouds and model-scene alignment files in the same format as the LIDAR point clouds. An intensity image (.tif) of each setup is also included. The 3-D point clouds are generated from disparity maps computed from stereo image pairs (using the Point Grey software with SSD matching).
Models: the five models (Angel, Big Bird, Gnome, Kid, Zoe) in oriented point cloud (.ops) format. The first line of an OPS file contains the number of points in the model and each subsequent line contains the coordinates and normal direction of one surface point ([x y z nx ny nz]). The order of the points has no significance. A sketch for loading these files follows this list.
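
The sketch below shows minimal readers for the PTS and OPS layouts described above, plus the application of an alignment file. loadPTS and loadOPS are illustrative names, not part of the distributed utilities; the file names are examples, and the code assumes the layouts exactly as stated (a count on the first line, then one point per line).

% Example script (requires Matlab R2016b+ for script-local functions,
% or save loadPTS/loadOPS as separate .m files).
S = loadPTS('im1.pts');                    % scene, n x 3: [x y z]
M = loadOPS('Angel.ops');                  % model, n x 6: [x y z nx ny nz]
T = load('im1_Angel.txt');                 % 4x4 homogeneous alignment
Mh = [M(:, 1:3), ones(size(M, 1), 1)];     % model points, homogeneous
A = (T * Mh')';
alignedM = A(:, 1:3);                      % model points in the scene frame

function P = loadPTS(fname)
    fid = fopen(fname, 'r');
    n = fscanf(fid, '%d', 1);              % first line: point count
    P = fscanf(fid, '%f', [3, n])';        % remaining lines: x y z
    fclose(fid);
end

function M = loadOPS(fname)
    fid = fopen(fname, 'r');
    n = fscanf(fid, '%d', 1);              % first line: point count
    M = fscanf(fid, '%f', [6, n])';        % remaining: x y z nx ny nz
    fclose(fid);
end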

If you use this data set in your research, please cite the following:
[1] Babak Taati, Michel Bondy, Piotr Jasiobedzki, and Michael Greenspan, "Variable Dimensional Local Shape Descriptors for Object Recognition in Range Data", Proceedings of the International Conference on Computer Vision (ICCV 2007) - 3D Representation for Recognition (3dRR), Oct. 2007.
[2] "Queen's Range Image and 3-D Model Database", http://rcvlab.ece.queensu.ca/~qridb/, 2009.

For more models, please see the Queen's Range Image Database.

Publications

[1] Babak Taati and Michael Greenspan, "Satellite Pose Acquisition and Tracking with Variable Dimensional Local Shape Descriptors", Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2008) - Robot Vision for Space Application (RVSA), pp. 4-9, 2008.
[2] Babak Taati, Michel Bondy, Piotr Jasiobedzki, and Michael Greenspan, "Automatic Registration for Model Building using Variable Dimensional Local Shape Descriptors", Proceedings of the 6th International Conference on 3-D Digital Imaging and Modeling (3DIM'07), pp. 265-272, Aug. 2007.
[3] Babak Taati, Michel Bondy, Piotr Jasiobedzki, and Michael Greenspan, "Variable Dimensional Local Shape Descriptors for Object Recognition in Range Data", Proceedings of the International Conference on Computer Vision (ICCV 2007) - 3D Representation for Recognition (3dRR), Oct. 2007.
[4] Babak Taati, "Generation and Optimization of Local Shape Descriptors for Point Matching in 3-D Surfaces", Ph.D. Thesis, Queen's University, August 2009.

Copyright Queen's University 2009, 2010