Multi-View 3-D Object Retrieval with Incomplete Data


This demo describes a novel approach to multi-view 3-D object representation and recognition that can be used in image database retrieval applications. Each 3-D object is represented by a relatively small number of images taken from different views. An unknown object is then recognised from a single image taken from an arbitrary viewpoint. The system uses shape information only and is intended for shape similarity retrieval.

A multi-scale edge-based segmentation algorithm is used to segment images. Input image edges include shadow boundaries as well as internal object edges and background edges. Furthermore, object boundary edges are usually broken. Each local segment is a part of an edge contour whose only two curvature zero crossings lie at its end-points. The multi-scale segmentation is carried out using the Curvature Scale Space technique, which has been selected for MPEG-7 standardisation. The algorithm accumulates segments from different scales, adding a new segment only if it differs from those already collected. Its advantages are that it is robust to contour noise and does not risk losing useful structure on the contour. Each segment is described by a number of features which are used to narrow down the search space during the recognition process. The segment features are based on the following:
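
As an illustration of the segmentation step described above (not of the segment features, which are not specified here), the following is a minimal Python sketch. It assumes a closed contour given as lists of x and y samples, smooths it with a Gaussian at each scale, and splits it at the sign changes of the curvature. All function names, the finite-difference curvature estimate, and the rule for deciding that two segments are "different" (end-points more than tol samples apart) are assumptions made for this example; they are not taken from the system itself.

```python
import math

def gaussian_kernel(sigma):
    """Normalised discrete Gaussian, truncated at three standard deviations."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_closed(values, sigma):
    """Circularly convolve one coordinate of a closed contour with a Gaussian."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(values)
    return [
        sum(kernel[j + radius] * values[(i + j) % n]
            for j in range(-radius, radius + 1))
        for i in range(n)
    ]

def curvature_numerator(xs, ys):
    """x'y'' - y'x'' by central differences; its sign is the sign of the curvature."""
    n = len(xs)
    out = []
    for i in range(n):
        xp, xn = xs[i - 1], xs[(i + 1) % n]
        yp, yn = ys[i - 1], ys[(i + 1) % n]
        dx, dy = (xn - xp) / 2.0, (yn - yp) / 2.0
        ddx, ddy = xn - 2.0 * xs[i] + xp, yn - 2.0 * ys[i] + yp
        out.append(dx * ddy - dy * ddx)
    return out

def segments_at_scale(xs, ys, sigma):
    """Split the smoothed contour at curvature zero crossings.

    Returns (start, end) sample indices. Each segment runs between two
    consecutive zero crossings, so it has zero crossings only at its end-points.
    """
    kappa = curvature_numerator(smooth_closed(xs, sigma), smooth_closed(ys, sigma))
    n = len(kappa)
    crossings = [i for i in range(n) if kappa[i] * kappa[(i + 1) % n] < 0]
    if len(crossings) < 2:
        return []
    return [(crossings[k], crossings[(k + 1) % len(crossings)])
            for k in range(len(crossings))]

def accumulate_segments(xs, ys, sigmas, tol=5):
    """Collect segments over several scales, skipping near-duplicates.

    Assumed difference test: a segment is new if, for every kept segment,
    at least one end-point is more than `tol` samples away (circularly).
    """
    n = len(xs)

    def gap(a, b):
        d = abs(a - b) % n
        return min(d, n - d)

    kept = []
    for sigma in sigmas:
        for start, end in segments_at_scale(xs, ys, sigma):
            if all(gap(start, s) > tol or gap(end, e) > tol for s, e in kept):
                kept.append((start, end))
    return kept
```

A call such as accumulate_segments(xs, ys, [1.0, 2.0, 3.0]) returns the index pairs of the retained segments. Here sigma is measured in contour samples, and broken (open) edge contours would need different boundary handling from the circular indexing used in this sketch.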

In response to an input query, geometric hashing is first used to determine the number of matched segments between the input image and every model image. (A hash table is constructed for each of the segment features.) The models with the largest numbers of matched segments are then passed to the verification stage. During verification, the transformation parameters obtained from each pair of matched segments define a point in the parameter space. Clustering is then performed in the parameter space to identify a set of matched segments with similar transformation parameters. This allows the system to detect the best matching models.
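
The two stages above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the system's implementation: segment features are treated as plain tuples of numbers, quantised with a single hypothetical bin_width, and the clustering is replaced by coarse grid binning of the parameter vectors. The function names, the quantisation, and the grid cell sizes are all assumptions introduced for the example.

```python
from collections import Counter, defaultdict

def build_tables(models, bin_width):
    """Build one hash table per feature index.

    models: {model_id: [feature_tuple, ...]}, one tuple per segment.
    Each table maps a quantised feature value to the (model_id, segment_index)
    entries that fall in that bin.
    """
    tables = defaultdict(lambda: defaultdict(set))
    for model_id, segments in models.items():
        for seg_idx, feats in enumerate(segments):
            for f_idx, value in enumerate(feats):
                tables[f_idx][int(value // bin_width)].add((model_id, seg_idx))
    return tables

def matched_segment_counts(tables, query_segments, bin_width):
    """Count, per model, the query segments that match some model segment.

    Assumption: a match means falling in the same bin in every feature table.
    """
    counts = Counter()
    for feats in query_segments:
        candidates = None
        for f_idx, value in enumerate(feats):
            hits = tables[f_idx].get(int(value // bin_width), set())
            candidates = hits if candidates is None else candidates & hits
        # Count each model at most once per query segment.
        counts.update({model_id for model_id, _ in candidates or ()})
    return counts

def largest_cluster(params, cell):
    """Group parameter vectors by coarse grid cell and return the biggest group.

    params: list of equal-length tuples, one per pair of matched segments.
    cell:   tuple of cell sizes, one per parameter (hypothetical values).
    Returns the indices of the matched pairs in the most populated cell.
    """
    cells = defaultdict(list)
    for i, p in enumerate(params):
        key = tuple(int(v // c) for v, c in zip(p, cell))
        cells[key].append(i)
    return max(cells.values(), key=len) if cells else []
```

Models would be ranked with matched_segment_counts(...).most_common(), and largest_cluster would then be run on the parameter vectors of each short-listed model. Fixed-width binning can separate values that lie close together on either side of a bin edge, so a practical version would probably also probe neighbouring bins or use a proper clustering method.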

The methods have been tested on a collection of 3-D objects consisting of 15 aircraft of different shapes. The following image shows one view of each object in the database:

For the experiments, we used an optimal view selection algorithm to select a number of characteristic views for each object (about 25 views per object). For the input queries, we prepared another video sequence of the same objects, using different illumination and backgrounds. From this video sequence, we randomly grabbed a small number of views of each object and used them as input queries to the system. In response to a query, the system returned the n most similar views of the models, where n was chosen by the user. The following figure shows several examples. The top images are the input queries, followed by the outputs from the system.

The system is quite robust to incomplete data as well as complex backgrounds. The following figure shows additional examples. Again, the top images are the input queries, followed by the outputs from the system.

This work was supported by EPSRC research grant # GR/L76754/01. Further information can be found in relevant publications (such as ICPR-2000).


F.Mokhtarian@ee.surrey.ac.uk
June 2001