1. Introduction
Our perception and analysis of the world around us have become mostly digital. Many scientific disciplines clearly benefit from this digital revolution. Scientists have created a wide variety of digital sensors that enable them to observe their research subjects at various time and length scales. These sensors generate extensive quantitative and visual information, and the burden has now shifted to extracting knowledge from this information. In the case of 3D visual data, this amounts to identifying the shapes they contain. Computational geometry, computer vision, and computer graphics all face the challenge of developing effective algorithms to define, quantify, and compare those shapes. Advances in machine learning and computational geometry have been extremely beneficial to those algorithms. This paper contributes a further approach to improving such algorithms. We demonstrate that we can generate a mapping, possibly partial, between 3D shapes using statistical physics approaches. In our method, the cost of the correspondence serves as a measure of the shapes’ similarity. We demonstrate the efficacy of this strategy on both simulated and actual anatomical data.
Methods that compare shapes fall under the general framework of morphometrics, the study of form, a concept that includes size and shape. While morphometrics is most often associated with biology and the natural sciences, its techniques apply to any shape matching problem. Two types of such techniques can usually be identified: those based on global measures of the forms and those based on computing correspondences, or maps, between the shapes. We review these two approaches briefly below.
Traditionally, morphometrics relies on measurements of lengths, widths, areas, masses, ratios, and/or angles that are then compared to assess the similarities between shapes [
1]. A significant drawback of such an approach is that the measurements are usually correlated and therefore include a significant amount of redundancy. Recent improvements within this approach include computation of geometric moments or Zernike moments over the shape [
2,
3,
4,
5], or of spherical harmonics over its surface [
6]. All those methods are based on global properties of the forms under study, thereby preventing partial matching between such forms.
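To make the notion of a global measure concrete, the following minimal Python sketch computes the second-order central geometric moments of a point set and three quantities derived from them that are unchanged by rotation and translation. It is an illustration only, not one of the cited methods; the function names and the sample points are assumptions introduced here.

```python
import math

def central_second_moments(points):
    """Return the 3x3 matrix of second-order central moments of a 3D point set."""
    n = len(points)
    centroid = [sum(p[k] for p in points) / n for k in range(3)]
    centered = [[p[k] - centroid[k] for k in range(3)] for p in points]
    return [[sum(q[a] * q[b] for q in centered) / n for b in range(3)]
            for a in range(3)]

def moment_invariants(points):
    """Trace, sum of principal 2x2 minors, and determinant of the moment matrix.

    These are the coefficients of its characteristic polynomial, so they do
    not change when the point set is rotated or translated.
    """
    m = central_second_moments(points)
    trace = m[0][0] + m[1][1] + m[2][2]
    minors = (m[0][0] * m[1][1] - m[0][1] * m[1][0]
              + m[0][0] * m[2][2] - m[0][2] * m[2][0]
              + m[1][1] * m[2][2] - m[1][2] * m[2][1])
    det = (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
           - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
           + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    return trace, minors, det

def rotate_z(points, angle):
    """Rotate points about the z axis by the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

# Hypothetical sample shape, and a rotated and translated copy of it.
shape = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0),
         (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
moved = rotate_z([(x + 5.0, y - 2.0, z + 1.0) for x, y, z in shape], 0.7)

original_invariants = moment_invariants(shape)
moved_invariants = moment_invariants(moved)
```

Because every moment sums over all points, removing part of the shape changes all three invariants at once, which is why descriptors of this kind do not lend themselves to partial matching.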
An alternate approach to shape comparison is to first build a correspondence between the shapes. Finding correspondences, or maps between shapes, is a common problem in geometric processing with a wide range of applications (for reviews, see, for example, [
7,
8,
9,
10]). Here, we briefly discuss correspondence methods that are pertinent to our study. A promising method for such shape correspondence is to view the shapes as metric spaces. Two shapes are then considered equal if they are isometric. Otherwise, a map is built between the two shapes that is as close to an isometry as possible, and the difference between the two shapes is measured by how far this map departs from an isometry. For discrete shapes equipped with discrete metrics, this departure is measured by computing the distortion in distances between pairs of points on the surfaces of the shapes, where the distance can be Euclidean or geodesic. This idea has led to the concepts of Gromov–Hausdorff and Gromov–Wasserstein distances between shapes (see, for example, [
11,
12,
13,
14,
15,
16,
17] in the context of shape comparison). Unfortunately, computing such distances amounts to solving a quadratic assignment problem, which is NP-hard. Despite recent progress (see, for example, [
18] for computing the Gromov–Wasserstein distance), efforts have focused on alternate approaches to finding shape correspondence. The first such approach proceeds by mapping the shapes into a common parameterizing metric domain, so that it is possible to directly compute a distance between points on different shapes. Methods in this category usually proceed in three steps. They first define a set of well-chosen landmarks or keypoints on the surfaces of the shapes, then assign “signatures” to those keypoints (i.e., their coordinates in the metric parameterizing domain), and finally determine a correspondence between these points, using the similarities of their signatures. Such a strategy has become standard for comparing 2D images. Methods such as SIFT [
19], SURF [
20] or ORB [
21] are commonly used to detect keypoints within 2D images and assign them signatures. Those keypoints are then matched using techniques such as RANSAC or the iterative closest point (ICP) algorithm. The problem of keypoint detection and signature assignment is harder for 3D objects. Many methods for assigning a signature to a keypoint have been proposed [
22], such as those that extrapolate the 2D signatures for images by building multiscale representations of the neighborhood of the keypoint [
23,
24,
25,
26], those that rely on the properties of the Laplace–Beltrami operator defined on the surface of interest [
27,
28,
29], and those that rely on conformal mapping to a standard domain [
14,
30,
31]. Matching the points based on their signatures is performed using the concept of “bag-of-features” to search for similar shapes, following the idea of Google search [
32], the concept of shape distributions [
33], by directly comparing top correspondences between shapes [
29], or by using optimal transport [
14,
31,
34,
35,
36]. The second approach consists of relaxing the requirement that the correspondence be point-wise by considering soft correspondences [
16,
37,
38,
39]. Finally, we briefly mention the data-driven approaches that take advantage of modern machine learning frameworks. We refer readers interested in those techniques to recent papers and surveys [
22,
40,
41,
42,
43,
44,
45].
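The metric view of correspondence discussed earlier in this section can be made concrete with a small sketch. The Python code below measures the distortion of a given vertex map between two discrete shapes as the largest change in pairwise distance, and finds a least-distorting bijection by exhaustive search. The function names and the four-point example are hypothetical; the exhaustive search visits all n! bijections and is only meant to show why exact minimization is impractical, in line with the NP-hardness noted above.

```python
import itertools
import math

def distance_matrix(points):
    """Pairwise Euclidean distances; geodesic distances could be used instead."""
    return [[math.dist(p, q) for q in points] for p in points]

def distortion(dist_a, dist_b, mapping):
    """Largest change in pairwise distance when point i of A maps to mapping[i] of B."""
    n = len(mapping)
    return max(abs(dist_a[i][j] - dist_b[mapping[i]][mapping[j]])
               for i in range(n) for j in range(n))

def least_distorting_map(dist_a, dist_b):
    """Brute-force search over all bijections (n! candidates, tiny inputs only)."""
    n = len(dist_a)
    return min(itertools.permutations(range(n)),
               key=lambda m: distortion(dist_a, dist_b, m))

# Hypothetical example: shape B is the 2 x 1 rectangle A with relabeled vertices.
shape_a = [(0, 0), (2, 0), (2, 1), (0, 1)]
shape_b = [(0, 1), (0, 0), (2, 0), (2, 1)]
dist_a = distance_matrix(shape_a)
dist_b = distance_matrix(shape_b)

best_map = least_distorting_map(dist_a, dist_b)
best_distortion = distortion(dist_a, dist_b, best_map)      # zero for an isometry
naive_distortion = distortion(dist_a, dist_b, (0, 1, 2, 3))  # index-to-index map
```

A map with zero distortion is an isometry between the two point sets; the index-to-index map in the example is not one, since it sends a long edge of the rectangle onto a short edge.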
Our aim in this paper is to provide an alternate framework for shape comparison that falls under the shape correspondence category, but that uses methods derived from statistical physics to measure the similarities between shapes, with a special focus on partial matching (see
Figure 1 for an overview). We consider shapes that are defined by their surface, usually represented by a triangulated mesh characterized by its vertices, edges, and triangles. We consider all vertices in a mesh as keypoints. We test two types of signatures for the vertices, one based on a representation of their neighborhood, and one based on the properties of the Laplace–Beltrami operator for the mesh. The former is based on the idea of scale-invariant spin images adapted to triangular meshes, the LD-SIFT signatures [
46]. The latter is based on the idea of solving the Schrödinger equation on the surface to characterize how waves travel on this surface and therefore capture its geometry. The corresponding signatures are referred to as wave kernel (WK) signatures [
29]. The mapping between the vertices is generated from the transport plan that solves either the balanced optimal transport (OT) problem or the unbalanced optimal transport (UOT) problem. We consider the unbalanced version of the OT problem because it is expected to handle partial matching problems. We use a statistical physics approach to solve these OT problems [
47,
48]. The cost associated with the optimal plan defines a distance between the two shapes. This paper does not stand on its own. The concepts of spin images and WK signatures for meshes have been proposed before. The idea of using optimal transport to compute correspondences between points describing shapes has been described in detail in the pioneering work of [
14], and applied in different settings [
31,
34,
35]. The novelty of this paper is to integrate those different components into a global physics-based approach for solving the full and partial shape registration problems. Our report should not be expected to be exhaustive: we limit ourselves to two shape signatures and two optimal transport techniques, but provide in-depth analyses of their strengths and weaknesses.
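The pipeline described above (signatures, cost matrix, transport plan, correspondence) can be sketched in a few lines of Python. The sketch below is not the statistical physics solver used in this paper: it substitutes plain Sinkhorn iterations for entropy-regularized balanced OT, and it uses short hypothetical signature vectors in place of LD-SIFT or WK signatures. The function names, the regularization value `eps`, and the iteration count are assumptions chosen for illustration.

```python
import math

def cost_matrix(sig_a, sig_b):
    """Squared Euclidean distance between every pair of vertex signatures."""
    return [[sum((x - y) ** 2 for x, y in zip(s, t)) for t in sig_b]
            for s in sig_a]

def sinkhorn_plan(cost, eps=0.05, n_iter=500):
    """Entropy-regularized balanced OT with uniform vertex masses."""
    n, m = len(cost), len(cost[0])
    a, b = [1.0 / n] * n, [1.0 / m] * m
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

def correspondence(plan):
    """Map each vertex of A to the vertex of B that receives most of its mass."""
    return [max(range(len(row)), key=row.__getitem__) for row in plan]

def transport_cost(plan, cost):
    """Total cost of the plan, used here as a dissimilarity between the shapes."""
    return sum(p * c for plan_row, cost_row in zip(plan, cost)
               for p, c in zip(plan_row, cost_row))

# Hypothetical 2D signatures: B holds slightly perturbed, reordered copies of A.
sig_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
sig_b = [(0.0, 1.05), (0.02, 0.0), (1.0, 0.03)]

cost = cost_matrix(sig_a, sig_b)
plan = sinkhorn_plan(cost)
matches = correspondence(plan)
shape_distance = transport_cost(plan, cost)
```

In this toy case each vertex of the first shape is matched to its perturbed copy in the second, and the transport cost stays small because the signatures nearly coincide. Taking the row-wise maximum of the plan is one simple way to turn a soft plan into a point-wise map; other choices are possible.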
We are well aware that with the increase of computing power and the number of shape datasets available, deep learning techniques dominate the domain of shape comparisons (see, for example, [
49,
50,
51,
52]). Applications of deep learning, however, are contingent on the access to large datasets of shapes that are relevant to the shapes under study. It is not our intent to compete with such approaches. Instead, we focus on a physics-based approach that provides an alternate framework for solving partial shape matching. Ultimately, our formalism should prove useful for developing better loss functions for machine learning.
The paper is organized as follows. In the next section, we introduce the different elements of our framework, namely, signatures of vertices on surfaces and unbalanced optimal transport. In the results section, we compare the two types of signatures considered, as well as the two types of OT solutions proposed for registering those signatures, on nonrigid full and partial 3D matching examples, as well as on anatomical datasets. For the full shape matching, we provide comparisons with other methods based on the SHREC19 benchmark [
53]. The following section includes a discussion on the differences between the two signatures we have considered, as well as on the differences between the two OT frameworks. The summary and conclusion section highlights possible future developments.
6. Summary and Conclusions
In this paper, we revisited the important problem of nonrigid 3D shape comparison for shapes represented by triangular meshes. We followed the standard framework of first computing signatures (i.e., feature vectors) for the vertices of the meshes to be compared, and then finding an optimal correspondence between those vertices that minimizes the total cost defined by a cost matrix computed from the signatures. Our framework differs from other similar frameworks, however, in that we replaced the standard ICP procedure used to find this correspondence with a more elaborate optimal transport strategy. Such a strategy is usually deemed to be too computationally expensive. We rely on our own physics-based approach to solving the optimal transport problem as a means to circumvent this limitation. This physics-based approach uses an approximation, akin to the entropy-regularized OT method that has become popular [
72]. In contrast with entropy regularization, we have a convergence guarantee for our approach, as well as established stability and robustness properties, which enable us to use our OT solvers routinely on large systems, with confidence in their ability to generate the actual optimal correspondence. We described how we can approach balanced and unbalanced (i.e., partial) optimal transport problems with our framework, which then translate into complete and partial shape comparison solutions.
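To illustrate how relaxing the mass constraints leads to partial matching, the following Python sketch implements a generic scaling iteration for entropy-regularized unbalanced OT, in which deviations from the prescribed marginals are penalized by a Kullback-Leibler term of weight `rho`. This is a commonly used stand-in, not the physics-based solver of this paper, and all names, parameter values, and signatures are hypothetical.

```python
import math

def unbalanced_plan(cost, eps=0.5, rho=1.0, n_iter=300):
    """Scaling iterations for entropic OT with KL-penalized marginals."""
    n, m = len(cost), len(cost[0])
    a, b = [1.0 / n] * n, [1.0 / m] * m
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    # The exponent softens the marginal constraints; as rho grows it tends
    # to 1, which recovers the balanced Sinkhorn updates.
    power = rho / (rho + eps)
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        u = [(a[i] / sum(kernel[i][j] * v[j] for j in range(m))) ** power
             for i in range(n)]
        v = [(b[j] / sum(kernel[i][j] * u[i] for i in range(n))) ** power
             for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

# Hypothetical signatures: the third vertex of A has no counterpart in B,
# mimicking a region that is present in one shape but missing in the other.
sig_a = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
sig_b = [(0.0, 0.0), (1.0, 0.0)]
cost = [[sum((x - y) ** 2 for x, y in zip(s, t)) for t in sig_b] for s in sig_a]

plan = unbalanced_plan(cost)
row_mass = [sum(row) for row in plan]  # mass actually shipped from each vertex of A
```

In this toy case the unmatched vertex ships a negligible amount of mass, whereas a balanced formulation would force it to send its full share to a poorly matching vertex. The transported masses need not equal the prescribed ones, which is the price paid for tolerating missing parts.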
Finding a meaningful correspondence between the vertices of two shapes to be compared requires a good estimate of the cost of associating one vertex with another. In our framework, this cost is based on the difference between the 3D signatures of those vertices. We used two different types of signatures. The first is the LD-SIFT signature, which is based on the concept of shape context, i.e., an image that renders the mesh onto the tangent plane at the vertex considered, characterized by its 2D SIFT signature at the center of the image. The second is the WKS signature, which is based on solving the Schrödinger equation on the surface of the shape. We showed that the latter performs better for whole shape comparison in the presence of nonrigid deformation. We attributed this to the fact that WKS signatures are mostly intrinsic, while LD-SIFT signatures are extrinsic. In contrast, however, we found that LD-SIFT signatures perform well for partial shape comparison, whereas WKS signatures are more global, as they capture properties of the whole mesh. Different signatures can, and need to, be tested within our framework. This is currently under study.
Our implementations of the OT methods were found to be efficient, with nearly optimal use of parallelization, both on CPU and on GPU processors. We acknowledge, however, that there is room and need for improvement. The space complexity of our implementations is proportional to the product of the numbers of vertices in the two meshes, as we need to store both the cost matrix and at least one work array of similar size. Those matrices have one row per vertex of the first mesh and one column per vertex of the second. Such a requirement limits the use of our implementations to problems of size up to a few , which falls short of the number of vertices observed in actual meshes generated by modern 3D scanners. Handling such large systems will require some redesign of our algorithms and/or the design of efficient methods for selecting a subset of vertices that are representative of the shapes considered. This is an active area of research, which we will explore in future studies.
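As a worked illustration of this storage requirement, the short Python sketch below estimates the memory needed for a dense cost matrix plus one work array of the same size. The function name, the choice of 8-byte double-precision entries, and the mesh sizes are assumptions made for illustration, not figures reported in this paper.

```python
def dense_ot_storage_bytes(n_vertices_a, n_vertices_b, n_arrays=2, bytes_per_entry=8):
    """Memory for n_arrays dense matrices with one entry per pair of vertices."""
    return n_arrays * n_vertices_a * n_vertices_b * bytes_per_entry

# Hypothetical mesh sizes, with both meshes having the same number of vertices.
for n in (1_000, 10_000, 100_000):
    gib = dense_ot_storage_bytes(n, n) / 2**30
    print(f"{n:>7} x {n:<7} vertices: {gib:10.2f} GiB")
```

Under these assumptions, storage grows a hundredfold for every tenfold increase in mesh size, which is why subsampling the vertices or avoiding explicit storage of the cost matrix becomes necessary for very large meshes.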