
Characteristic Number: Theory and Its Application to Shape Analysis

1 School of Software, Dalian University of Technology, Tuqiang St. 321, Dalian 116620, China
2 School of Mathematical Sciences, Dalian University of Technology, Linggong Rd. 2, Dalian 116023, China
* Author to whom correspondence should be addressed.
Axioms 2014, 3(2), 202-221; https://doi.org/10.3390/axioms3020202
Submission received: 27 March 2014 / Revised: 28 April 2014 / Accepted: 28 April 2014 / Published: 15 May 2014

Abstract: Geometric invariants are important for shape recognition and matching. Existing invariants in projective geometry are typically defined on a limited number (e.g., five for the classical cross-ratio) of coplanar points and also lack the ability to characterize the curve or surface underlying the given points. In this paper, we present a projective invariant named after the characteristic number of planar algebraic curves. The characteristic number in this work reveals an intrinsic property of an algebraic hypersurface or curve and, unlike its planar version, no longer relies on the existence of the surface or curve. The new definition also generalizes the cross-ratio by relaxing the collinearity and the number of points required by the cross-ratio. We employ the characteristic number to construct more informative shape descriptors that improve the performance of shape recognition, especially when severe affine and perspective deformations occur. In addition to shape recognition, we incorporate geometric constraints on facial feature points derived from the characteristic number into facial feature matching. The experiments show improvements in accuracy and robustness to pose and view changes over the method with collinearity and cross-ratio constraints.

1. Introduction

Projective geometry is of fundamental importance in computer vision and object recognition. It is a key mathematical tool for 3D reconstruction from multiple views [1]. Furthermore, object recognition has a long history of using geometric invariants that reflect the geometry of an object under different transformation groups [2,3]. The cross-ratio on five coplanar points is a fundamental invariant under projective geometry. One may derive projective invariants from the cross-ratio for more points [4,5] and others for coplanar conics [6]. These invariants can be used to construct descriptors for shape recognition that are invariant to projective deformations [7,8]. Researchers also build robust constraints upon these invariants in order to match geometric primitives between images, such as points [5,9], lines [10,11] and closed contours [12,13]. In a recent work, Bryner et al. derive novel metrics for shapes invariant to the affine and projective groups in a general Riemannian framework and develop shape analysis algorithms for both point sets and parametric curves [14]. In the context of facial analysis, Riccio and Dugelay devise features for recognition based on 2D/3D geometric invariants [15]. However, these invariants are typically defined on a limited number (e.g., five for the classical cross-ratio) of coplanar points and also lack the ability to characterize the curve or surface underlying the given points.
On the other hand, curves and surfaces are well studied in the field of algebraic geometry associated with the theory of multivariate splines [16]. As a well-known result, Pascal’s theorem states a remarkable property of a hexagon inscribed in a conic. Researchers have generalized Pascal’s theorem into various forms, including Chasles’s theorem and the Cayley–Bacharach theorem concerning cubic curves. Shi and Wang obtain a generalization of Pascal’s theorem in high-dimensional spaces, which involves the intersections of a quadratic hypersurface and a simplex [17]. Luo et al. generalize Pascal’s theorem from the perspective of a projective invariant, i.e., the characteristic number on planar algebraic curves [18,19]. Their generalization uses the characteristic number, as well as the derived characteristic mapping, to establish the connection from an algebraic curve of higher degree to one of lower degree, as Pascal’s theorem does for a conic and a hexagon. The characteristic number reflects the intrinsic properties of the points on an algebraic curve/surface. These results are theoretically sound, but the dependency on the existence of the curve or surface curbs the wide application of the invariant to shape recognition and matching in computer vision.
In this work, we present a projective invariant named after the characteristic number of planar algebraic curves, but it no longer relies on the existence of an algebraic curve. Similar to the invariant in [19], the characteristic number in this work also reveals an intrinsic property of an algebraic hypersurface or curve, which involves the intersections of this hypersurface or curve with the lines constituting a closed loop. The new definition also relates the fundamental invariant in projective geometry, i.e., the cross-ratio, to the characteristic number. The cross-ratio of four collinear points becomes a special case of the characteristic number, which relaxes the collinearity and the number of points required by the cross-ratio. We employ the characteristic number to construct more informative shape descriptors that improve the performance of shape recognition, especially when severe affine and perspective deformations occur. In addition to the application to shape recognition, we incorporate the geometric constraints on facial feature points derived from the characteristic number into facial shape matching. This incorporation also improves accuracy and robustness to pose and view changes.
The rest of this paper is organized as follows. In Section 2, we define the characteristic number and give its properties. The descriptors invariant to perspective transformations, derived from the characteristic number, are given in Section 3. Section 4 shows the application of the characteristic number to facial feature matching. Finally, Section 5 concludes the paper.

2. Characteristic Number

Let $K$ be a field and $\mathbb{P}^m(K)$ be the $m$-dimensional projective space over $K$. Assuming that $H$ is a nonsingular square matrix of order $m+1$, we call the mapping, $\Phi$, from $\mathbb{P}^m(K)$ to itself:

$$\Phi : (x_1, x_2, \ldots, x_{m+1})^T \mapsto \lambda H (x_1, x_2, \ldots, x_{m+1})^T, \quad \lambda \in K, \ \lambda \neq 0$$

a projective transformation, and all these transformations form a group. Projective geometry studies projective invariants, whose geometric properties are preserved under projective transformations. The cross-ratio is a fundamental projective invariant. When λ = 1, a projective transformation degenerates to an affine transformation. In the context of computer vision [1], the geometric imaging process projecting 3D objects into an imaging view is typically modeled as a projective transformation, as shown in Figure 1. An affine transformation is an acceptable approximation if objects are distant from the camera compared with its focal length. Therefore, invariants to projective and affine transformations are quite important to computer vision problems, especially for shape analysis, including shape representation and matching [2,14].
Figure 1. Geometric imaging process. Three points, P, U, Q, of a line in the 3D space are projected as p, u, q and p′, u′, q′ into two imaging planes with the focal centers, O_1 and O_2, respectively.
This section introduces a projective invariant, named the characteristic number (CN), that derives from an affine invariant, the characteristic ratio. We prove the invariance of these two quantities and show how the characteristic number characterizes intrinsic properties of algebraic curves and/or hypersurfaces. We denote the intersection point of lines $l$ and $m$ as $\langle l, m \rangle$ and the line passing through points $p$ and $q$ as $(p, q)$ in a projective plane, $\mathbb{P}^2$, unless otherwise stated below.

2.1. Characteristic Ratio: An Affine Invariant

Definition 1.
Let $p, q \in \mathbb{P}^2$ be two different points (or lines) and $p_1, p_2, \ldots, p_k$ be distinct points (or lines) on the line $(p, q)$; then there must exist $a_i$ and $b_i$, such that $p_i = a_i p + b_i q$ $(i = 1, 2, \ldots, k)$. The ratio:

$$R(p_1, p_2, \ldots, p_k) := \frac{b_1 b_2 \cdots b_k}{a_1 a_2 \cdots a_k} \qquad (1)$$

is called the characteristic ratio (CR) of $p_1, p_2, \ldots, p_k$ to the basic points (or basic lines), $p, q$.
The computation of CR is independent of the choice of the basic points. It is natural and straightforward to take x and y coordinates in the image plane as a and b, respectively. The mathematical aspects and significance of this ratio to algebraic curves can be found in our previous publication [19]. Herein, we provide the proof of its invariance to affine transformations, upon which we build our descriptors for 2D planar shapes.
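The following minimal sketch (our illustration, not the paper's code) recovers the coefficients $a_i, b_i$ of Definition 1 by lifting 2D image points to homogeneous coordinates [x, y, 1] and solving the small linear system $p_i = a_i p + b_i q$; the sample points are arbitrary.

```python
import numpy as np

def char_ratio(p, q, pts):
    """Characteristic ratio of Definition 1: product of b_i/a_i, where p_i = a_i*p + b_i*q.

    All points are 2D and are lifted to homogeneous coordinates [x, y, 1];
    the system is consistent because the points are collinear with p and q.
    """
    P = np.append(np.asarray(p, float), 1.0)
    Q = np.append(np.asarray(q, float), 1.0)
    A = np.column_stack([P, Q])                 # 3x2 system [P Q] [a, b]^T = p_i
    ratio = 1.0
    for pt in pts:
        rhs = np.append(np.asarray(pt, float), 1.0)
        (a, b), *_ = np.linalg.lstsq(A, rhs, rcond=None)
        ratio *= b / a
    return ratio

# Example: two points on the segment between p and q
p, q = (0.0, 0.0), (4.0, 2.0)
inner = [(1.0, 0.5), (3.0, 1.5)]                # p + 0.25*(q-p) and p + 0.75*(q-p)
print(char_ratio(p, q, inner))                  # prints approximately 1.0
```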
Theorem 2.
The characteristic ratio given by Definition 1 is an affine invariant.
Proof.
The characteristic ratio can be calculated on any positive number of collinear points ($k \geq 1$). We first prove the case when $k = 1$. Assume three points, $P$, $U$ and $Q$, on a line, $L$, in the 3D world coordinate system; the points $p, u, q$ and $p', u', q'$ are their images (as shown in Figure 1) obtained by two cameras with the optical centers, $O_1$ and $O_2$, respectively. It is well established that $p, u, q$ are collinear and so are $p', u', q'$. Any point on a line can be represented as a linear combination of two other points lying on the line, i.e.,

$$u = a p + b q, \qquad u' = a' p' + b' q' \qquad (2)$$

for the lines in the two views in Figure 1, and thus, we need to prove $\frac{b}{a} = \frac{b'}{a'}$.
We assume that there exists an affine transformation, $H$, between the images, as the lines determined by the 2D points in both image planes are the images (projections) of the 3D line, $L$. The corresponding points in the two views satisfy:

$$u' = H u, \quad p' = H p, \quad q' = H q \qquad (3)$$

Substituting Equation (3) into Equation (2), we have $H u = a' H p + b' H q$. Since $H$ is invertible, we have:

$$u = a' p + b' q \qquad (4)$$

Combining Equations (2) and (4), we have $\frac{b}{a} = \frac{b'}{a'}$, i.e., CR is an affine invariant when $k = 1$.
Since every factor $\frac{b_i}{a_i}$ is affine invariant, CR, being the product of $k$ such factors, is an affine invariant when $k > 1$. This establishes the theorem. ☐
Similar to the cross-ratio, the characteristic ratio reflects the geometric relationships of collinear points. Their difference lies in that the characteristic ratio places no limit on the number of collinear points, while the cross-ratio is defined on four collinear points. Hence, we are able to construct shape descriptors with more geometric information using the characteristic ratio on more points.

2.2. Characteristic Number: A Projective Invariant

The characteristic number can be regarded as a generalization of the cross-ratio, so we give the definition of the cross-ratio first. We adopt the convention that an uppercase symbol denotes a point and also the column vector of homogeneous coordinates for this point, while lowercase symbols denote scalars. Let $P_1, P_2, P_3$ and $P_4$ be four collinear points with $P_1 \neq P_4$. We have the linear representations of $P_2$ and $P_3$ as $a_1 P_1 + b_1 P_4$ and $a_2 P_4 + b_2 P_1$, respectively. The cross-ratio of these four points is:

$$(P_1 P_2, P_3 P_4) := \frac{a_1}{b_1} \cdot \frac{a_2}{b_2} \qquad (5)$$
We define the characteristic number as follows.
Definition 3.
Let $P_1, P_2, \ldots, P_{r-1}, P_r$ be $r$ distinct points in $\mathbb{P}^m(K)$. On the lines $P_iP_{i+1}$ forming a loop ($P_{r+1} := P_1$ and $i = 1, 2, \ldots, r$), there are $n$ points $Q_i^{(1)}, Q_i^{(2)}, \ldots, Q_i^{(n)}$ distinct from $P_i$ and $P_{i+1}$. Each $Q_i^{(j)}$ can be linearly represented by $P_i$ and $P_{i+1}$ as:

$$Q_i^{(j)} = a_i^{(j)} P_i + b_i^{(j)} P_{i+1}$$

Let $\mathcal{P} = \{P_i\}_{i=1}^{r}$ and $\mathcal{Q} = \{Q_i^{(j)}\}_{i=1,2,\ldots,r}^{\,j=1,2,\ldots,n}$, and define:

$$\kappa_n(\mathcal{P}, \mathcal{Q}) := \prod_{i=1}^{r} \prod_{j=1}^{n} \frac{a_i^{(j)}}{b_i^{(j)}} \qquad (6)$$

as the characteristic number (of dimension $m$ and degree $n$) of the point set, $\mathcal{Q}$, with respect to the frame point set, $\mathcal{P}$.
The following theorem shows the invariance of CN to projective transformations.
Theorem 4.
The characteristic number is a projective invariant.
Proof:
Assume that a projective transformation, $\Phi$, projects $\mathcal{P} = \{P_i\}_{i=1}^{r}$ to $\mathcal{P}' = \{P_i'\}_{i=1}^{r}$, so that $P_i' = \Phi(P_i) = k_i H P_i$ ($P_{r+1}' = P_1'$, $k_{r+1} = k_1$). The points $\mathcal{Q} = \{Q_i^{(j)}\}_{i=1,2,\ldots,r}^{\,j=1,2,\ldots,n}$ are projected to $\mathcal{Q}' = \{Q_i'^{(j)}\}_{i=1,2,\ldots,r}^{\,j=1,2,\ldots,n}$:

$$Q_i'^{(j)} = \Phi(Q_i^{(j)}) = l_i^{(j)} H Q_i^{(j)} = l_i^{(j)} H \big(a_i^{(j)} P_i + b_i^{(j)} P_{i+1}\big) = \frac{l_i^{(j)} a_i^{(j)}}{k_i} \cdot k_i H P_i + \frac{l_i^{(j)} b_i^{(j)}}{k_{i+1}} \cdot k_{i+1} H P_{i+1} = \frac{l_i^{(j)} a_i^{(j)}}{k_i} P_i' + \frac{l_i^{(j)} b_i^{(j)}}{k_{i+1}} P_{i+1}'$$

Thus, the characteristic number of the transformed points is given by:

$$\kappa_n(\mathcal{P}', \mathcal{Q}') = \prod_{i=1}^{r} \prod_{j=1}^{n} \frac{k_{i+1}\, l_i^{(j)} a_i^{(j)}}{k_i\, l_i^{(j)} b_i^{(j)}} = \prod_{i=1}^{r} \left(\frac{k_{i+1}}{k_i}\right)^{n} \prod_{j=1}^{n} \frac{a_i^{(j)}}{b_i^{(j)}} = \prod_{i=1}^{r} \prod_{j=1}^{n} \frac{a_i^{(j)}}{b_i^{(j)}} = \kappa_n(\mathcal{P}, \mathcal{Q})$$
which indicates that the characteristic number is invariant under projective transformations. ☐
The characteristic number generalizes both the characteristic ratio and the cross-ratio. The definition of CN in Equation (6) multiplies characteristic ratios, Equation (1), along a closed loop. Furthermore, the characteristic number of degree one of $Q_1^{(1)}$ and $Q_2^{(1)}$ with respect to the frame points, $P_1$ and $P_2$, degenerates to the cross-ratio, Equation (5), of $(P_1, P_2, Q_2^{(1)}, Q_1^{(1)})$ when $r = 2$ and $n = 1$, as shown in Figure 2a. In other words, the cross-ratio is the simplest case ($r = 2$, $n = 1$) of the characteristic number in the two-dimensional space. This paper focuses on the applications of the characteristic number to 2D shape analysis, where the characteristic number can incorporate the geometric information of more points (i.e., $n \geq 1$) compared with the cross-ratio.
Figure 2. Points to calculate the characteristic number when (a) r = 2 and (b) r = 3.
It is also worth noting that the numbers of points on the line segments between frame points are identical in the definition of CN. We give the two simplest cases for 2D shape analysis to calculate CN, when $r = 2$ and $r = 3$. We consider two frame points, $P_1$ and $P_2$, when $r = 2$. Let $Q_1^{(1)}, Q_1^{(2)}, \ldots, Q_1^{(n)}$ be $n$ distinct points on the line segment, $P_1P_2$. For the case of even $n$, the odd-numbered points on the directed line, $P_1P_2$, and the even-numbered ones on $P_2P_1$ are used to compute CN:

$$\kappa_n(\mathcal{P}_2, \mathcal{Q}_2) = \prod_{i=1}^{n/2} \frac{a_1^{(2i-1)}\, b_1^{(2i)}}{b_1^{(2i-1)}\, a_1^{(2i)}}$$

where $\mathcal{P}_2 = \{P_1, P_2\}$ and $\mathcal{Q}_2 = \{Q_1^{(1)}, Q_1^{(2)}, \ldots, Q_1^{(n)}\}$. For odd $n$, we calculate CN in the same way by using all the points on the line segment, except for the last one.
When $r = 3$, let $\mathcal{P}_3 = \{P_1, P_2, P_3\}$ and $Q_i^{(1)}, Q_i^{(2)}, \ldots, Q_i^{(n_i)}$ be $n_i$ distinct points on the edge $P_iP_{i+1}$ ($i = 1, 2, 3$) of the triangle, $P_1P_2P_3$, as illustrated in Figure 2b. We choose the $n = \min(n_1, n_2, n_3)$ points closest to $P_i$ on $P_iP_{i+1}$ as the set $\mathcal{Q}_3$ to calculate CN:

$$\kappa_n(\mathcal{P}_3, \mathcal{Q}_3) = \prod_{i=1}^{3} \prod_{j=1}^{n} \frac{a_i^{(j)}}{b_i^{(j)}} \qquad (8)$$
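To make the computation concrete, here is a minimal numpy sketch (ours, not the authors' implementation) of Equation (8): each coefficient pair is recovered with a small least-squares solve in homogeneous coordinates, and the a/b factors are multiplied around the loop. The sample triangle and cut points are illustrative; three collinear cuts give a value close to −1, anticipating Corollary 6 below.

```python
import numpy as np

def _coeffs(P, Q, X):
    """Coefficients (a, b) with X = a*P + b*Q in homogeneous coordinates [x, y, 1]."""
    A = np.column_stack([np.append(P, 1.0), np.append(Q, 1.0)])
    sol, *_ = np.linalg.lstsq(A, np.append(X, 1.0), rcond=None)
    return sol

def char_number(frame, loop_pts):
    """Characteristic number of Equation (8): product of a/b around the loop of frame points.

    frame    : list of r frame points (2D); the loop closes with P_r P_1.
    loop_pts : loop_pts[i] holds the points on the line P_i P_{i+1};
               every entry must contain the same number of points.
    """
    frame = [np.asarray(P, float) for P in frame]
    r = len(frame)
    kappa = 1.0
    for i in range(r):
        P, Pn = frame[i], frame[(i + 1) % r]
        for X in loop_pts[i]:
            a, b = _coeffs(P, Pn, np.asarray(X, float))
            kappa *= a / b
    return kappa

# Menelaus-style check: three collinear cuts of a triangle give CN close to -1
tri = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
cuts = [[(1.0, 0.0)], [(3.0, 1.0)], [(0.0, -0.5)]]   # all three lie on one line
print(char_number(tri, cuts))                        # approximately -1.0
```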

2.3. Intrinsic Properties of a Hypersurface and Curve

In this section, we reveal an intrinsic geometric property of the points on an algebraic hypersurface or curve by using the characteristic number. This property is an extension of the well-known Menelaus's theorem and Carnot's theorem in classical planar geometry and also a higher-dimensional generalization of Segre's [20] and Luo's [18] results in a projective plane. Theorem 5 states the intrinsic property of a hypersurface; we refer to [21] for the proof of the theorem, which is omitted here due to space limitations.
Theorem 5.
Let $P_1, P_2, \ldots, P_{r-1}, P_r$ be $r$ distinct points in $\mathbb{P}^m(K)$. On the lines $P_iP_{i+1}$ ($P_{r+1} := P_1$ and $i = 1, 2, \ldots, r$), there are $n$ points $Q_i^{(1)}, Q_i^{(2)}, \ldots, Q_i^{(n)}$ distinct from $P_i$ and $P_{i+1}$; some of these points may coincide. A hypersurface of degree $n$ not passing through any $P_i$ intersects each line $P_iP_{i+1}$ precisely in $\{Q_i^{(j)}\}_{j=1}^{n}$ (multiple intersections are counted with multiplicities). Then, the characteristic number of $\{Q_i^{(j)}\}_{i=1,\ldots,r}^{\,j=1,\ldots,n}$ with respect to the frame points $P_1, P_2, \ldots, P_r$ is $(-1)^{rn}$. Conversely, if $P_1, \ldots, P_{m+1}$ are linearly independent and the characteristic number of $\{Q_i^{(j)}\}_{i=1,\ldots,m+1}^{\,j=1,\ldots,n}$ with respect to the frame points $P_1, P_2, \ldots, P_{m+1}$ (i.e., $r = m+1$) is $(-1)^{(m+1)n}$, then all the $Q_i^{(j)}$ lie on a hypersurface of degree $n$ not passing through any $P_i$.
Figure 3 shows an example of Theorem 5 in the three-dimensional Euclidean space. Each $Q_i^{(j)}$ is distinct from the frame points, $P_i$, and all $\{Q_i^{(j)}\}_{i=1,2,3,4}^{\,j=1,2}$ lie on a surface of degree two if and only if:

$$\kappa_2\Big(\{P_i\}_{i=1}^{4};\ \{Q_i^{(j)}\}_{i=1,2,3,4}^{\,j=1,2}\Big) = \prod_{i=1}^{4} \frac{\overline{P_{i+1} Q_i^{(1)}}}{\overline{P_i Q_i^{(1)}}} \cdot \frac{\overline{P_{i+1} Q_i^{(2)}}}{\overline{P_i Q_i^{(2)}}} = 1$$

where $\overline{XY}$ denotes the directed length from $X$ to $Y$.
Theorem 5 degenerates to the results of the characteristic number for algebraic curves partially presented in [18,19] if $m = 2$ and $r = 3$. These results reveal the intrinsic geometry of points on an algebraic curve: the characteristic number does not change with the selection of the base lines. As the simplest case in the two-dimensional space, the following corollary of Theorem 5, obtained by setting $r = 3$ and $n = 1$, reflects the fundamental collinearity constraint in multi-view computer vision. Corollary 6 shows that the characteristic number $\kappa_1(P, Q, R) = -1$ in Figure 3b, regardless of the choice of the base lines, $a$, $b$ and $c$, or the points, $U$, $V$ and $W$. The value of CN for these three points also remains unchanged no matter from what viewpoint we capture the images, as shown in Figure 1. We can apply this property given by the characteristic number to matching collinear points on human faces under pose changes.
Corollary 6.
The characteristic number of three collinear points is −1.
Theorem 5 also shows that this property given by the characteristic number no longer depends on the existence of an algebraic curve. Therefore, we are able to calculate the characteristic number for points that do not necessarily lie on an algebraic curve, unlike that in [18]. This calculation yields projective invariants on more than five points in the 2D space. In [4], Goodall and Mardia construct projective invariants on six points by combining cross-ratios and apply the invariants to object recognition problems involving collinear sets of points. Herein, we derive the characteristic number for any six coplanar points, as given in Corollary 7. We can also employ this invariant to match points on human faces, along with the characteristic number for three points in Corollary 6.
Figure 3. Theorem 5 reveals the intrinsic properties of a hypersurface or curve. (a) An example for a surface in the three-dimensional Euclidean space; (b) the characteristic number of three collinear points is −1; (c) the characteristic number on any six points derived from Theorem 5.
Corollary 7.
Suppose that $A, B, C, H, I$ and $J$ are six points on a projective plane, no three of which are collinear, as shown in Figure 3c. The characteristic number, defined as the product of ratios of directed triangle areas below, is a projective invariant.

$$\kappa(A, I, C, H, B, J) = \frac{S_{ABH}}{S_{ACH}} \cdot \frac{S_{BAI}}{S_{BCI}} \cdot \frac{S_{CAJ}}{S_{CBJ}} \qquad (9)$$
We can obtain the corollary by substituting $A, B, C$ as the frame points, $P_i$, and $H, I, J$ as the $Q_i$ sets into Definition 3 and then applying Theorem 4. The areas in Equation (9) can be calculated by the determinants of the points' homogeneous coordinates:

$$\kappa = \frac{|ABH|}{|ACH|} \cdot \frac{|BAI|}{|BCI|} \cdot \frac{|CAJ|}{|CBJ|}$$

where:

$$|ABH| = \begin{vmatrix} 1 & 1 & 1 \\ x_a & x_b & x_h \\ y_a & y_b & y_h \end{vmatrix}$$
The determinant operator is widely available in any programming package for scientific computing.
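For instance, the sketch below (our own naming, not code from the paper) evaluates Equation (9) through these determinants for six arbitrary sample points.

```python
import numpy as np

def det3(P, Q, R):
    """Determinant |PQR| of homogeneous coordinates (twice the signed triangle area)."""
    return np.linalg.det(np.array([[1.0, 1.0, 1.0],
                                   [P[0], Q[0], R[0]],
                                   [P[1], Q[1], R[1]]]))

def kappa_six(A, B, C, H, I, J):
    """Six-point characteristic number of Equation (9), evaluated via determinants."""
    return (det3(A, B, H) / det3(A, C, H)) * \
           (det3(B, A, I) / det3(B, C, I)) * \
           (det3(C, A, J) / det3(C, B, J))

# arbitrary sample points, no three of which are collinear
A, B, C = (0.0, 0.0), (4.0, 0.0), (2.0, 3.0)
H, I, J = (1.0, 1.0), (3.0, 1.0), (2.0, 2.0)
print(kappa_six(A, B, C, H, I, J))
```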

2.4. Generalization of Pascal’s Theorem

Classical Pascal’s theorem states a remarkable property of a hexagon inscribed in a conic. Various generalizations have emerged, but few of them retain its original form. Luo et al. gave a generalization of Pascal’s theorem in the projective plane with a recursive structure in degrees, revealing the relation between curves of distinct degrees [18,19]. In this section, we present one generalization of Pascal’s theorem in a high-dimensional space.
Theorem 8.
Suppose $m, n \in \mathbb{Z}^{+}$ and $n \geq m$. Let $P_1, P_2, \ldots, P_{m+1}$ be $m+1$ linearly independent points in $\mathbb{P}^m(K)$. A hypersurface of degree $n$ not passing through any $P_i$ intersects the lines $P_iP_{i+1}$ ($i = 1, 2, \ldots, m+1$ and $P_{m+2} := P_1$) in the points $Q_i^{(1)}, Q_i^{(2)}, \ldots, Q_i^{(n)}$, where multiple intersections have distinct superscripts. For each $j$ ($j = 1, 2, \ldots, m+1$), the hyperplane, $H_j$, through $\{Q_i^{(j)}\}_{i \neq j}$ intersects the line $P_jP_{j+1}$ in $R_j$, and $S_j = \chi_{(P_j, P_{j+1})}(R_j)$. Then, the $(m+1)(n-m+1)$ points $\{S_j\}_{j=1}^{m+1} \cup \{Q_i^{(j)}\}_{i=1,2,\ldots,m+1}^{\,j=i,\,m+2,\,m+3,\ldots,n}$ lie on a hypersurface of degree $n-m+1$ not passing through any $P_i$. (When $n = m$, we set $Q_i^{(m+1)} := Q_i^{(i)}$ and $\{Q_i^{(j)}\}_{i=1,2,\ldots,m+1}^{\,j=i,\,m+2,\,m+3,\ldots,n} := \emptyset$.)
This result establishes the connection from an algebraic hypersurface of higher degree to another one of lower degree, as Pascal's theorem and Luo's results do for curves. Theorem 8 is equivalent to the results in [18,19] for planar curves when $m = 2$. Furthermore, this theorem degenerates to Pascal's theorem if the hypersurface (algebraic curve) is a conic. The proof of the theorem is given in [21] and is omitted here due to space limitations; we will explore its applications in future work.

3. Application I: A Perspective Invariant Shape Descriptor

We have used the characteristic ratio in Definition 1 to construct a shape descriptor invariant to affine transformations, named the characteristic ratio spectrum (CHARS) [22]. Compared with the descriptor using the cross-ratio (CRS) [7], our descriptor is able to incorporate more shape information, especially on geometric structures inside the symbol, as the cross-ratio in CRS can only use four collinear points for the calculation, while the characteristic ratio can combine as many as are available. As shown in Figure 4d,e, most of the points marked with circles are left unused by the cross-ratio calculation. This loss of information inside the symbol means that CRS (Figure 4b) cannot discriminate between symbols with subtle inner differences, such as those shown in Figure 4a. In contrast to the cross-ratio, the characteristic ratio is able to make use of all the available edge points marked with circles in Figure 4d,e and, thus, yields the spectra in Figure 4c that show the differences between the symbols. Refer to [22] for more details on descriptor construction and experimental comparisons on symbol recognition.
Similarly, we are able to construct a shape descriptor invariant to perspective deformations [8] using the characteristic number given in Definition 3, which is expected to incorporate more information on the inner structures.
Figure 4. Comparison between characteristic ratio spectrum (CHARS) and cross-ratio spectrum (CRS). (a) Two symbols with subtle inner differences; (b) The cross-ratio spectra of the symbols; (c) The characteristic ratio spectra of the symbols; (d) and (e) illustrate the inner points used to calculate CHARS (circles) and CRS (red dots) of the symbols.

3.1. Descriptor Construction

Similar to CRS and CHARS, we concatenate the values of the invariant, here the characteristic number, to form the descriptor. As shown in Definition 3, we have to construct closed loops to calculate CN. Let $\mathcal{P} = \{P_s\}_{s=1,2,\ldots,N}$ denote the equidistant sample points numbered counter-clockwise on the convex hull of a given shape, $\mathcal{S}$. We pick three points in turn from $\mathcal{P}$ as $P_i, P_j, P_k$ ($i = 1, 2, \ldots, N-2$, $j = i+1, i+2, \ldots, N-1$, $k = j+1, j+2, \ldots, N$). The characteristic number $\kappa(P_i, P_j, P_k) = -1$ if they are collinear. Otherwise, these three points form a triangle, and there are $C_N^3$ triangles in total that cover the whole shape. As shown in Figure 5, each side of the triangle intersects the inner shape at several points, which are regarded as the $Q_i$ sets for the calculation of CN. We obtain $\kappa(P_i, P_j, P_k)$ of the triangle $P_iP_jP_k$ by Equation (8). We concatenate the CN values of the triangles as the descriptor for the shape, $\mathcal{S}$:

$$D_1(\mathcal{S}) = \big(\kappa(P_i, P_j, P_k)\big) \qquad (10)$$

where $i = 1, 2, \ldots, N-2$, $j = i+1, i+2, \ldots, N-1$, $k = j+1, j+2, \ldots, N$. $D_1(\mathcal{S})$ is a vector of length $C_N^3$.
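A minimal sketch of this construction is given below. It is our illustration, not the authors' code: char_number is the helper sketched in Section 2, and segment_intersections is a hypothetical callback that returns the contour points cut by a segment, ordered from its first endpoint outward.

```python
from itertools import combinations
import numpy as np

def cn_descriptor(hull_pts, segment_intersections, n_per_side=2):
    """Sketch of the descriptor D_1(S) in Equation (10): one CN value per hull triangle."""
    desc = []
    for Pi, Pj, Pk in combinations(hull_pts, 3):
        tri = [Pi, Pj, Pk]
        sides = [(tri[t], tri[(t + 1) % 3]) for t in range(3)]
        cuts = [segment_intersections(a, b) for a, b in sides]      # hypothetical helper
        n = min(n_per_side, *(len(c) for c in cuts))
        if n == 0:
            desc.append(0.0)    # placeholder; the special cases discussed below refine this
            continue
        # keep the n intersections closest to the first endpoint of each side
        desc.append(char_number(tri, [c[:n] for c in cuts]))
    return np.array(desc)
```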
Figure 5. Points to calculate the characteristic number when (a) r = 2 and (b) r = 3 .
In practical applications, one side of a triangle may coincide with the convex hull. In this case, false intersections are detected and used $N-2$ times. These intersections cannot correctly reflect the inner structure of the shape and, hence, introduce errors into shape matching. We discard the intersections whose distances to the convex hull are smaller than a given threshold.
Another practical issue arises when a side of $P_iP_jP_k$ does not intersect the shape, $\mathcal{S}$. The shape geometries are not available in the descriptor if we simply set $\kappa(P_i, P_j, P_k)$ to zero in this case, no matter how many intersections exist on the other sides. To address this issue, we define $\kappa(P_i, P_j, P_k)$ as follows, assuming without loss of generality that the segment, $P_kP_i$, does not intersect the shape.
  • κ ( P i , P j , P k ) = κ ( P i , P j ) · κ ( P j , P k ) if there are at least two intersections on both sides, P i P j and P j P k .
  • κ ( P i , P j , P k ) = κ ( P j , P k ) (or κ ( P i , P j ) ) if there are at least two intersections on the side, P j P k (or P i P j ), and no more than one intersection on the side, P i P j (or P j P k ).
  • κ ( P i , P j , P k ) = 0 if there is at most one intersection on either P i P j or P j P k .
Theoretically, the descriptor for one shape remains unchanged under projective transformations. Unfortunately, we may detect false intersections, due to significant deformations that do not preserve parallelism. In the following, we justify the applicability of the descriptor to complex shapes under perspective transformations by its properties.
  • The characteristic number on a triangle is permutable, i.e., κ ( P i , P j , P k ) = κ ( P j , P k , P i ) = κ ( P k , P i , P j ) . This can be readily verified by Equation (8).
  • The choice of initial point (triangle) does not change individual values in D ( S ) , but determines the order in which CN values appear in D ( S ) . It is also straightforward to derive this property from the above and Equation (10).
  • Slight fluctuations of the vertices on the convex hull of S bring gradual changes to D ( S ) . Assume that three pairs of points, (P_i, P_i′), (P_j, P_j′) and (P_k, P_k′), are neighbors on a smooth part of the convex hull. We have κ(P_i, P_j, P_k) ≈ κ(P_i′, P_j′, P_k′), since each side of the two triangles used to calculate the CN values is also close to its counterpart.
  • The descriptor presents fluctuations under affine transformations, due to jags in the inner intersections, and severe perspective deformations make this worse. A dynamic programming algorithm, i.e., dynamic time warping (DTW), is employed to align the shape descriptors of the query and template shapes, as done in CRS and CHARS; a minimal sketch is given after this list. This process also alleviates the deviations brought by the choice of the starting point.
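For concreteness, a plain DTW distance between two descriptor sequences is sketched below; the actual CRS/CHARS alignment procedure may differ in its cost function and step constraints, so this is only an illustration.

```python
import numpy as np

def dtw_distance(x, y):
    """Dynamic time warping distance between two 1D descriptor sequences."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            # extend the cheapest of the three admissible warping steps
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```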

3.2. Performance Evaluation

We evaluate the performance of the descriptor (CNF) derived from CN by comparing it with the recent CRS [7] and SIFT [23], one of the most widely used feature descriptors. All experiments are conducted on a PC with a 2.3 GHz CPU and 4 GB memory. We follow exactly the same similarity metrics for the descriptors and the same DTW alignment process as CRS and CHARS in all our experiments.
We validate the proposed descriptor on 32 logos of television networks, among which similar shapes exist, as shown in Figure 6. Furthermore, many logos have relatively complex shape structures, especially within their convex hulls. The query sets are generated by changing the azimuth (az) and elevation (el) angles, as well as two factors, (α, β), that indicate the degree of perspective deformation. The larger (α, β) are, the more severe the perspective deformation is. We use three degrees of deformation, i.e., (α, β) equal to (0.5, 0), (1.0, 0.5) and (1.5, 0.5), and generate 16 (az, el) combinations for each degree. We generate binary contours by simply applying the Canny edge detector available in the MATLAB Image Processing Toolbox. No extra smoothing or parameterization [24] is included.
Figure 6. Experimental set: (a) 32 logos of television networks; and (b) an example subject to perspective transformations with the perspective factors (α, β) = (0.5, 0) and various (az, el) (azimuth (az) and elevation (el)) parameters.
Table 1, Table 2 and Table 3 show the recognition rates given by the descriptor (CNF) derived from the characteristic number compared with CRS and SIFT under perspective deformations. It can be seen that CNF performs better than the other two methods under most projective transformations. CNF yields very high recognition rates when el = 60° and 105°, no matter what (α, β) is. When (α, β) = (0.5, 0), CNF shows superior performance on 13 out of 16 query subsets. The performance of CNF degrades as (α, β) increases, but there are still 11 and 10 out of 16 query subsets showing higher rates when (α, β) = (1.0, 0.5) and (α, β) = (1.5, 0.5), respectively. CNF is also slightly worse than CRS when the perspective deformations become significantly severe. In these cases, the triangle sides that generate our descriptor largely coincide with the convex hull, so that inaccurate intersections are detected, as shown in Figure 7. A simple threshold cannot eliminate all these inaccurate intersections, which affect our descriptor more than CRS, as ours includes more points.
Table 1. Recognition rates on the query set when α = 0.5 and β = 0. CNF, descriptor derived from the characteristic number; CRS, cross-ratio spectrum.

| | CNF −75° | CNF −30° | CNF 15° | CNF 60° | CRS −75° | CRS −30° | CRS 15° | CRS 60° | SIFT −75° | SIFT −30° | SIFT 15° | SIFT 60° |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| el = 15° | 43.75 | 50 | 50 | 59.38 | 50 | 62.5 | 46.88 | 46.88 | 6.25 | 9.38 | 6.25 | 15.63 |
| el = 60° | 93.75 | 100 | 96.88 | 96.88 | 87.5 | 84.38 | 87.5 | 87.5 | 28.13 | 65.63 | 53.13 | 25 |
| el = 105° | 100 | 100 | 100 | 100 | 93.75 | 90.63 | 90.63 | 87.5 | 12.5 | 71.88 | 71.88 | 21.88 |
| el = 150° | 71.88 | 75 | 87.5 | 75 | 68.75 | 84.38 | 84.38 | 68.75 | 6.25 | 18.75 | 25 | 3.13 |
Table 2. Recognition rates on the query set when α = 1.0 and β = 0.5.

| | CNF −75° | CNF −30° | CNF 15° | CNF 60° | CRS −75° | CRS −30° | CRS 15° | CRS 60° | SIFT −75° | SIFT −30° | SIFT 15° | SIFT 60° |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| el = 15° | 37.5 | 62.5 | 43.75 | 50 | 53.13 | 53.13 | 34.38 | 34.38 | 6.25 | 6.25 | 0 | 6.25 |
| el = 60° | 90.63 | 100 | 100 | 78.13 | 84.38 | 84.38 | 81.25 | 75 | 12.5 | 31.25 | 25 | 9.38 |
| el = 105° | 96.88 | 100 | 93.75 | 87.5 | 87.5 | 90.63 | 87.5 | 71.88 | 18.75 | 46.88 | 40.63 | 9.38 |
| el = 150° | 56.25 | 75 | 62.5 | 53.13 | 75 | 78.13 | 65.63 | 65.63 | 12.5 | 12.5 | 12.5 | 15.63 |
Table 3. Recognition rates on the query set when α = 1.5 and β = 0.5.

| | CNF −75° | CNF −30° | CNF 15° | CNF 60° | CRS −75° | CRS −30° | CRS 15° | CRS 60° | SIFT −75° | SIFT −30° | SIFT 15° | SIFT 60° |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| el = 15° | 31.25 | 31.25 | 37.5 | 31.25 | 40.63 | 50 | 34.38 | 21.88 | 6.25 | 9.38 | 6.25 | 15.63 |
| el = 60° | 87.5 | 84.38 | 84.38 | 62.5 | 87.5 | 78.13 | 78.13 | 59.38 | 28.13 | 65.63 | 53.13 | 25 |
| el = 105° | 87.5 | 93.75 | 90.63 | 75 | 78.13 | 87.5 | 81.25 | 71.88 | 12.5 | 68.75 | 68.75 | 21.88 |
| el = 150° | 43.75 | 62.5 | 53.13 | 46.88 | 65.13 | 68.75 | 62.5 | 56.25 | 6.25 | 18.75 | 25 | 3.13 |
Figure 7. Significantly severe deformations bring inaccurate points into the new descriptor.

4. Application II: Shape Matching with the Characteristic Number

The characteristic number reflects the intrinsic properties of the algebraic curve given by several points, and these properties do not rely on the existence of the curve, as shown in Section 2.3. We are able to discover these properties through the characteristic number, which is invariant to projective transformations, and apply them as geometric constraints to the extraction of facial feature points under viewpoint or pose changes. The extraction problem is typically formulated as matching an appearance model subject to facial shape constraints. In this section, we use the local appearance model based on principal component analysis (PCA) in [25,26] and substitute their PCA-based shape constraints with our facial priors discovered by the characteristic number. Our priors derived from CN are invariant to projective/perspective transformations and, hence, make the extraction algorithm robust to viewpoint or pose changes.

4.1. Shape Priors Using Characteristic Number

Human faces are highly structured and present common geometries across the age, gender and ethnicity of individuals. For example, the four eye corners are collinear, and this collinearity is preserved under pose/viewpoint changes. Researchers employ this invariant property of collinearity for pose recovery [27]. We intend to incorporate more geometric constraints than the collinearity alone.
The CN invariant is able to characterize more geometric information on faces in addition to the collinearity. We take an exhaustive strategy on the CN values of subsets of fiducial points in order to discover their common geometries. We enumerate all possible combinations of three, five and six points chosen from eight manually labeled fiducial points and calculate the CN value of every combination for all 515 images. Taking the discovery of the collinearity prior using CN as an example, we have $C_8^3 = 56$ three-point combinations. Each combination generates 515 CN values for the 515 frontal faces. Corollary 6 tells us that the CN value of three collinear points is a constant, −1, so that we can pick out the four combinations with three collinear points satisfying:
$$|(-1) - CN_{sub}|^2 < \varepsilon \qquad (11)$$
where $CN_{sub}$ is the CN of a three-point subset and ε is a small positive constant. The blue bars in Figure 8a show the histogram of the CN values for one three-point combination that satisfies Equation (11) on the 515 frontal faces. Almost all the CN values of these points, whose locations are annotated as blue dots on the top frontal face in Figure 8b, are quite close to −1. We verify the invariance of the prior by using the same three-point combination on another set of 515 uncontrolled faces with different poses and identities, one of which is given in the top image of Figure 8d. The CN values calculated from all these points are almost equal to −1, as we expect. These histograms verify that CN can find the collinear fiducial points on human faces and that the collinearity is preserved.
We perform a similar screening process on the combinations of five and six points, whose CN values are approximately identical for all 515 frontal faces:

$$|C - CN_{sub}|^2 < \varepsilon, \qquad Sd(CN_{sub}) < \sigma \qquad (12)$$

where $Sd(\cdot)$ denotes the standard deviation and σ is a small positive constant. The constant, C, is called the intrinsic value that characterizes the geometric property of the curve underlying the points. We find six combinations of five-point and six of six-point subsets that follow Equation (12). The histograms and point locations on frontal faces are given in Figure 8a,b, respectively. The CN values of one combination for all 515 frontal faces concentrate on one definite value. Again, Figure 8c,d verifies the projective invariance of CN on five (cross-ratio) and six points given by Corollary 7. These invariant priors, reported for the first time to the best of our knowledge, reflect common facial geometries similar to the collinearity, but on a larger scale, involving more points from more facial components. We can calculate these priors, as well as the collinearity, with one formula, Equation (6).
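The screening step can be sketched as follows (our illustration, not the paper's code). The cn_fn callable, the array shapes and the thresholds eps and sigma are assumptions; cn_fn could, for instance, be built from the char_number sketch of Section 2.

```python
import numpy as np
from itertools import combinations

def screen_priors(faces, subset_size, cn_fn, target=None, eps=0.05, sigma=0.1):
    """Screening in the spirit of Equations (11) and (12).

    faces       : array of shape (num_faces, 8, 2) with manually labeled fiducial points.
    subset_size : 3, 5 or 6 points per combination.
    cn_fn       : callable mapping a (subset_size, 2) array of points to its CN value.
    target      : intrinsic value C; pass -1.0 for the collinearity prior (Eq. (11)),
                  or None to use the mean CN over all faces (Eq. (12)).
    Returns the accepted index combinations together with their intrinsic values.
    """
    priors = []
    for idx in combinations(range(faces.shape[1]), subset_size):
        vals = np.array([cn_fn(face[list(idx)]) for face in faces])
        C = float(vals.mean()) if target is None else target
        if np.all((C - vals) ** 2 < eps) and vals.std() < sigma:
            priors.append((idx, C))
    return priors

# e.g., collinearity priors: screen_priors(landmarks, 3, cn_fn, target=-1.0)
```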
Figure 8. CN values for subsets with three (blue), five (red) and six (green) points: (a) histograms of CN values on the subsets of fiducial points whose locations are annotated in the frontal face images in (b), and (c) histograms of CN values on the same combinations of points as (a). The point coordinates for (c) are extracted from images in (d), significantly different from (b). Horizontal axes of the histograms are CN values, and the vertical axes are the number of faces.

4.2. Performance Evaluation

We use the Viola–Jones face detector [28] available in OpenCV to pick out faces with both eyes, the nose and the mouth present. The detector also outputs the regions of the eyes and mouth for each detected face. We use these regions to roughly initialize the positions of the eight fiducial points. The shape priors derived from CN are used as constraints to match PCA-based appearance models on local patches around the fiducial points and, finally, to localize these points.
We test our algorithm on facial images with a wide range of variations in pose, expression and age from a commercial set and several public face sets, including IMM-FACE-DB [29], LFW [30], AFLW [31] and Pointing’04 [32]. IMM-FACE-DB and Pointing’04 are medium-scale sets collected under controlled environments, which categorize facial images by identity and type of variation. LFW and AFLW have more than 10,000 facial images in the wild, and the commercial set complements the testing images with faces of young children. We randomly select 500 images from these sets of different environments for quantitative comparisons with the approaches using no shape constraints and partial shape constraints, in order to demonstrate the generalization of the CN-based shape priors.
Figure 9. Fiducial point localization with pose changes, as well as variations on age, expression and resolution.
We employ the normalized mean error (NME) as the objective metric for quantitative comparisons. This metric is widely accepted in comparative studies of facial point localization and alignment and is defined as:

$$m_e = \frac{1}{n\, d_{lr}} \sum_{i=1}^{n} d_i$$

where $n$ denotes the number of landmarks and the $d_i$ values are the Euclidean point-to-point distances between the estimated locations and the manually labeled ground truth. Each distance, $d_i$, is normalized by $d_{lr}$, the distance between the two pupils of each face.
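For reference, a minimal sketch of this metric, with assumed array shapes, is:

```python
import numpy as np

def normalized_mean_error(pred, gt, pupil_left, pupil_right):
    """NME as defined above: mean point-to-point error divided by the inter-pupil distance."""
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)   # shape (n_landmarks, 2)
    d = np.linalg.norm(pred - gt, axis=1)
    d_lr = np.linalg.norm(np.asarray(pupil_left, float) - np.asarray(pupil_right, float))
    return d.mean() / d_lr
```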
Figure 9 shows selected results of our algorithm on the testing images. Our algorithm works well on faces with pose and expression changes. The shape constraints used by the algorithm help recover accurate positions when glasses or partial occlusions appear in the facial images in Figure 9. The algorithm is also insensitive to low resolutions (LR), as long as the face detector can find a face in the LR image. The local facial appearance may change, but the global facial geometry is preserved as a person grows. Our CN-based shape priors, especially those on five and six points, reflect the geometry at a larger scale and, thus, are applicable to images of babies and toddlers, even though we discover the priors from adult faces. Figure 9 also demonstrates the accurate and robust localization of our algorithm on images of children.
We show the impact of the CN-based shape priors by comparing the localization accuracies when using collinearity only (CN on three points), all CN constraints and no shape constraints. We calculate NMEs on 110 testing images selected from the data sets referred to above and plot the cumulative error distributions [33,34] of the three configurations in Figure 10. We can hardly reduce the errors below 0.15 for nearly 15% of the images if no geometric constraint is imposed on our optimization framework. The introduction of collinearity improves the accuracy of eye corner localization, so that the errors drop to about 0.12 for almost all images. More significantly, the errors are less than 0.1 for all the images when we combine the additional shape constraints derived from CN on five and six points. These improvements validate the use of our CN-based shape priors.
Figure 10. Cumulative error distributions of localizations with collinearity (three-point CN, red dots), all CN constraints (green solid) and no shape constraints (blue dash dots). The x-axis is the normalized mean error (NME), and the y-axis indicates the percentage of images on which the localization NME is lower than the x-value.

5. Conclusions

In this paper, we present a projective invariant, named the characteristic number, defined on looped line segments. Unlike its planar version in [18,19], this invariant does not rely on the existence of the underlying algebraic surface or curve. Compared with the classical cross-ratio, the invariant is able to include more, not necessarily collinear, points, which paves the way for more informative descriptors and constraints. The computation of the characteristic number only involves simple multiplications and divisions of point coordinates, as shown in Equation (6), and its complexity is linear in the number of points in the set, Q. We demonstrate the potential of the invariant in applications of shape recognition and matching under affine and perspective transformations. The new definition of the characteristic number also applies in higher dimensional spaces.
The characteristic number defined in Equation (6) requires that the numbers of points in the sets, Q, between any two adjacent frame points be identical. Currently, we have to trim some inner points on one or several segments between frame points in order to meet this requirement, as discussed in Section 3. This treatment is simple, but may sacrifice descriptive ability. We are developing new invariants that relax this constraint. Furthermore, our invariant is applicable in higher dimensional spaces, and hence, we will investigate how the invariant works for 3D shape analysis in the future.

Acknowledgments

This work is partially supported by the Natural Science Foundation of China under grant Nos. 61033012, 11171052, 61272371, 61003177 and 61328206 and the program for New Century Excellent Talents (NCET-11-0048).

Author Contributions

XF invented the idea of the applications of CN. ZL invented the original idea of the theory of CN. JZ proved the affine invariance of the characteristic ratio. XZ proved the intrinsic properties of CN. QJ implemented CHARS. DL implemented CNF. All the authors contributed to the writing.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Hartley, R.; Zisserman, A. Multiple View Geometry in Computer Vision, 2nd ed.; Cambridge University Press: Cambridge, UK, 2004. [Google Scholar]
  2. Weiss, I. Geometric invariants and object recognition. Int. J. Comput. Vis. 1993, 10, 207–231. [Google Scholar] [CrossRef]
  3. Lin, W.Y. Robust Geometrically Invariant Features for 2D Shape Matching and 3D Face Recognition. Ph.D. Thesis, University of Wisconsin-Madison, Madison, WI, USA, 2006. [Google Scholar]
  4. Goodall, C.R.; Mardia, K.V. Projective shape analysis. J. Comput. Graph. Stat. 1999, 8, 143–168. [Google Scholar]
  5. Suk, T.; Flusser, J. Point-based projective invariants. Pattern Recognit. 2000, 33, 251–261. [Google Scholar] [CrossRef]
  6. Quan, L.; Gros, P.; Mohr, R. Invariants of a pair of conics revisited. Image Vis. Comput. 1992, 10, 319–323. [Google Scholar] [CrossRef]
  7. Li, L.; Tan, C. Recognizing planar symbols with severe perspective deformation. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 32, 755–762. [Google Scholar] [PubMed]
  8. Luo, Z.; Luo, D.; Fan, X.; Zhou, X.; Jia, Q. A shape descriptor based on new projective invariants. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Melbourne, Australia, 15–18 September 2013.
  9. Li, W.J.; Lee, T.; Tsui, H.T. Automatic feature matching using coplanar projective invariants for object recognition. In Proceedings of the 5th Asian Conference on Computer Vision (ACCV), Melbourne, Australia, 23–25 January 2002.
  10. Fan, B.; Wu, F.; Hu, Z. Robust line matching through line–point invariants. Pattern Recognit. 2012, 45, 794–805. [Google Scholar] [CrossRef]
  11. Yammine, G.; Wige, E.; Simmet, F.; Niederkorn, D.; Kaup, A. A novel similarity-invariant line descriptor for geometric map registration. In Proceedings of the 20th IEEE International Conference on Image Processing (ICIP), Melbourne, Australia, 15–18 September 2013.
  12. Fan, X.; Qi, C.; Liang, D.; Huang, H. Probabilistic contour extraction using hierarchical shape representation. In Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV), Beijing, China, 17–21 October 2005; Volume 1, pp. 302–308.
  13. Wang, Z.; Liang, M.; Li, Y. Using diagonals of orthogonal projection matrices for affine invariant contour matching. Image Vis. Comput. 2011, 29, 681–692. [Google Scholar] [CrossRef]
  14. Bryner, D.; Klassen, E.; Le, H.; Srivastava, A. 2D affine and projective shape analysis. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 99. [Google Scholar] [CrossRef] [PubMed]
  15. Riccio, D.; Dugelay, J.L. Geometric invariants for 2D/3D face recognition. Pattern Recognit. Lett. 2007, 28, 1907–1914. [Google Scholar] [CrossRef]
  16. Wang, R. Multivariate Spline Functions and their Applications; Kluwer Academic Publishers: Dordrecht, The Netherlands, 2001. [Google Scholar]
  17. Shi, X.; Wang, R. The generalization of Pascal’s theorem and Morgan-Scott’s partition. Comput. Geom.: Lect. Morningside Center Math. 2003, 34, 179–187. [Google Scholar]
  18. Luo, Z.; Chen, L. The singularity of S μ + 1 μ ( Δ M S μ ) . J. Inf. Comput. Sci. 2005, 4, 739–746. [Google Scholar]
  19. Luo, Z.; Liu, F.; Shi, X. On singularity of spline space over Morgan-Scott’s type partition. J. Math. Res. Expo. 2010, 30, 1–16. [Google Scholar]
  20. Thas, J.; Cameron, P.; Blokhuis, A. On a generalization of a theorem of B. Segre. Geom. Dedicata 1992, 43, 299–305. [Google Scholar] [CrossRef]
  21. Luo, Z.; Zhou, X.; Gu, D.X. From a projective invariant to some new properties of algebraic hypersurfaces. Sci. China Math. 2014, in press. [Google Scholar] [CrossRef]
  22. Jia, Q.; Fan, X.; Luo, Z.; Liu, Y.; Guo, H. A new geometric descriptor for symbols with affine deformations. Pattern Recognit. Lett. 2014, 40, 128–135. [Google Scholar] [CrossRef]
  23. Lowe, D. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 2004, 60, 91–110. [Google Scholar] [CrossRef]
  24. Srestasathiern, P.; Yilmaz, A. Planar shape representation and matching under projective transportation. Comput. Vis. Image Underst. 2011, 115, 1525–1535. [Google Scholar] [CrossRef]
  25. Cootes, T.F.; Taylor, C.J.; Cooper, D.H.; Graham, J. Active shape models—Their training and application. Comput. Vis. Image Underst. 1995, 61, 38–59. [Google Scholar] [CrossRef]
  26. Cootes, T.F.; Edwards, G.J.; Taylor, C.J. Active appearance models. IEEE Trans. Pattern Anal. Mach. Intell. 2001, 23, 681–685. [Google Scholar] [CrossRef]
  27. Gee, A.; Cipolla, R. Estimating gaze from a single view of a face. In Proceedings of the 12th IAPR International Conference on Pattern Recognition, Conference A: Computer Vision &amp; Image Processing, Jerusalem, 9–13 October 1994; Volume 1, pp. 758–760.
  28. Viola, P.; Jones, M.J. Robust real-time face detection. Int. J. Comput. Vis. 2004, 57, 137–154. [Google Scholar] [CrossRef]
  29. Stegmann, M.; Ersbll, B.; Larsen, R. FAME-a flexible appearance modeling environment. IEEE Trans. Med. Imaging 2003, 22, 1319–1331. [Google Scholar] [CrossRef] [PubMed]
  30. Huang, G.B.; Ramesh, M.; Berg, T.; Learned-Miller, E. Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments; Technical Report 07-49; University of Massachusetts: Amherst, MA, USA, 2007. [Google Scholar]
  31. Koestinger, M.; Wohlhart, P.; Roth, P.M.; Bischof, H. Annotated facial landmarks in the wild: A large-scale, real-world database for facial landmark localization. In Proceedings of the First IEEE International Workshop on Benchmarking Facial Image Analysis Technologies, Barcelona, Spain, 6–13 November 2011.
  32. Gourier, N.; Hall, D.; Crowley, J.L. Estimating face orientation from robust detection of salient facial structures. In Proceedings of the International Workshop on Visual Observation of Deictic Gestures, Cambridge, UK, 23–26 August 2004; pp. 1–9.
  33. Dibeklioglu, H.; Salah, A.; Gevers, T. A statistical method for 2-D facial landmarking. IEEE Trans. Image Process. 2012, 21, 844–858. [Google Scholar] [CrossRef] [PubMed]
  34. Cao, X.; Wei, Y.; Wen, F.; Sun, J. Face alignment by explicit shape regression. In Proceedings of the IEEE Conference on Biometrics Compendium Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA, 16–21 June 2012; pp. 2887–2894.

Citation: Fan, X.; Luo, Z.; Zhang, J.; Zhou, X.; Jia, Q.; Luo, D. Characteristic Number: Theory and Its Application to Shape Analysis. Axioms 2014, 3, 202-221. https://doi.org/10.3390/axioms3020202

AMA Style

Fan X, Luo Z, Zhang J, Zhou X, Jia Q, Luo D. Characteristic Number: Theory and Its Application to Shape Analysis. Axioms. 2014; 3(2):202-221. https://doi.org/10.3390/axioms3020202

Chicago/Turabian Style

Fan, Xin, Zhongxuan Luo, Jielin Zhang, Xinchen Zhou, Qi Jia, and Daiyun Luo. 2014. "Characteristic Number: Theory and Its Application to Shape Analysis" Axioms 3, no. 2: 202-221. https://doi.org/10.3390/axioms3020202

Article Metrics

Back to TopTop