2. Methods
Arslan [16] introduces a geometric feature that associates objects with identifiable corners. Given any triangle, the coordinates of a point can be derived uniquely from its distances to the triangle's corners. The process involves predicting all corner points by matching triangles and checking all possible triangle pairs.
We summarize the geometric feature, the definitions, and the algorithm in [16]. These provide the foundation for the new ideas and the algorithm we propose in this work.
The key observation in this approach is that the Euclidean distances from a point to the corners of a triangle uniquely identify that point in 2D.
Figure 1 illustrates a query object and the search image in which this object is searched for. Only the corner points are extracted and considered. The images are repeated in the figure beneath the original images, now also displaying the detected corner points. All possible triangles and their pairwise matches are considered in object detection. In the case shown in Figure 1, one triangle (among many possible ones) is shaded, serving as a reference. A different corner point is selected to illustrate that it can be identified by its distances to the corners (vertices) of the triangle in the query object. These distances are shown by the dashed lines in the figure. Similarly, the corresponding point in the search image can be identified by using a matching triangle in the search image. There are two instances of the query image in the search image, labeled as 1 and 2, and the identification succeeds in both cases. The described process for one corner point applies similarly to all other corner points. Collectively, all corners, and hence both instances 1 and 2, are identified in this example.
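This distance-based identification is easy to make concrete. The following minimal Python sketch (function and variable names are illustrative, not taken from [16]) recovers a 2D point from its distances to the three corners of a non-degenerate reference triangle; subtracting pairs of circle equations leaves a 2 × 2 linear system:

```python
import math

def locate_from_distances(a, b, c, da, db, dc):
    """Recover the 2D point at distances da, db, dc from the triangle
    corners a, b, c. Subtracting pairs of the circle equations
    (x - xi)^2 + (y - yi)^2 = di^2 yields a 2x2 linear system."""
    (x1, y1), (x2, y2), (x3, y3) = a, b, c
    A1, B1 = 2 * (x2 - x1), 2 * (y2 - y1)
    C1 = da**2 - db**2 + x2**2 - x1**2 + y2**2 - y1**2
    A2, B2 = 2 * (x3 - x1), 2 * (y3 - y1)
    C2 = da**2 - dc**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = A1 * B2 - A2 * B1  # nonzero when a, b, c are not collinear
    return ((C1 * B2 - C2 * B1) / det, (A1 * C2 - A2 * C1) / det)

# A point and its distances to the corners of a reference triangle.
tri = ((0.0, 0.0), (4.0, 0.0), (0.0, 3.0))
p = (2.0, 1.0)
d = tuple(math.dist(p, v) for v in tri)
print(locate_from_distances(*tri, *d))  # -> (2.0, 1.0) up to rounding
```

If the three corners are collinear, the determinant vanishes and the point cannot be recovered, which is why degenerate triangles are excluded.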
Let t(a, b, c) denote the triangle whose vertices are indexed by a, b, and c, and whose internal angles are listed in clockwise order. For any two triangles t1 and t2, let t1 ~ t2 indicate that t1 and t2 are similar (i.e., they have the same internal angles and proportional side lengths).
A triangle feature descriptor of a point set P with triangle t(a, b, c) is defined as F(P, t) = {(d(p, a), d(p, b), d(p, c)) : p in P}, where d denotes the Euclidean distance between two points. We apply this definition with the set P = Q.
A triangle feature descriptor F(Q, t) is a local feature descriptor for Q that predicts and yields a match for Q. A set M is defined for the predicted and verified points of a potential match of Q in I. For a pair of similar triangles tq in Q and ti in I, let M denote the set of points predicted in I based on ti and the triangle feature descriptor F(Q, tq). More precisely, this can be expressed as follows:
Definition 1. For tq in Q and ti in I such that tq ~ ti, each point x in M is identified by the distance triplet in F(Q, tq) for some p in Q. For each p in Q, one point x is predicted in I.
In other words, M is a set of predicted points, each of which corresponds to a point in Q for a potential match of Q in I. If the calculated ratio |M| / |Q| is greater than or equal to a given threshold ratio H in [0, 1], then the subset M is considered an instance of Q in I.
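As a sketch of how such a descriptor and the threshold test might look in code (the names triangle_feature_descriptor, match_ratio, and tol are our own illustrative choices, not the notation of [16]):

```python
import math

def triangle_feature_descriptor(points, tri):
    """Distance triplet from every point to the corners a, b, c of
    the reference triangle (illustrative version of the descriptor)."""
    a, b, c = tri
    return [(math.dist(p, a), math.dist(p, b), math.dist(p, c))
            for p in points]

def match_ratio(predicted, image_points, tol=1e-6):
    """Fraction of predicted points that coincide (within tol) with
    some corner point of the search image."""
    hits = sum(1 for q in predicted
               if any(math.dist(q, p) <= tol for p in image_points))
    return hits / len(predicted)

Q = [(0, 0), (4, 0), (0, 3), (2, 1), (3, 2)]
tri = (Q[0], Q[1], Q[2])
desc = triangle_feature_descriptor(Q, tri)
print(len(desc))            # one distance triplet per point of Q
print(match_ratio(Q, Q))    # -> 1.0: Q trivially matches itself
```

A candidate match is accepted when this ratio reaches the threshold H.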
The objective is to find all instances of Q in I, which can be described as follows: find all sets M such that |M| / |Q| >= H.
We summarize Algorithm 1 for the object detection problem. It takes as input a set Q of corner points for the query object and a set I containing corner points belonging to objects in the searched image. The threshold H is a parameter in [0, 1] used to distinguish significant matches from insignificant findings.
Algorithm 1: Algorithm for finding all matches of Q in I, summarized from [16].
![Mathematics 13 00925 i001]()
In Lines 1–3, the algorithm calculates all internal angles for the triangles in Q and I and assigns them to bins, which are created using discretization parameters. These bins contain triangles from Q and I, referred to as the Q-bins and the I-bins, respectively.
In Lines 4–9, the algorithm iterates over all discrete angle values. For each set of angles, all matching pairs of triangles in the Cartesian product of the corresponding Q-bin and I-bin are considered. The corresponding geometric feature, as defined in Definition 1, is then computed in Line 5. Based on this feature, the predicted position of each corner point in I is determined. If a sufficient number of point matches are found in Line 6, a match for Q in I is confirmed and reported in Line 7.
The total time complexity of Algorithm 1 is O(N^3 + sN), where s is the number of similar triangle pairs in Q and I. We note that the resulting time complexity is sensitive to the output size. The values of C and N are predefined for the algorithm, and the maximum value of s depends on C and N. Consequently, the time requirement can be adjusted by tuning the parameters C and N. The parameter C controls the granularity of angles, where 180/C degrees represents the size of an angular bin within which all angles are treated as equivalent. The parameter N sets an upper bound on the number of points considered in the computation. In the experiments reported, N is typically much smaller for the query than for the searched image, and the value of s is often in the order of several thousand. However, the actual time complexity can be significant, as s may grow large. To see this, note that there can be Θ(N^3) similar triangles in both Q and I, implying s = Θ(N^6) in the worst case. We encountered such instances during our tests.
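The angle discretization described above can be sketched as follows; the bin key, the default C, and the name bin_triangles are illustrative assumptions rather than the exact layout used by Algorithm 1. Each triangle is keyed by its two smallest internal angles, discretized into 180/C-degree steps (the third angle is determined by the other two):

```python
import math
from collections import defaultdict
from itertools import combinations

def internal_angles(a, b, c):
    """Internal angles (degrees) of triangle abc via the law of cosines."""
    la, lb, lc = math.dist(b, c), math.dist(a, c), math.dist(a, b)
    A = math.degrees(math.acos((lb**2 + lc**2 - la**2) / (2 * lb * lc)))
    B = math.degrees(math.acos((la**2 + lc**2 - lb**2) / (2 * la * lc)))
    return A, B, 180.0 - A - B

def bin_triangles(points, C=180):
    """Assign every triangle over `points` to a bin keyed by its two
    smallest angles, discretized into 180/C-degree steps."""
    bins = defaultdict(list)
    for tri in combinations(points, 3):
        angles = sorted(internal_angles(*tri))
        key = (int(angles[0] * C / 180), int(angles[1] * C / 180))
        bins[key].append(tri)
    return bins

pts = [(0, 0), (4, 0), (0, 3), (2, 2)]
bins = bin_triangles(pts)
print(sum(len(v) for v in bins.values()))  # -> 4 triangles from 4 points
```

With all O(N^3) triangles of a point set stored in the bins, this structure is the source of the space bottleneck discussed next.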
A careful study of the space requirement of Algorithm 1 shows that its space complexity is O(C^2 + N^3), as there can be Θ(N^3) triangles from Q and I stored in the bins. While this complexity can be mitigated by imposing limits on corner-laden objects, its practicality diminishes when dealing with images containing numerous objects with many corners, exacerbating both the time and space complexities. In particular, the space requirement can become prohibitive. Although in many cases these parameters are not very large, the space complexity remains the most limiting factor when a high number of corners needs to be detected and processed in input images. It is not surprising that we encountered instances where Algorithm 1 failed to complete due to insufficient memory.
To address these challenges, our paper proposes a novel approach that leverages Delaunay triangulation: unlike [16], it considers only the triangles generated by the Delaunay triangulations of Q and I. The number of triangles considered is linear in the number of corner points, offering a substantial reduction in complexity. We demonstrate that focusing solely on the triangles within the Delaunay triangulation suffices for many practical cases in which a certain condition holds. The result is a more feasible and efficient object recognition algorithm, as validated by our empirical results. There are examples where our new method produces correct results efficiently, while Algorithm 1 cannot be executed due to insufficient memory.
Another contribution of our new algorithm is its efficient handling of rigid object transformations by considering all vertex permutations of all triangles in Q. This approach does not impact the asymptotic time or space complexity of the algorithm, as the number of considered triangles remains small relative to the input size.
For a given set of discrete points P, a Delaunay triangulation, denoted by DT(P) [18], is a triangulation in which no point in P lies inside the circumcircle of any triangle in DT(P). That is, for any triangle in DT(P), its circumcircle does not contain any other point in P. DT(P) satisfies several interesting properties. For example, in DT(P) for any P, the minimum angle among all the triangles is maximized. The Delaunay triangulation of N points has at most 2N - 2 - h triangles (approximately 2N), where h is the number of points on the convex hull.
Figure 2 illustrates an example problem instance in which a query object and a search image contain two matches to this query, labeled as 1 and 2. The Delaunay triangulations for both the query and the image are shown beneath them. Algorithm 1 examines all possible triangles and their pairwise matches to identify all matches for the query. In contrast, the newly proposed algorithm focuses only on triangles within the Delaunay triangulations and their pairwise matches. In this case, the query matches are successfully identified as long as at least one correct triangle is found and used as a reference for each query match. The Euclidean distances from the corners of a triangle uniquely identify a point in 2D. In Figure 2, after aligning each common pair of triangles indicated by the dashed lines, we calculate the Euclidean distances from the query's vertices to the corners of their reference triangle. Using these distances, along with the scaling factor between the matching triangles, we determine the positions of all other vertices in a potential match within the search image. In Figure 2, each of the two shaded triangles yields an instance, labeled as 1 and 2, of the query in the search image.
For a set of points in the plane, the Delaunay triangulation is unique if and only if no four points are cocircular, meaning that the circumcircle passing through any three points does not contain any other points from the set. In such cases, the Delaunay triangulation is well-defined and unique.
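The cocircularity condition can be tested with the classical in-circle determinant; the sketch below (with an illustrative tolerance eps) returns 0 exactly when the four points are cocircular, which is the case in which the Delaunay triangulation fails to be unique:

```python
def in_circumcircle(a, b, c, d, eps=1e-9):
    """Sign of the in-circle determinant for triangle abc (counterclockwise)
    and query point d: +1 inside, -1 outside, 0 cocircular."""
    rows = []
    for (x, y) in (a, b, c):
        rows.append((x - d[0], y - d[1],
                     (x - d[0])**2 + (y - d[1])**2))
    (ax, ay, aw), (bx, by, bw), (cx, cy, cw) = rows
    det = (ax * (by * cw - bw * cy)
           - ay * (bx * cw - bw * cx)
           + aw * (bx * cy - by * cx))
    return 0 if abs(det) < eps else (1 if det > 0 else -1)

# Four cocircular points on the unit circle: the determinant vanishes,
# so their Delaunay triangulation is not unique.
print(in_circumcircle((1, 0), (0, 1), (-1, 0), (0, -1)))    # -> 0
# Moving the query point toward the center puts it strictly inside.
print(in_circumcircle((1, 0), (0, 1), (-1, 0), (0, -0.5)))  # -> 1
```

The same predicate is what a Delaunay construction algorithm evaluates when deciding whether a triangle's circumcircle is empty.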
Let Q be the set of corner points for a query object and I be the set of corner points obtained from an input image.
A triangle t is a visible triangle of Q in I if t is in DT(Q) and there exists a matching triangle in DT(I). Two triangles are considered a match if they are similar (i.e., if they have the same angles). A triangle t is a yielding triangle for an instance of Q in I if t is a visible triangle of Q in I, and the vertices predicted and verified using the geometric feature, with t as a reference, yield an instance of Q in I. In Figure 2, two triangles in DT(Q) are highlighted with shaded interiors. The dashed lines connect these triangles to their corresponding matching triangles in DT(I). The highlighted triangles in DT(Q) are yielding triangles. It is important to note that other yielding triangles exist, but only two are shown for simplicity of illustration.
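The matching test between two triangles, as used in the definition of a visible triangle, can be sketched by comparing sorted internal angles, which makes the test invariant to scale and vertex labeling; the helper names and the tolerance are illustrative:

```python
import math

def angles_sorted(tri):
    """Sorted internal angles (radians) of a triangle, via the law of cosines."""
    a, b, c = tri
    la, lb, lc = math.dist(b, c), math.dist(a, c), math.dist(a, b)
    A = math.acos((lb**2 + lc**2 - la**2) / (2 * lb * lc))
    B = math.acos((la**2 + lc**2 - lb**2) / (2 * la * lc))
    return sorted((A, B, math.pi - A - B))

def is_match(t1, t2, tol=1e-6):
    """Triangles match when their sorted internal angles agree within a
    tolerance (similarity is scale-invariant)."""
    return all(abs(x - y) <= tol
               for x, y in zip(angles_sorted(t1), angles_sorted(t2)))

t = ((0, 0), (4, 0), (0, 3))
t_scaled = ((10, 10), (18, 10), (10, 16))  # same shape, doubled and shifted
t_other = ((0, 0), (4, 0), (2, 5))
print(is_match(t, t_scaled))  # -> True
print(is_match(t, t_other))   # -> False
```

In an implementation, real-valued angle comparisons always need such a tolerance, as noted later for all equality tests on real numbers.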
If an instance of Q in I contains all vertices of Q and does not overlap with any other object, then all triangles of Q can be found as visible triangles in I. In our proposed method, any of these triangles will yield a correct match. However, if the Delaunay triangulation for Q is not unique, there can still be a visible triangle in an instance of Q in I. A necessary and sufficient condition for detecting this instance is the existence of a yielding triangle from DT(Q) whose circumcircle, within the instance, does not encompass any point belonging to another (overlapping) object, including another instance of Q. This assumption is reasonable, as the absence of such a visible triangle precludes the possibility of detecting object instances within the given image. Furthermore, without this assumption, it is unclear whether any method could reliably determine or suggest the presence of an object instance.
We note that, in Q, every triangle formed by three vertices whose circumcircle does not contain any other vertex will be part of DT(Q), according to the definition of Delaunay triangulation. For a given triangle in DT(Q), if a similar triangle is detected in DT(I), it may indicate a local feature that could lead to a match with Q in I. At least some local features must be preserved in images when a match exists. More formally, a matching pair of triangles in DT(Q) and DT(I) can serve as a potential reference for an instance of Q in I.
The concepts of visible and yielding triangles are used solely for analyzing the performance of our proposed method. The existence of a yielding triangle in DT(Q) for an instance of Q in I implies that we find this instance if we examine all matching pairs of triangles in DT(Q) and DT(I) as references. Our proposed algorithm is based on comparing triangles in DT(Q) and DT(I) and determining whether they yield matches. By using a visible triangle in I and a matching triangle in Q as a reference, the proposed method can predict all other corners of the queried object for a potential instance of Q in I. Under certain practical assumptions, this approach identifies the same matches while requiring significantly less time and space. In some cases, the proposed method produces results even when Algorithm 1 fails to run due to its excessive memory requirements.
Let O1 and O2 be two rigid objects in 2D, represented as sets of points. We consider transformations on rigid objects [19]. We say that O1 and O2 in 2D are geometrically isomorphic if and only if there exists a rigid transformation T such that O2 = T(O1), where T is one of the following: translation, a shift of all points in O1 by a fixed vector v; rotation, a rotation of all points in O1 around a fixed point (e.g., the origin) by an angle θ; reflection, a mirroring of all points in O1 across a line of reflection; or a combination of these, for example, a reflection followed by a translation. These transformations preserve the geometry of rigid objects, ensuring that the distances and angles between all points remain unchanged. We also extend T to include the transformation resize, which scales O1 by a positive real factor r. After applying operations in T on O1, the angles and distance ratios are preserved. For any three points a, b, c in O1, the angles between the vectors in O1 are preserved in O2 between the vectors for the corresponding points T(a), T(b), T(c), respectively. For any two points a, b in O1, and the corresponding points T(a), T(b) in O2, respectively, the ratio of the distance between a and b to the distance between T(a) and T(b) is preserved. This holds for all pairs of points in the same object.
If O1 and O2 are geometrically isomorphic, then O1 and O2 contain the same triangles, with corresponding vertices, identical angles, and proportional side lengths.
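These invariants are easy to check numerically. The sketch below applies an illustrative similarity transformation (rotation, translation, and resize by r = 1.5) to a small point set and verifies that all pairwise distance ratios agree:

```python
import math

def transform(p, theta, shift, r):
    """Rotate by theta, scale by r, then translate by shift."""
    x, y = p
    xr = r * (x * math.cos(theta) - y * math.sin(theta)) + shift[0]
    yr = r * (x * math.sin(theta) + y * math.cos(theta)) + shift[1]
    return (xr, yr)

O1 = [(0, 0), (4, 0), (0, 3), (2, 1)]
O2 = [transform(p, math.radians(30), (5, -2), 1.5) for p in O1]

# All pairwise distances scale by the same factor r = 1.5, so the
# ratio d(a, b) / d(T(a), T(b)) is constant over every pair.
ratios = [math.dist(O1[i], O1[j]) / math.dist(O2[i], O2[j])
          for i in range(4) for j in range(i + 1, 4)]
print(all(abs(q - ratios[0]) < 1e-9 for q in ratios))  # -> True
print(round(1 / ratios[0], 6))  # -> 1.5, the resize factor
```

The same check with three points and angles (instead of distance pairs) confirms angle preservation.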
Definition 2. Compute M(f(Q), I, H) for every f in T, where f(Q) denotes applying the transformation f from T to Q.
The formulation in Definition 2 considers all possible transformations of Q. Any triangle in Q can be chosen as a reference. Applying the transformation to Q is equivalent to transforming the reference triangle and computing the corresponding points to match with respect to this transformed triangle. We compute this using the set of triangles obtained from the Delaunay triangulation and their possible transformations. Algorithm 1 utilizes the geometric feature defined in Definition 1 and implements an approach that is invariant to both scaling and rotation. Scaling invariance is achieved by computing the scaling factor between two matching triangles and uniformly applying this factor to all distances.
To achieve invariance under all transformations in T, we consider all vertex permutations of a matching triangle in DT(Q) with a given triangle in DT(I) whenever a new pair of matching triangles is identified.
Applying T to all possible triangles in Q collectively has the same effect as applying T directly to Q. A rigid transformation T composed of translation, rotation, and reflection preserves distances and angles: when T is applied to the set of points Q, it moves each point while maintaining the geometric relationships among them, producing the set T(Q). Every triangle formed by points of Q is a subset of Q, and applying T to a triangle means applying T to its three vertices, yielding a triangle that has undergone the same transformation as the individual points. Hence, transforming every triangle in Q corresponds to transforming all points in Q, and the collective effect of applying T to all possible triangles equals transforming Q as a whole. The transformation of the points in Q inherently transforms all the triangles formed by those points, preserving the geometric structure.
Our observation is that any transformation based on a reference triangle can be applied to the entire set Q using this reference triangle and the geometric feature in Definition 1, as all angles and length ratios are preserved under these transformations. Comparing Q and f(Q) for f in T is equivalent to comparing Q and the portion of Q associated with the reference triangle g found in I after applying f to g. This comparison can also be achieved by accounting for all possible configurations of g.
For a given triangle t, let Perm(t) be the set of the six vertex orderings of t. We note that for any rigid transformation f in T applied to t, the vertex configuration of f(t) corresponds to an ordering in Perm(t). Based on this, we reformulate the objective in Definition 2 as follows:
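Enumerating the vertex orderings of a reference triangle is a one-liner with the standard library; the helper name is illustrative. Reflections correspond to reversing the vertex order, which is one of the six permutations:

```python
from itertools import permutations

def vertex_permutations(tri):
    """The six vertex orderings of a reference triangle. Checking a
    matching triangle against all of them makes the comparison
    invariant to rotation and reflection."""
    return list(permutations(tri))

t = ((0, 0), (4, 0), (0, 3))
perms = vertex_permutations(t)
print(len(perms))  # -> 6
# A reflection of t corresponds to reversing its vertex order,
# which is one of the six orderings.
print(tuple(reversed(t)) in perms)  # -> True
```

Since each triangle contributes only a constant factor of six, this enumeration does not change the asymptotic complexity.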
Definition 3. Compute M(Q, I, H) over all vertex permutations g in Perm(t) of each reference triangle t such that |M| / |Q| >= H.
We perform preprocessing before searching for a rigid object in an image. To extract the corner points Q and I from the query and search input images, respectively, we use the Shi–Tomasi Corner Detector [20] implemented in the Python (version 3.11) function cv.goodFeaturesToTrack(). This function applies a slight modification to the Harris Corner Detector and produces better results. It is based on computing the minimum eigenvalue of the structure tensor (second-moment matrix). The dominant operations involve computing image gradients and eigenvalue calculations, both of which run in linear time. Thus, its time and space complexities are O(M), where M is the number of pixels. From the detected corner points, we use the Delaunay triangulation implementation in Python [21] to generate Delaunay triangulations. This implementation is based on an incremental convex hull algorithm for triangulation. For N input points in 2D, the Delaunay triangulation is computed in O(N log N) expected time and O(N^2) worst-case time using O(N) space. Corner detection is a crucial step in both Algorithm 1 and our newly proposed Algorithm 2. Although Delaunay triangulation is an additional preprocessing step for Algorithm 2, its computational cost is not significant in the overall complexity of our approach. We propose an object matching algorithm that takes as input the Delaunay triangulations DT(Q) and DT(I), generated from Q and I, respectively.
By applying Delaunay triangulation, we obtain a list of vertices along with their neighboring vertices. Using this information, we construct a list of triangles, generating DT(Q) and DT(I). As in Algorithm 1, we assume that both |Q| and |I| are O(N), and therefore, |DT(Q)| and |DT(I)| are also O(N).
Algorithm 2 takes Q, I, DT(Q), and DT(I) as input. It finds and returns all matches of Q in I based on the assumption that DT(Q) defines the object Q. Every match of DT(Q) found in I corresponds to an instance of Q in I.
The input threshold H is a parameter in [0, 1] that is used to separate significant matches from other insignificant partial matches.
The loop in Lines 1–10 considers each triangle ti in DT(I) and compares it with all permutations of every triangle tq in DT(Q) in the loop in Lines 2–9. Line 4 computes the set M of predicted points in I, as described in Definition 1, using the dynamic programming approach that was also used in Algorithm 1. This computation is performed by first matching the two triangles, ti and tq, and then extending the match by one verified point at a time. The position of each new point in I is predicted based on the triangle feature descriptor and subsequently verified. This iterative process is a key aspect that characterizes our algorithm as a dynamic programming algorithm.
Algorithm 2: Algorithm for finding all matches of Q in I.
![Mathematics 13 00925 i002]()
From the similar triangles tq and ti, the scale ratio r is computed as the ratio of their corresponding side lengths. For each point p in Q, the transformed coordinates of the corresponding point x in I are determined using the scaled distance triplet (r·d(p, a), r·d(p, b), r·d(p, c)) along with the vertices of ti. This transformation is achieved by finding the intersection of three circles centered at the vertices of ti, with respective radii r·d(p, a), r·d(p, b), and r·d(p, c). This is done by creating a system of three quadratic equations and solving it. In this work, we improved the solution of the resulting system of equations by developing a different sequence of intermediate calculations, as shown in Appendix A, making the solutions more numerically robust compared to those in Algorithm 1. For equality tests involving real numbers, we consistently applied tolerance values throughout our implementation.
The coordinates of each point x of Q, within a potential match in I, are computed with an error tolerance, adjusted dynamically based on the scaling factor, for each pair of similar triangles. Each computed position is then checked within I to determine if there exists a point in I within the tolerance distance. If such a point is found, it is considered a corresponding match for x in a potential alignment. Initially, every similar triangle pair contributes three point matches due to the alignment of their corner points.
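The tolerance-based verification step can be sketched as follows; the function name, the sample coordinates, and the fixed tolerance are illustrative (in our implementation the tolerance is adjusted dynamically with the scaling factor):

```python
import math

def verify_predictions(predicted, image_points, tol):
    """Match each predicted position to its nearest corner point of I
    within the tolerance distance; returns the verified pairs."""
    verified = []
    for x in predicted:
        near = [p for p in image_points if math.dist(x, p) <= tol]
        if near:
            verified.append((x, min(near, key=lambda p: math.dist(x, p))))
    return verified

I = [(0, 0), (4.01, 0), (0, 2.99), (2, 1)]
predicted = [(0, 0), (4, 0), (0, 3), (9, 9)]  # the last point has no match
pairs = verify_predictions(predicted, I, tol=0.05)
print(len(pairs))             # -> 3 of 4 predictions verified
print(len(pairs) / 4 >= 0.7)  # -> True: ratio test with a threshold H = 0.7
```

The count of verified pairs is exactly what the ratio test of the next step consumes.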
Line 5 first evaluates the total number of matched points, including additional pairs beyond the triangle corners, and verifies whether the condition |M| / |Q| >= H is satisfied. We note that since verification is done in I, the computed set is described as the intersection of the predicted points and I. If the tested condition holds, Line 6 reports this set as a new match. It is easy to see that all elements of the objective set in Definition 3 will be reported in this manner.
The loop in Lines 1–10 iterates O(N) times. Similarly, the loop in Lines 2–9 also iterates O(N) times. The loop in Lines 3–8 iterates 6 times. In Line 4, solving a system of three quadratic equations for each point requires constant time. Therefore, Line 4, as well as Lines 5 and 6, each run in time O(N). Thus, we conclude that the overall time complexity of the algorithm is O(N^3). The space requirement of the algorithm is O(N), since the number of triangles considered from Q and I is O(N). The pairwise comparisons and matching-related computations are performed using O(N) space. This represents a significant improvement over Algorithm 1, which has a space complexity of O(C^2 + N^3), where C is a discretization factor for angles.
A post-processing of this set can be performed to report all instances of Q in I. In this work, we follow the same objective as Algorithm 1; however, we propose a method that does not consider all possible triangles in Q and I.
Unlike Algorithm 1, the new algorithm does not consider every possible triangle from Q and I. Provided that a triangle in DT(Q) is visible in I, that is, a matching triangle is included in DT(I), the new algorithm will not miss any instance of Q in I. This holds true for all practical purposes. For this to fail, objects would need to overlap at every triangle in Q formed by adjacent neighbors. Under such conditions, it is unclear whether an instance of Q exists in I at all.