Article

Toward Efficient Edge Detection: A Novel Optimization Method Based on Integral Image Technology and Canny Edge Detection

by Yanqin Li 1 and Dehai Zhang 1,2,*
1 Mechanical and Electrical Engineering Institute, Zhengzhou University of Light Industry, Zhengzhou 450002, China
2 Henan Key Laboratory of Intelligent Manufacturing of Mechanical Equipment, Zhengzhou 450002, China
* Author to whom correspondence should be addressed.
Processes 2025, 13(2), 293; https://doi.org/10.3390/pr13020293
Submission received: 3 January 2025 / Revised: 17 January 2025 / Accepted: 20 January 2025 / Published: 21 January 2025

Abstract

The traditional SIFT (Scale Invariant Feature Transform) registration algorithm is highly regarded in the field of image processing due to its scale invariance, rotation invariance, and robustness to noise. However, it faces challenges such as a large number of feature points, high computational demand, and poor real-time performance when dealing with large-scale images. A novel optimization method based on integral image technology and Canny edge detection is presented in this paper, aiming to maintain the core advantages of the SIFT algorithm while reducing the complexity involved in image registration computations, enhancing the efficiency of the algorithm for real-time image processing, and better adapting it to the needs of large-scale image handling. Firstly, Gaussian separation techniques were used to simplify Gaussian filtering, followed by the application of integral image techniques to accelerate the construction of the entire pyramid. Additionally, during the feature point detection phase, an innovative feature point filtering strategy was introduced by combining Canny edge detection with dilation operations alongside the traditional SIFT approach, aiming to reduce the number of feature points and thereby lessen the computational load. The method proposed in this paper takes 0.0134 s for Image type a, 0.0504 s for Image type b, and 0.0212 s for Image type c. In contrast, the traditional method takes 0.1452 s for Image type a, 0.5276 s for Image type b, and 0.2717 s for Image type c, resulting in reductions of 0.1318 s, 0.4772 s, and 0.2505 s, respectively. A series of comparative experiments showed that the time taken to construct the Gaussian pyramid using our proposed method was consistently lower than that required by the traditional method, indicating greater efficiency and stability regardless of image size or type.

1. Introduction

In numerous fields, such as medical image processing, remote sensing technology, pattern recognition, and defect detection, the crucial role of image registration technology is increasingly evident [1,2,3,4]. It involves computing the transformation model between different images to establish their corresponding geometric relationships, aligning the overlapping parts. Existing image registration methods can be classified into two main categories: intensity-based and feature-based methods [5,6]. Intensity-based image registration is renowned for its simplicity, intuitiveness, and computational efficiency, primarily relying on the pixel intensity information of images for registration. However, it is less effective when dealing with images with excessive noise or indistinct structures [7,8,9]. In contrast, feature-based registration methods precisely extract key features in images, such as edges and corners, and combine similarity measures with various constraints to ascertain the geometric transformation relationships between images. This leads to more accurate image registration, effectively overcoming the limitations of intensity-based methods. These methods demonstrate superior performance in complex scenarios with significant lighting changes or the need for multi-sensor data analysis, as illustrated in refs. [10,11].
The development of feature-based matching algorithms has led to many classic methods. In 1988, the Harris algorithm proposed by Harris et al. emerged as a classic feature point detection algorithm [12]. This algorithm extracts a moderate number of feature points with low computational load, is robust to image rotation and changes in viewing angle, and offers good noise resistance. However, it detects feature points at a single scale and lacks scale invariance, making it unsuitable for applications with significant image scale variations. In 2004, Lowe introduced the Scale Invariant Feature Transform (SIFT) algorithm, characterized by its ability to extract distinct features with scale invariance and robustness to noise and other interferences, hence its widespread use in the field of image processing. However, the SIFT algorithm generates a large number of feature points, requires extensive computation, and has long processing times, consuming substantial system resources and resulting in poor real-time performance [13,14,15].
To address the issue of long computation time and poor real-time performance of the SIFT algorithm, Ke et al. introduced the PCA-based SIFT (PCA-SIFT) algorithm in the same year. Initially, for each image patch, the gradient image vector is computed and reduced to a lower-dimensional feature vector using Principal Component Analysis (PCA). Based on the sub-pixel location, scale, and dominant orientation of keypoints provided by the SIFT algorithm, a 41 × 41 image block is extracted from the image, rotated to a canonical orientation, and projected onto the computed feature vector space to obtain a smaller and more compact feature vector. The Euclidean distance between two feature vectors is calculated to determine if they correspond to the same keypoint in different images. Although PCA-SIFT excels in computational efficiency and reduced storage requirements, it has some drawbacks. First, the PCA dimensionality reduction process, by selecting principal components (main directions of variance) in data, reduces dimensions but may overlook certain details and features, leading to information loss and affecting the algorithm’s robustness. Second, PCA is sensitive to outliers, diminishing matching performance in the presence of noise or anomalies. To better ensure uniform distribution of features in terms of location and scale, and to enhance the stability and accuracy of matching, Sedaghat et al. proposed the uniformly processed SIFT algorithm in 2011. The core idea of this algorithm is a novel SIFT feature selection strategy based on uniform distribution in location and scale, isolating feature quality through stability and uniqueness constraints. Initially, SIFT features with good distribution characteristics are extracted through the feature selection strategy. Then, a preliminary cross-matching process is introduced, and consistency is checked using a projective transformation model. However, the introduced uniform processing increases computational complexity in handling large-scale image data, resulting in poor real-time performance. Additionally, the uniformly processed SIFT algorithm is limited if the image has significant non-rigid deformations. In 2022, Yu et al. proposed a heterogeneous image matching algorithm based on an improved SIFT algorithm. Firstly, during feature point detection, grids are set for each layer in the scale space with weight coefficients, and a quadtree method is used in conjunction with the image’s phase response intensity map to select uniformly distributed and stable feature points. Secondly, descriptors are reconstructed, and standardized Euclidean distance is used to measure feature descriptors, employing a bidirectional matching strategy for coarse matching. Finally, the Random Sample Consensus (RANSAC) algorithm is used for refinement. Due to the introduction of multi-level processing and the quadtree method, the algorithm’s computational complexity is relatively high, posing challenges for large-scale image processing [16,17,18,19].
The above methods often involve global optimization techniques, and many require high-performance computing environments to achieve reasonable processing times. They also face challenges such as a large number of feature points, high computational demand, and poor real-time performance when dealing with large-scale images. This paper presents a novel optimization method based on integral image technology and Canny edge detection, which aims to maintain the core advantages of the SIFT algorithm while reducing the complexity involved in image registration computations, enhancing the efficiency of the algorithm for real-time image processing, and better adapting it to the needs of large-scale image handling. Firstly, Gaussian separation techniques were used to simplify Gaussian filtering, followed by the application of integral image techniques to accelerate the construction of the entire pyramid. Additionally, during the feature point detection phase, an innovative feature point filtering strategy was introduced by combining Canny edge detection with dilation operations alongside the traditional SIFT approach, aiming to reduce the number of feature points and thereby lessen the computational load. A series of comparative experiments showed that the time taken to construct the Gaussian pyramid using the improved SIFT method was consistently lower than that required by the traditional SIFT approach, indicating greater efficiency and stability regardless of image size or type.
Section 2, titled “Related Work”, synthesizes the existing literature and identifies the gaps that our research aims to address. Section 3, “Methodology”, details the theoretical framework and procedural steps of our study. Section 4, “Experimental Evaluation”, presents the results of our experiments and a thorough analysis. Finally, Section 5, “Conclusions”, summarizes our findings, discusses their implications, and suggests avenues for future research.

2. Related Work

In order to simultaneously preserve the advantages of the SIFT algorithm, such as scale invariance, rotational invariance, illumination invariance, rich feature descriptors, high robustness, and resistance to noise interference, and to enhance the computational efficiency of image registration, this paper effectively optimizes two key stages within the SIFT algorithm, as shown in Figure 1.
Initially, in the Gaussian pyramid construction phase, the separability of Gaussian filtering and the characteristics of integral images were utilized to successfully accelerate the construction of the Gaussian pyramid. This not only preserves the advantages of the SIFT algorithm but also enhances computational efficiency, making the algorithm more suitable for processing large-scale images. Subsequently, in the feature point detection phase, the combination of Canny edge detection and dilation operations was employed to filter feature points, effectively improving the accuracy and reliability of the feature points. This not only enables the SIFT algorithm to more precisely locate feature points in images but also reduces unnecessary redundant calculations, thereby further optimizing computational efficiency.
Table 1 summarizes the differences between the two methods.
The proposed method, which combines SIFT with Canny edge detection and integral image technology, addresses the challenges of the traditional SIFT algorithm by reducing computational complexity and improving real-time performance. The use of Gaussian separation techniques, integral image technology, and an innovative feature point filtering strategy makes the method more efficient and suitable for large-scale image processing and real-time applications.

3. Methodology

3.1. Improved Gaussian Pyramid Construction Based on SIFT

The SIFT algorithm is broadly divided into four steps: construction of the scale space, keypoint detection, determination of the main orientation of feature points, and generation of feature descriptors [25,26,27]. In the construction of the Difference of Gaussian (DoG) scale space, the first step is to construct the Gaussian scale space. The Gaussian scale space of an image can be obtained through convolution of the image with Gaussian kernels of varying scales. The convolution of a two-dimensional image $I(x, y)$ with a Gaussian kernel can be represented as follows:

$$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y),$$ (1)

where $L(x, y, \sigma)$ represents the Gaussian-smoothed image at a specific scale $\sigma$ in the Gaussian pyramid, obtained by convolving the original image $I(x, y)$ with the two-dimensional Gaussian kernel $G(x, y, \sigma)$. The two-dimensional Gaussian distribution is defined as follows:

$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^2} \, e^{-\frac{x^2 + y^2}{2\sigma^2}},$$ (2)

where $x$ and $y$ denote the offset from the mean in each of the two dimensions and $\sigma$ is the standard deviation.
However, constructing the Gaussian pyramid often involves numerous convolution operations, leading to high computational complexity. To enhance computational efficiency, the construction of the Gaussian pyramid is accelerated by leveraging the separability of Gaussian filtering and the properties of integral images. The Gaussian filtering is first decomposed into horizontal and vertical directions, followed by the use of integral images to expedite the convolution process.

3.1.1. Gaussian Separation

The one-dimensional Gaussian distribution is given as follows:

$$G(x, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{x^2}{2\sigma^2}},$$ (3)

where $x$ is the offset from the mean and $\sigma$ is the standard deviation.
The definition of the one-dimensional Gaussian distribution is extended to two dimensions, forming a two-dimensional Gaussian distribution. For simplicity, it is assumed that the standard deviation is equal in both directions: $\sigma_x = \sigma_y = \sigma$.
To separate variables from the two-dimensional Gaussian filter kernel, two one-dimensional Gaussian distributions are introduced as follows:

$$G(x, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{x^2}{2\sigma^2}}, \qquad G(y, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{y^2}{2\sigma^2}},$$ (4)

The two-dimensional Gaussian distribution can then be represented as the product of the two one-dimensional Gaussian distributions:

$$G(x, y, \sigma) = G(x, \sigma) \, G(y, \sigma).$$ (5)
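As a quick numerical check of Equation (5), the following sketch (our illustration, not the authors' code; the scale σ = 1.6 and kernel radius of 4 are illustrative choices) builds the 2D kernel directly from Equation (2) and compares it with the outer product of two sampled 1D kernels:

```python
import numpy as np

sigma, radius = 1.6, 4  # illustrative values, not taken from the paper
x = np.arange(-radius, radius + 1)

# Sampled 1D kernel G(x, sigma) from Equations (3) and (4)
g1d = np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

# 2D kernel built directly from the 2D Gaussian of Equation (2)
xx, yy = np.meshgrid(x, x)
g2d = np.exp(-(xx**2 + yy**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)

# Equation (5): the 2D kernel equals the outer product of the 1D kernels
assert np.allclose(g2d, np.outer(g1d, g1d))
```

Because the outer product reproduces the 2D kernel exactly, one 2D Gaussian convolution can be replaced by two 1D passes, which is exploited next.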

3.1.2. Accelerated Convolution of Integral Images

The convolution of a two-dimensional image with a two-dimensional Gaussian filter kernel is as follows:

$$R(x, y) = \sum_{i=-k}^{k} \sum_{j=-k}^{k} G(i, j, \sigma) \, I(x - i, y - j),$$ (6)

where $R(x, y)$ denotes the convolution result, $G(i, j, \sigma)$ denotes the value at position $(i, j)$ in the two-dimensional Gaussian kernel, $I(x - i, y - j)$ denotes the value of the original image at position $(x - i, y - j)$, $i, j$ denote the indices within the Gaussian kernel, and $k$ is the radius of the Gaussian kernel determined based on $\sigma$.
Now, a one-dimensional convolution is first performed in the horizontal direction:

$$R_x(t, y, \sigma) = \sum_{i=-k}^{k} G(i, \sigma) \, I(t - i, y),$$ (7)

where $t$ denotes the index in the horizontal direction.
Next, a one-dimensional convolution is performed in the vertical direction as follows:

$$R_{xy}(x, y, \sigma) = \sum_{j=-k}^{k} G(j, \sigma) \, R_x(x, y - j),$$ (8)
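The two-pass scheme of Equations (7) and (8) maps directly onto OpenCV's separable filtering. The sketch below (a minimal illustration on a random stand-in image; the 9-tap kernel size is an assumed value) verifies that a horizontal pass followed by a vertical pass reproduces the full 2D convolution of Equation (6), at a cost of roughly 2k instead of k² multiplications per pixel:

```python
import cv2
import numpy as np

sigma, ksize = 1.6, 9                      # illustrative values
img = np.random.rand(480, 640).astype(np.float32)  # stand-in image

g = cv2.getGaussianKernel(ksize, sigma)    # sampled 1D Gaussian, shape (ksize, 1)
g2d = g @ g.T                              # 2D kernel per Equation (5)

full = cv2.filter2D(img, -1, g2d)          # direct 2D convolution, Equation (6)
two_pass = cv2.sepFilter2D(img, -1, g, g)  # horizontal pass (7), then vertical pass (8)

assert np.allclose(full, two_pass, atol=1e-5)
```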
An integral image is the cumulative sum of the pixel values of a two-dimensional image. It is defined as follows:

$$S(x, y) = \sum_{i=0}^{x} \sum_{j=0}^{y} I(i, j),$$ (9)

The integral image can be computed recursively. For the first row ($y = 0$):

$$S(x, 0) = \sum_{i=0}^{x} I(i, 0),$$ (10)

For the first column ($x = 0$):

$$S(0, y) = \sum_{j=0}^{y} I(0, j),$$ (11)

For the general case of $S(x, y)$, the calculation involves summing all the pixel values to the left and above the current point, and then subtracting the part that was double-counted in the upper-left corner. The formula is expressed as follows:

$$S(x, y) = S(x, y - 1) + S(x - 1, y) - S(x - 1, y - 1) + I(x, y),$$ (12)

In the formula, $S(x, y - 1)$ and $S(x - 1, y)$ are the integral image values to the left and above the current point, respectively, $S(x - 1, y - 1)$ is the integral image value of the overlapping part in the upper-left corner, and $I(x, y)$ is the pixel value of the current point.
Assuming that the sum of a rectangular area needs to be calculated, with the upper-left corner coordinates $(A, B)$ and the lower-right corner coordinates $(C, D)$, the sum of this rectangular area can be calculated using the following formula:

$$\mathrm{Sum} = S(C, D) - S(C, B - 1) - S(A - 1, D) + S(A - 1, B - 1),$$ (13)
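A minimal NumPy sketch of Equations (9) and (13) (our illustration, not the authors' implementation) shows the point of the construction: after two cumulative sums, the sum over any rectangle costs just four lookups, independent of the rectangle's size:

```python
import numpy as np

def integral_image(img):
    """S(x, y) of Equation (9): cumulative sums along both axes."""
    return img.cumsum(axis=0).cumsum(axis=1)

def box_sum(S, A, B, C, D):
    """Sum over the rectangle with corners (A, B) and (C, D), Equation (13),
    guarding the image border where the -1 indices would fall outside."""
    total = S[C, D]
    if A > 0:
        total -= S[A - 1, D]
    if B > 0:
        total -= S[C, B - 1]
    if A > 0 and B > 0:
        total += S[A - 1, B - 1]
    return total

img = np.arange(16.0).reshape(4, 4)        # tiny stand-in image
S = integral_image(img)
assert box_sum(S, 1, 1, 2, 2) == img[1:3, 1:3].sum()  # both give 30.0
```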

3.2. Integration of Canny Edge Detection into the Traditional SIFT Algorithm

The SIFT algorithm detects feature points in images and extracts their descriptors, enabling the identification of similar feature targets across different images. However, in complex scenes, images may contain a large amount of detail and texture, leading to an excessive number of feature points detected by SIFT, which can slow down the registration process [28,29,30]. To enhance the performance of SIFT in registering images in complex scenes, this paper utilizes Canny edge detection to extract edge information from images and employs dilation operations. By reducing the number of feature points while ensuring registration accuracy, the speed of the algorithm is increased, and more stable image registration is achieved in complex settings.
The Canny edge detection algorithm operates by locating the local maxima of image gradients, using high and low thresholds to differentiate between strong and weak edges. This approach effectively minimizes the effects of noise and ensures the detection of genuine weak edges. The core process involves smoothing the image with an approximate Gaussian function, $f_s(x, y) = f(x, y) * G(x, y)$, followed by employing a directional first-order derivative operator to pinpoint where the derivatives are maximized. After the smoothing process, the image gradients of $f_s(x, y)$ are approximated using a 2 × 2 first-order finite difference formula, as detailed in Equation (14):

$$P[i, j] \approx \left( f_s[i, j+1] - f_s[i, j] + f_s[i+1, j+1] - f_s[i+1, j] \right) / 2,$$
$$Q[i, j] \approx \left( f_s[i, j] - f_s[i+1, j] + f_s[i, j+1] - f_s[i+1, j+1] \right) / 2,$$ (14)
By calculating the average of finite differences within a 2 × 2 square region, the partial derivatives of the gradients in the x and y directions at the same point in the image are obtained. The magnitude and angle of the gradient are then calculated by converting from Cartesian coordinates to polar coordinates.
$$M[i, j] = \sqrt{P[i, j]^2 + Q[i, j]^2}, \qquad \theta[i, j] = \arctan\left( Q[i, j] / P[i, j] \right),$$ (15)

where $M[i, j]$ represents the edge strength in the image and $\theta[i, j]$ denotes the orientation of the edges: $M[i, j]$ achieves a local maximum along the direction angle $\theta[i, j]$, which indicates the direction of the edge.
Next, the gradient magnitudes are subjected to Non-Maxima Suppression (NMS), which extracts pixels that have the greatest gradient magnitude along their respective gradient directions.
$$\xi[i, j] = \mathrm{Sector}(\theta[i, j]), \qquad N[i, j] = \mathrm{NMS}(M[i, j], \xi[i, j]),$$ (16)

where $\xi[i, j]$ calibrates the gradient direction, dividing it into four ranges (labeled 0, 1, 2, or 3) based on the size of the directional angle $\theta[i, j]$. Non-maxima suppression is applied for each direction: if the gradient magnitude of a pixel’s neighboring pixels along its gradient direction is greater than that of the pixel itself, the pixel’s value is set to zero.
Next, edge detection and connection are performed using a dual-threshold algorithm. This process involves applying two different threshold values to the image.
$$\tau_2 = 2\tau_1,$$ (17)

where $\tau_2$ determines the edges, while $\tau_1$ tracks breaks in the edges. Pixels with gradient magnitudes greater than $\tau_2$ are definitively edges; those less than $\tau_1$ are definitively not edges; for pixels with gradient magnitudes between $\tau_1$ and $\tau_2$, the decision to classify them as edges depends on whether there are neighboring pixels with values exceeding the higher threshold.
The process of integrating Canny edge detection with an improved traditional SIFT algorithm is illustrated in Figure 2. Initially, an image is loaded, followed by the detection of feature points using an algorithm that builds on the improved Gaussian pyramid construction of SIFT. Subsequently, the loaded image is converted to grayscale, and Canny edge detection is applied. The edges are then expanded through dilation operations. Finally, only those SIFT feature points located within the white areas of the Canny edge detection results are retained, effectively filtering out feature points from other areas. These filtered feature points are then plotted on the original image.
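The following sketch outlines this pipeline in Python with OpenCV. It is our illustration under stated assumptions rather than the paper's released code: the stock cv2.SIFT_create() stands in for the improved pyramid construction of Section 3.1, and the Canny thresholds (100, 200) and the 5 × 5 dilation kernel are illustrative values that the paper does not specify:

```python
import cv2
import numpy as np

def filter_keypoints_by_edges(image_path, canny_low=100, canny_high=200,
                              dilate_kernel=5, dilate_iters=1):
    """Keep only SIFT keypoints that fall on the dilated Canny edge mask.

    Thresholds and kernel size are illustrative assumptions; standard
    cv2.SIFT_create() stands in for the improved pyramid of Section 3.1.
    """
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # 1. Detect SIFT keypoints on the full image.
    sift = cv2.SIFT_create()
    keypoints = sift.detect(gray, None)

    # 2. Canny edge map, then dilation to widen the edge bands.
    edges = cv2.Canny(gray, canny_low, canny_high)
    kernel = np.ones((dilate_kernel, dilate_kernel), np.uint8)
    mask = cv2.dilate(edges, kernel, iterations=dilate_iters)

    # 3. Retain keypoints lying in the white (edge) regions of the mask.
    kept = [kp for kp in keypoints
            if mask[int(round(kp.pt[1])), int(round(kp.pt[0]))] > 0]
    return img, kept

# Example usage (hypothetical file name):
# img, kps = filter_keypoints_by_edges("image_a.png")
# out = cv2.drawKeypoints(img, kps, None, color=(0, 255, 0))
```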

3.3. Image Alignment

Utilizing the improved SIFT algorithm, feature points of the original image, as well as rotated or translated images, are identified. For each feature point, the gradient magnitude and direction of its surrounding area are calculated. Then, based on the orientation and scale of the feature point, its feature descriptor is computed. The Euclidean geometric distance between two sets of feature descriptors is determined, as shown in Equation (18).
$$D(r, s) = \left\| D_r - D_s \right\| = \sqrt{ \sum_{i=1}^{n} \left( D_r[i] - D_s[i] \right)^2 },$$ (18)

where $r$ and $s$ represent the feature points of the rotated or translated image and the original image, respectively, while $D_r$ and $D_s$ represent their $n$-dimensional feature descriptors.
The two feature points with the smallest Euclidean geometric distance are selected, and their distance ratio r is calculated, as shown in Equation (19).
$$r = \frac{d_1}{d_2} < T,$$ (19)

where $d_1$ and $d_2$ are the Euclidean distances of the feature point to its nearest and next-nearest neighbors, respectively, and $T$ is the threshold. If the ratio $r$ exceeds the threshold $T$, the pair is identified as an incorrect match and filtered out. Conversely, if the ratio $r$ is less than the threshold $T$, it is deemed a correct match and retained.
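In OpenCV, this nearest/next-nearest ratio test corresponds to a brute-force matcher queried with k = 2. A minimal sketch (our illustration; the threshold T = 0.75 is a commonly used value, not one specified in the paper):

```python
import cv2

def ratio_test_matches(desc1, desc2, T=0.75):
    """Lowe's ratio test of Equation (19); T = 0.75 is an assumed value."""
    bf = cv2.BFMatcher(cv2.NORM_L2)              # Euclidean distance, Equation (18)
    candidates = bf.knnMatch(desc1, desc2, k=2)  # nearest and next-nearest neighbor
    good = []
    for pair in candidates:
        # Keep the match only when r = d1 / d2 stays below the threshold T.
        if len(pair) == 2 and pair[0].distance / pair[1].distance < T:
            good.append(pair[0])
    return good
```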
The primary concept of the RANSAC algorithm is to obtain the optimal solution through iterative computation [31]. In image registration, the RANSAC algorithm is used to determine the coordinates of matching point pairs between images, thereby establishing the transformation relationship between the two images:
$$\begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = \begin{pmatrix} h_0 & h_1 & h_2 \\ h_3 & h_4 & h_5 \\ h_6 & h_7 & 1 \end{pmatrix} \begin{pmatrix} x^* \\ y^* \\ 1 \end{pmatrix},$$ (20)

In the equation, $(x, y)$ and $(x^*, y^*)$ represent the feature points extracted from the two images, respectively. The steps of the algorithm are as follows:
  • Initialization of Matched Pair Set: Form a set N from the feature point pairs of the two local images.
  • Random Sample Selection: Since at least four point pairs are needed to estimate an affine or perspective transformation, randomly choose four matched pairs from the set N.
  • Model Parameter Solution: Based on these four pairs of points, calculate the eight unknown parameters in transformation Equation (20) to form a preliminary transformation matrix H.
  • Error Calculation for Other Samples in Set N: Calculate the error η for the other sample points in the set using the preliminary transformation matrix H. Set a threshold T; if the error η is less than or equal to T, these points are identified as inliers, i.e., correct matches. Conversely, matches that do not meet this criterion are considered outliers.
  • Inlier Set Formation: Group the matched pairs with an error less than T into the inlier set S and recalculate the transformation matrix H* using the least squares method from Equation (20).
  • Iterative Optimization: Repeat the above process K times, retaining the transformation matrix with the highest number of inliers from each iteration. This matrix is considered the optimal spatial transformation model.
Use the RANSAC algorithm to further process the matched points after the first round of filtering, removing mismatched points and ensuring that at least four remaining matched points are available. With these four feature points, the transformation matrix between the two images can be calculated. The transformation matrix, as specified in Equation (20), uses affine transformation to map all pixels of the sampled image to the template image, thereby achieving registration between the two images.
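A compact sketch of this estimation step using OpenCV's built-in RANSAC (our illustration, not the authors' implementation; the 5.0-pixel reprojection threshold plays the role of the error bound T and is an assumed value):

```python
import cv2
import numpy as np

def register(sample, template, matches, kp_sample, kp_template):
    """Estimate the transformation of Equation (20) with RANSAC and warp
    the sampled image onto the template image."""
    if len(matches) < 4:                          # four pairs are the minimum
        raise ValueError("need at least four matched pairs")
    src = np.float32([kp_sample[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_template[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    # Reprojection threshold of 5.0 px stands in for the error bound T.
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = template.shape[:2]
    return cv2.warpPerspective(sample, H, (w, h)), inliers
```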

4. Experimental Evaluation

To validate the effectiveness of the algorithm, the experiments were conducted on a computer equipped with an Intel Core i5-10400F CPU (Intel, Santa Clara, CA, USA), 16 GB of memory, and an NVIDIA GeForce GTX 1660 GPU (NVIDIA Corporation, Santa Clara, CA, USA).

4.1. Experiment on the Efficiency of the Improved SIFT Gaussian Pyramid Construction

The purpose of the experiment is to compare the performance of the improved SIFT Gaussian pyramid construction method with the traditional SIFT Gaussian pyramid construction method. The main parameters affecting the efficiency of Gaussian pyramid construction are σ (sigma), levels, and image size.
In the OpenCV implementation of the SIFT algorithm, the parameters for σ and levels are typically fixed. According to the official documentation, the default setting for σ is 1.6, and each octave in the scale space usually contains three levels. Therefore, the experiments adhere to this standard, setting σ = 1.6 and levels = 3. Another significant parameter that affects the efficiency of Gaussian pyramid construction is the image size. To further verify the effectiveness of the algorithm, experiments were conducted with three different types and sizes of images: Image a (924 × 785 pixels), Image b (1405 × 1985 pixels), and Image c (1252 × 976 pixels), as shown in Figure 3. To more comprehensively assess the impact of image size on the efficiency of Gaussian pyramid construction, each type of image was resized to 0.3, 0.5, 1.0 (the original size), 1.5, and 1.7 times its dimensions to generate datasets of various image sizes. This allows for a more comprehensive comparison between the performance of the traditional SIFT method for constructing Gaussian pyramids and the improved method under these varying image sizes, as shown in Table 2.
In the experiments with three different image types (a, b, c), based on Table 2 and Figure 4, it was observed that the time taken to construct the Gaussian pyramid using the improved SIFT method was generally lower than that of the traditional SIFT method. This indicates that the improved method is more efficient and stable regardless of image size or type. The specific analysis is as follows:
  • Impact of Scaling Ratio: For both methods, as the scaling ratio increased from 0.3 times to 1.7 times, the processing time also increased, reflecting the increased complexity in building the Gaussian pyramid due to more pixels in enlarged images. However, the growth in construction time for the Gaussian pyramid in large images was more gradual with the improved SIFT method, demonstrating its efficiency in processing large images.
  • Algorithm Adaptability and Stability: The improved SIFT method showed stable time growth across different image types and scaling ratios, indicating the adaptability and stability of the algorithm. In contrast, the traditional SIFT method had more significant time increases with specific image types and higher scaling ratios.
  • Suitability for Application Scenarios: Due to its high efficiency and stability, the improved SIFT method is particularly well-suited for scenarios requiring fast and efficient processing, such as real-time image processing or in resource-limited environments.

4.2. Experiment on Improved Keypoint Filtering Based on SIFT

The differences between the original sampled image and the registered image were quantitatively assessed using pixel-level comparisons, Structural Similarity Index (SSIM), Mean Squared Error (MSE), and histogram intersection indices. The total time taken from feature point detection to completion of registration was recorded for both the traditional SIFT method and the method combining traditional SIFT with Canny edge detection. These times were used to evaluate the efficiency of these methods in image registration, with the registration processes illustrated in Figure 5 and Figure 6.
Pixel-level Comparison: This involves calculating the absolute pixel differences between the rotated image and the post-registration image. Lower pixel difference values indicate better registration results.
$$F(I_1, I_2) = \left| I_1 - I_2 \right|,$$ (21)
Structural Similarity Index (SSIM): SSIM is a metric that measures the visual similarity between two images. The closer the SSIM value is to 1, the better the image registration effect. The formula for SSIM is as follows:
$$\mathrm{SSIM}(x, y) = \frac{\left( 2\mu_x \mu_y + c_1 \right)\left( 2\sigma_{xy} + c_2 \right)}{\left( \mu_x^2 + \mu_y^2 + c_1 \right)\left( \sigma_x^2 + \sigma_y^2 + c_2 \right)},$$ (22)

where $\mu_x$ and $\mu_y$ are the mean values of images $x$ and $y$, respectively; $\sigma_x^2$ and $\sigma_y^2$ are the variances of images $x$ and $y$, respectively; and $\sigma_{xy}$ is the covariance of $x$ and $y$. $c_1$ and $c_2$ are small constants added to avoid division by zero, typically taken as $c_1 = (k_1 L)^2$ and $c_2 = (k_2 L)^2$, where $L$ represents the dynamic range of the pixel values, $k_1 = 0.01$, and $k_2 = 0.03$.
Mean Squared Error (MSE): MSE is a commonly used method for measuring the difference between two images. The lower the MSE value, the better the registration effect. The formula for MSE is as follows:
$$\mathrm{MSE}(I_1, I_2) = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left[ I_1(i, j) - I_2(i, j) \right]^2,$$ (23)

where $M$ and $N$ are the dimensions of the image, and $I_1(i, j)$ and $I_2(i, j)$ are the pixel values of the two images at position $(i, j)$.
Histogram Analysis: Histogram analysis typically involves comparing the histograms of two images. The higher the similarity, the better the registration effect. Histogram intersection is a common method for this analysis, and its calculation formula is as follows:
$$d(H_1, H_2) = \sum_{i=1}^{K} \min\left( H_1(i), H_2(i) \right),$$ (24)

where $H_1$ and $H_2$ are the histograms of the two images, and $K$ is the number of bins in the histogram.
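All four metrics reduce to a few library calls. A minimal sketch (our illustration, assuming 8-bit grayscale inputs of equal size and using scikit-image's SSIM implementation rather than a from-scratch Equation (22)):

```python
import cv2
import numpy as np
from skimage.metrics import structural_similarity

def evaluate_registration(img1, img2, bins=256):
    """Pixel difference (21), SSIM (22), MSE (23), and histogram
    intersection (24) for two uint8 grayscale images of equal size."""
    diff = np.abs(img1.astype(np.float64) - img2.astype(np.float64))
    pixel_difference = diff.sum()                  # Equation (21), summed over pixels
    ssim = structural_similarity(img1, img2)       # library SSIM, Equation (22)
    mse = np.mean(diff ** 2)                       # Equation (23)
    h1 = cv2.calcHist([img1], [0], None, [bins], [0, 256]).ravel()
    h2 = cv2.calcHist([img2], [0], None, [bins], [0, 256]).ravel()
    intersection = np.minimum(h1, h2).sum()        # Equation (24)
    return pixel_difference, ssim, mse, intersection
```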
The experiment in Section 4.1 demonstrates that the improved SIFT Gaussian pyramid construction method offers a significant speed advantage over the traditional SIFT algorithm. To further illustrate the enhancements of the improved feature point selection based on SIFT compared to the traditional SIFT algorithm, this experiment involved applying translation and rotation transformations to image a, as shown in Figure 7.
The analysis of Table 3, along with Figure 7, Figure 8 and Figure 9, reveals that the feature point filtering strategy employed in this study, which combines the SIFT algorithm with Canny edge detection for processing image a (924 × 785 pixels), focuses on selecting the most representative feature points while eliminating those that have a minimal impact on image registration. This selection mechanism significantly reduces the total number of feature points, thereby decreasing the data processing requirements for feature point matching and image registration steps. Consequently, this approach effectively shortens the processing time from feature point detection to image registration and enhances overall efficiency.

4.3. Discussion

It is noteworthy that even with a reduced number of feature points, the method presented in this study still demonstrates high standards on key performance indicators such as Structural Similarity Index (SSIM), pixel differences, Mean Squared Error (MSE), and histogram intersection indices, compared to traditional methods. Particularly in experiments involving translation distances, pixel differences were significantly reduced, while performance indices remained at optimal levels, highlighting the improvements in image registration accuracy achieved by this method. This finding indicates that despite reducing the volume of data processed, the quality and accuracy of image registration have not been compromised by using selectively chosen feature points.
The feature point filtering strategy, combining the SIFT algorithm with Canny edge detection, was applied here to an image of 924 × 785 pixels. By retaining only the most representative feature points, it lowers the data volume that the matching and registration steps must process, which is the source of the time savings reported above.
Overall, this study successfully demonstrates how to optimize the image registration process by precisely reducing the number of feature points without sacrificing quality. This approach is crucial for handling large-scale image data, especially in applications such as real-time video analysis or high-speed image processing systems, where it shows immense potential for application.

5. Conclusions

This paper successfully implements improvements to the SIFT algorithm, specifically through the optimization of Gaussian pyramid construction and the effective integration of Canny edge detection in feature point filtering. These enhancements not only speed up the construction of the Gaussian pyramid and increase the overall computational efficiency of the algorithm but also effectively reduce the number of feature points, thereby decreasing computational complexity and enhancing registration accuracy. Experimental results demonstrate that compared to the traditional SIFT algorithm, our method exhibits higher efficiency and better performance in the construction phase of the Gaussian pyramid when processing image registration tasks of varying sizes and types. Moreover, in the feature point detection phase, the combination of Canny edge detection and dilation operations effectively improves the accuracy and reliability of the feature points. This not only allows the SIFT algorithm to more precisely locate feature points within images but also reduces unnecessary redundant calculations, thereby increasing efficiency.
Overall, the results of this research not only enrich the theoretical applications of the SIFT algorithm but also hold significant practical value, particularly in fields requiring fast and accurate image registration, such as real-time video analysis and high-speed image processing systems. We plan to further optimize the performance of the algorithm by exploring more advanced techniques for Gaussian pyramid construction and feature point filtering. This includes the integration of deep learning techniques to enhance the robustness of feature detection and the use of hardware acceleration to speed up the image registration process.
These future directions will continue to advance the field of image registration and make the SIFT algorithm more applicable to a wider range of real-world scenarios.

Author Contributions

Conceptualization, D.Z. and Y.L.; software, Y.L.; data curation, D.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 52006201, and the Henan Provincial Science and Technology Research Project, grant number 242102230034.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant No. 52006201 and the Henan Provincial Science and Technology Research Project under Grant No. 242102230034.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Yang, H.; Li, X.R.; Zhao, L.Y.; Chen, S.H. A Novel Coarse-to-Fine Scheme for Remote Sensing Image Registration Based on Sift and Phase Correlation. Remote Sens. 2019, 11, 1833. [Google Scholar] [CrossRef]
  2. Li, J.W.; Wu, X.Y.; Liao, P.H.; Song, H.H.; Yang, X.M.; Zhang, Z.R. Robust Registration for Infrared and Visible Images Based on Salient Gradient Mutual Information and Local Search. Infrared Phys. Technol. 2023, 131, 104711. [Google Scholar] [CrossRef]
  3. Liang, H.D.; Liu, C.L.; Li, X.G.; Wang, L.N. A Binary Fast Image Registration Method Based on Fusion Information. Electronics 2023, 12, 4475. [Google Scholar] [CrossRef]
  4. Lu, J.Y.; Jia, H.G.; Li, G.; Li, Z.Q.; Ma, J.Y.; Zhu, R.F. An Instance Segmentation Based Framework for Large-Sized High-Resolution Remote Sensing Images Registration. Remote Sens. 2021, 13, 1657. [Google Scholar] [CrossRef]
  5. Liu, J.L.; Bu, F.L. Improved RANSAC Features Image-Matching Method Based on SURF. J. Eng.-JoE 2019, 23, 9118–9122. [Google Scholar] [CrossRef]
  6. Lazar, E.; Bennett, K.S.; Hurtado Carreon, A.; Veldhuis, S.C. An Automated Feature-Based Image Registration Strategy for Tool Condition Monitoring in CNC Machine Applications. Sensors 2024, 24, 7458. [Google Scholar] [CrossRef] [PubMed]
  7. Li, B. Application of Machine Vision Technology in Geometric Dimension Measurement of Small Parts. EURASIP J. Image Video Process. 2018, 127, 1–8. [Google Scholar] [CrossRef]
  8. Xin, Z.H.; Wang, H.Y.; Qi, P.Y.; Du, W.D.; Zhang, J.; Chen, F.H. Printed Surface Defect Detection Model Based on Positive Samples. Comput. Mater. Contin. 2022, 72, 5925–5938. [Google Scholar] [CrossRef]
  9. Ben-Zikri, Y.K.; Helguera, M.; Fetzer, D.; Shrier, D.A.; Aylward, S.R.; Chittajallu, D.; Niethammer, M.; Cahill, N.D.; Linte, C.A. A Feature-based Affine Registration Method for Capturing Background Lung Tissue Deformation for Ground Glass Nodule Tracking. Computer methods in biomechanics and biomedical engineering. Imaging Vis. 2022, 10, 521–539. [Google Scholar] [CrossRef]
  10. Kuppala, K.; Banda, S.; Barige, T.R. An Overview of Deep Learning Methods for Image Registration with Focus on Feature-Based Approaches. Int. J. Image Data Fusion 2020, 11, 113–135. [Google Scholar] [CrossRef]
  11. Razaei, M.; Rezaeian, M.; Derhami, V.; Khorshidi, H. Local Feature Descriptor using Discrete First and Second Fun-Damental Forms. J. Electron. Imaging 2021, 30, 023008. [Google Scholar] [CrossRef]
  12. Chang, H.-H.; Chan, W.-C. Automatic Registration of Remote Sensing Images Based on Revised Sift with Trilateral Computation and Homogeneity Enforcement. IEEE Trans. Geosci. Remote Sens. 2021, 59, 7635–7650. [Google Scholar] [CrossRef]
  13. Harris, C.G.; Stephens, M. A Combined Corner and Edge Detector. In Proceedings of the Alvey Vision Conference, Manchester, UK, 1 January 1988; pp. 147–151. [Google Scholar] [CrossRef]
  14. Lowe, D.G. Distinctive Image Features from Scale-Invariant Keypoints. Int. J. Comput. Vis. 2004, 60, 91–110. [Google Scholar] [CrossRef]
  15. Zheng, S.Y.; Zhang, Z.X.; Zhang, J.Q. Image Relaxation Matching Based on Feature Points for DSM Generation. Geo-Spat 2004, 7, 243–248. [Google Scholar] [CrossRef]
  16. Ke, Y.; Sukthankar, R. PCA-SIFT: A More Distinctive Representation for Local Image Descriptors. In Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition CVPR 2004, Washington, DC, USA, 27 June–2 July 2004. [Google Scholar] [CrossRef]
  17. Alhwarin, F.; Ristić-Durrant, D.; Gräser, A. VF-SIFT: Very Fast SIFT Feature Matching. In Pattern Recognition (DAGM 2010); Lecture Notes in Computer Science, Volume 6376; Goesele, M., Roth, S., Kuijper, A., Schiele, B., Schindler, K., Eds.; Springer: Berlin/Heidelberg, Germany, 2010. [Google Scholar] [CrossRef]
  18. Sedaghat, A.; Mokhtarzade, M.; Ebadi, H. Uniform Robust Scale-Invariant Feature Matching for Optical Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2011, 49, 4516–4527. [Google Scholar] [CrossRef]
  19. Yu, Z.W.; Zhang, N.; Pan, Y.; Zhang, Y.; Wang, Y.X. Heterogeneous Image Matching Based on Improved SIFT Algorithm. Laser Optoelectron. Prog. 2022, 59, 1211002. [Google Scholar] [CrossRef]
  20. Zhu, F.; Zheng, S.; Wang, X.; He, Y.; Gui, L.; Gong, L. Real-Time Efficient Relocation Algorithm Based on Depth Map for Small-Range Textureless 3D Scanning. Sensors 2019, 19, 3855. [Google Scholar] [CrossRef] [PubMed]
  21. Yang, J.; Xiao, Y.; Cao, Z. Toward the Repeatability and Robustness of the Local Reference Frame for 3D Shape Matching: An Evaluation. IEEE Trans. Image Process 2018, 27, 3766–3781. [Google Scholar] [CrossRef]
  22. Bonnaffe, W.; Coulson, T. Fast fitting of neural ordinary differential equations by Bayesian neural gradient matching to infer ecological interactions from time-series data. Methods Ecol. Evol. 2023, 14, 1543–1563. [Google Scholar] [CrossRef]
  23. Yang, J.; Zhang, Q.; Xiao, Y.; Cao, Z. TOLDI: An effective and robust approach for 3D local shape description. Pattern Recognit. 2017, 65, 175–187. [Google Scholar] [CrossRef]
  24. Petrelli, A.; Di Stefano, L. Pairwise registration by local orientation cues. Comput. Graph. Forum 2016, 35, 59–72. [Google Scholar] [CrossRef]
  25. Liu, Y.Y.; He, M.; Wang, Y.Y.; Sun, Y.; Gao, X.B. Fast Stitching for The Farmland Aerial Panoramic Images Based on Optimized SIFT Algorithm. Trans. CSAE 2023, 39, 117–125. [Google Scholar] [CrossRef]
  26. Paul, S.; D., U.; Naidu, Y.; Reddy, Y. An Efficient SIFT-based Matching Algorithm for Optical Remote Sensing Images. Remote Sens. Lett. 2022, 13, 1069–1079. [Google Scholar] [CrossRef]
  27. Gao, S.P.; Xia, M.; Zhuang, S.J. Automatic Mosaic Method of Remote Sensing Images Based on Machine Vision. Comput. Opt. 2024, 48, 705–713. [Google Scholar] [CrossRef]
  28. Liu, Y.Y.; He, M.; Wang, Y.Y.; Sun, Y.; Gao, X.B. Farmland Aerial Images Fast-Stitching Method and Application Based on Improved SIFT Algorithm. IEEE Access 2022, 10, 95411–95424. [Google Scholar] [CrossRef]
  29. Tang, L.; Ma, S.H.; Ma, X.C.; You, H.R. Research on Image Matching of Improved Sift Algorithm Based on Stability Factor and Feature Descriptor Simplification. Appl. Sci. 2022, 12, 8448. [Google Scholar] [CrossRef]
  30. Sundani, D.; Widiyanto, S.; Karyanti, Y.; Wardani, D.T. Identification of Image Edge Using Quantum Canny Edge Detection Algorithm. J. ICT Res. Appl. 2019, 13, 133–144. [Google Scholar] [CrossRef]
  31. Niedfeldt, P.C.; Beard, R.W. Convergence and Complexity Analysis of Recursive-RANSAC: A New Multiple Target Tracking Algorithm. IEEE Trans. Autom. Control 2016, 61, 456–461. [Google Scholar] [CrossRef]
Figure 1. Improved SIFT Algorithm.
Figure 2. Integration of Traditional SIFT Algorithm with Canny Edge Detection.
Figure 3. Images of three different types and sizes. (a) Image a (924 × 785 pixels); (b) Image b (1405 × 1985 pixels); (c) Image c (1252 × 976 pixels).
Figure 4. Performance of the improved SIFT Gaussian pyramid construction method and traditional SIFT Gaussian pyramid construction with images of different sizes.
Figure 5. Image registration using the traditional SIFT algorithm.
Figure 6. Image registration using the SIFT algorithm combined with Canny edge detection.
Figure 7. Comparative illustration of image keypoints before and after filtering: (a) Number of keypoints before filtering in image type a (3165). (b) Number of keypoints after filtering in image type a (2480). (c) Number of keypoints before filtering in the 15-degree rotated image (3685). (d) Number of keypoints after filtering in the 15-degree rotated image (2994). (e) Number of keypoints before filtering in the (15, 15) translated image (3124). (f) Number of keypoints after filtering in the (15, 15) translated image (2429).
Figure 8. Comparison of rotation parameters in images: (a) Comparison of registration time consumption between the method in this paper and traditional methods. (b) Comparison of difference values between this paper’s method and traditional methods. (c) Comparison of SSIM between this paper’s method and traditional methods. (d) Comparison of histogram intersection index between this paper’s method and traditional methods. (e) Comparison of MSE between this paper’s method and traditional methods.
Figure 9. Translation image parameter comparison: (a) Registration time consumption comparison between this paper’s method and traditional methods. (b) Difference value comparison between this paper’s method and traditional methods. (c) SSIM comparison between this paper’s method and traditional methods. (d) Histogram intersection index comparison between this paper’s method and traditional methods. (e) MSE comparison between this paper’s method and traditional methods.
Table 1. The differences between the traditional SIFT method and the improved SIFT method [20,21,22,23,24].

| Aspect | The Traditional SIFT Method | The Improved SIFT Method |
| --- | --- | --- |
| Feature point generation | Generates a large number of feature points; robust but computationally expensive | Reduces the number of feature points using Canny edge detection and dilation, maintaining robustness |
| Gaussian pyramid construction | Computationally intensive, involving multiple scales and octaves | Simplified using Gaussian separation techniques and accelerated with integral image technology |
| Real-time performance | Struggles to achieve real-time performance, especially for large-scale images | Significantly improved real-time performance, suitable for large-scale images |
| Time efficiency | High computational demand, long processing times | Reduced time for constructing the Gaussian pyramid and detecting feature points |
| Accuracy | High accuracy in feature point detection and image registration | Maintains high accuracy while reducing computational load |
| Comparative experiments | Consistently slower, especially for large images | Consistently faster, regardless of image size or type |
Table 2. Performance of the Improved SIFT Gaussian Pyramid Construction Method and the Traditional SIFT Gaussian Pyramid Construction Method with Images of Different Sizes.

| Image Type | Resolution (pixels) | Improved: Run 1 (s) | Run 2 (s) | Run 3 (s) | Average (s) | Traditional: Run 1 (s) | Run 2 (s) | Run 3 (s) | Average (s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| a | 277 × 235 | 0.0010 | 0.0010 | 0.0010 | 0.0010 | 0.0121 | 0.0127 | 0.0127 | 0.0125 |
| a | 462 × 392 | 0.0020 | 0.0020 | 0.0020 | 0.0020 | 0.0307 | 0.0301 | 0.0314 | 0.0307 |
| a | 924 × 785 | 0.0090 | 0.0099 | 0.0110 | 0.0100 | 0.1108 | 0.1099 | 0.1099 | 0.1102 |
| a | 1386 × 1177 | 0.0239 | 0.0239 | 0.0239 | 0.0239 | 0.2481 | 0.2559 | 0.2471 | 0.2504 |
| a | 1570 × 1334 | 0.0299 | 0.0309 | 0.0299 | 0.0302 | 0.3196 | 0.3186 | 0.3287 | 0.3223 |
| a | Average for image type a | | | | 0.0134 | | | | 0.1452 |
| b | 421 × 595 | 0.0021 | 0.0023 | 0.0020 | 0.0021 | 0.0396 | 0.0379 | 0.0399 | 0.0391 |
| b | 702 × 992 | 0.0174 | 0.0160 | 0.0177 | 0.0170 | 0.1042 | 0.1029 | 0.1016 | 0.1029 |
| b | 1405 × 1985 | 0.0383 | 0.0374 | 0.0313 | 0.0357 | 0.4054 | 0.4116 | 0.4214 | 0.4128 |
| b | 2107 × 2977 | 0.0937 | 0.0937 | 0.0816 | 0.0897 | 0.9129 | 0.9247 | 0.9347 | 0.9241 |
| b | 2388 × 3374 | 0.1094 | 0.1038 | 0.1094 | 0.1075 | 1.1615 | 1.1656 | 1.1499 | 1.1593 |
| b | Average for image type b | | | | 0.0504 | | | | 0.5276 |
| c | 375 × 292 | 0.0010 | 0.0011 | 0.0010 | 0.0010 | 0.0214 | 0.0213 | 0.0224 | 0.0217 |
| c | 626 × 488 | 0.0022 | 0.0022 | 0.0031 | 0.0025 | 0.0537 | 0.0534 | 0.0533 | 0.0535 |
| c | 1252 × 976 | 0.0284 | 0.0288 | 0.0293 | 0.0288 | 0.2070 | 0.2075 | 0.2091 | 0.2079 |
| c | 1878 × 1464 | 0.0312 | 0.0306 | 0.0327 | 0.0315 | 0.4668 | 0.4778 | 0.4733 | 0.4726 |
| c | 2128 × 1659 | 0.0402 | 0.0433 | 0.0424 | 0.0420 | 0.6038 | 0.6032 | 0.6013 | 0.6028 |
| c | Average for image type c | | | | 0.0212 | | | | 0.2717 |
Table 3. Data comparison between improved keypoint selection based on SIFT and traditional methods in rotation and translation.

| Metric | Method | Rotation 15° | 25° | 35° | 45° | Translation (mm) (15, 15) | (35, 35) | (−15, −15) | (−35, −35) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Number of keypoints after filtering (image a) | This paper | 2994 | 2900 | 2957 | 3071 | 2429 | 2358 | 2382 | 2317 |
| Number of keypoints before filtering (image a) | — | 3685 | 3604 | 3638 | 3689 | 3124 | 3031 | 3087 | 2974 |
| Time from feature point detection to image registration (s) | This paper | 0.26 | 0.26 | 0.26 | 0.26 | 0.24 | 0.23 | 0.23 | 0.24 |
| | Traditional | 0.32 | 0.33 | 0.34 | 0.34 | 0.31 | 0.30 | 0.30 | 0.30 |
| Pixel difference | This paper | 3,953,015 | 4,800,217 | 5,569,518 | 6,336,485 | 0 | 0 | 0 | 0 |
| | Traditional | 4,000,647 | 4,762,046 | 5,671,722 | 6,241,591 | 10,853 | 23,716 | 0 | 0 |
| SSIM | This paper | 0.99 | 0.99 | 0.98 | 0.97 | 1.00 | 1.00 | 1.00 | 1.00 |
| | Traditional | 0.99 | 0.99 | 0.98 | 0.97 | 1.00 | 1.00 | 1.00 | 1.00 |
| MSE | This paper | 30.62 | 34.11 | 37.07 | 39.50 | 0 | 0 | 0 | 0 |
| | Traditional | 30.62 | 34.13 | 37.54 | 39.24 | 0.11 | 0.16 | 0 | 0 |
| Histogram intersection index | This paper | 0.93 | 0.93 | 0.93 | 0.93 | 1.00 | 1.00 | 1.00 | 1.00 |
| | Traditional | 0.93 | 0.93 | 0.93 | 0.93 | 1.00 | 1.00 | 1.00 | 1.00 |