Article

Cognitive IoT Vision System Using Weighted Guided Harris Corner Feature Detector for Visually Impaired People

by Manoranjitham Rajendran, Punitha Stephan, Thompson Stephan, Saurabh Agarwal and Hyunsung Kim
1 Department of Computer Science and Engineering, Karunya Institute of Technology and Sciences, Coimbatore 641114, India
2 Department of Computer Science and Engineering, Faculty of Engineering and Technology, M.S. Ramaiah University of Applied Sciences, Bengaluru 560054, India
3 Department of Computer Science and Engineering, Amity School of Engineering & Technology, Amity University Uttar Pradesh, Noida 201313, India
4 School of Computer Science, Kyungil University, Gyeongsan 38424, Korea
* Authors to whom correspondence should be addressed.
Sustainability 2022, 14(15), 9063; https://doi.org/10.3390/su14159063
Submission received: 16 June 2022 / Revised: 15 July 2022 / Accepted: 21 July 2022 / Published: 24 July 2022

Abstract: India has an estimated 12 million visually impaired people, the largest number of any country. Smart walking stick devices use various technologies, including machine vision and different sensors, to improve the safe movement of visually impaired persons. In machine vision, accurately recognizing nearby objects is still a challenging task. This paper presents a system that enables safe navigation and guidance for visually impaired people by implementing an object recognition module in a smart walking stick; the module uses a local feature extraction method to recognize objects under different image transformations. To provide stability and robustness, the Weighted Guided Harris Corner Feature Detector (WGHCFD) method is proposed to extract feature points from the image. WGHCFD discriminates image features competently and is suitable for different real-world conditions. The WGHCFD method is evaluated on the most popular Oxford benchmark datasets, and it achieves greater repeatability and matching scores than existing feature detectors. In addition, the proposed WGHCFD method is tested with a smart stick and achieves a 99.8% recognition rate under different transformation conditions for the safe navigation of visually impaired people.

1. Introduction

In India, visual impairment is a major problem: according to World Health Organization (WHO) statistics, the country has 12 million visually impaired people. Secure and safe navigation is one of the most challenging activities for vision-impaired people in real-life environments. Assistive devices therefore help them complete their daily tasks, including uninterrupted navigation. However, ensuring secure and safe navigation for visually impaired people is a difficult task that demands precision and effectiveness; one of the major problems faced by visually impaired people is accurately recognizing nearby objects [1].
With the advancement of technologies such as the Internet of Things (IoT), machine vision, mobile devices, and Artificial Intelligence (AI), several assistive devices have been designed to support the navigation of visually impaired people. A few researchers have started to utilize a camera to identify and recognize obstacles in real time and provide this information for efficient navigation and safe guidance. Such camera-based methods give useful information about the surroundings; however, they fail to recognize an object when it undergoes transformations such as scale, illumination, rotation, JPEG compression, and viewpoint changes.
The object recognition task aims to identify an object from an image or video, and a common technique is to extract local features from images and match them against features from other images. Recently, smart vision systems have been implemented to help visually impaired people using technologies such as image processing, IoT, and machine learning. Many systems use computer vision to detect indoor and outdoor objects accurately.
One system identifies an object from an image using an advanced TensorFlow model trained on the Microsoft Common Objects in Context (MSCOCO) dataset [1]. Another system identifies several types of objects by utilizing the MobileNet model to extract the features of the input image [2]. RetinaNet is a deep network model used to extract features from images to identify and recognize an object [3]. These methods achieve good recognition, but their large training time and memory requirements can delay the recognition process. In recent times, deep learning feature extraction methods have been developed to train on images and provide feature points that are robust to distortion. Even though they learn a certain degree of scale invariance, they fail in cases with significant changes in scale, illumination, etc.
Finding the best features is still an open problem in computer vision. Primitive feature extraction plays an elementary role in the efficiency of high-level descriptors in machine vision tasks. The difficult part is building distinctive descriptors from features that are invariant to scale, rotation, and viewpoint. The reliability of feature-based object recognition mainly depends on the robustness of the selected feature detector and descriptor. The primary goal of a local feature detector is to describe an object effectively through feature matching between two images. To achieve this goal, a good feature extractor should produce repeatable, precise, and distinctive feature points that describe the same object in an image and can be localized reliably under varying conditions such as scale, illumination, rotation, noise, and compression.
In this paper, a robust and stable Weighted Guided Harris Corner Feature Detector (WGHCFD) is proposed to recognize an object from the captured image, and the effectiveness of the proposed work is compared with the existing feature detectors and descriptors. In addition, the proposed WGHCFD is tested with a smart walking stick for object recognition tasks, and it provides safe navigation for visually impaired people.
The organization of the paper is as follows: Section 2 reviews the literature on local feature extraction for the object recognition task. Section 3 describes the proposed vision system for visually impaired people and details the object recognition module implemented with the proposed WGHCFD detector. Section 4 presents the experimental results on the standard datasets and in the object recognition application. Section 5 concludes this work and outlines future directions.

2. Related Work

In this section, we discuss the various local feature extraction methods for object recognition tasks in computer vision.
Accurately recognizing an object in an image is still a challenging task, especially when it has complicated structures like changes in shape, location, size, and orientation. The local feature extraction method successfully handles changes in scale, rotation, and the illumination of objects in an image. Great strides have been made in the field of local invariant feature detection during the last few years, but there are still many avenues that need to be pursued and remain challenging for vision researchers.
SIFT is a well-known real-time local feature detector and descriptor that efficiently extracts image features under different image transformations. The traditional SIFT feature extraction method has four stages to detect highly distinctive features from an image that are suitable for machine vision tasks like identification, detection, and the tracking of objects. SIFT descriptors are invariant to pose, cluttered background, scale, illumination, partial occlusion, and viewpoint changes [4]. Hence, SIFT features are commonly used in many real-time applications such as image stitching [5], object recognition [6], object tracking [7], object detection [8], object classification [9], and forgery detection systems [10].
SIFT has many variants, including Geometric Scale-Invariant Terrain Feature Transform (GSIFT) [11], Principal Component Analysis—SIFT (PCA-SIFT) [12], Colored SIFT (CSIFT) [13] and Affine invariant SIFT (ASIFT) [14]. These variants improve robustness, but their computational cost limits real-time system performance. Common feature point detectors are the Harris– and Hessian–Laplace detectors [15], FAST [16], STAR [17], Speeded-Up Robust Features (SURF) [18], KAZE [19], Binary Robust Invariant Scalable Keypoints (BRISK) [20], and Oriented FAST and Rotated BRIEF (ORB) [21]. Table 1 compares these existing feature detectors with respect to their invariance to scale and rotation changes.
The performance of the SIFT algorithm has been improved by using a bilateral filter in place of the Gaussian filter. Bilateral SIFT (Bi-SIFT) preserves edge characteristics better than the Gaussian filter in the SIFT algorithm; it improves feature detection, is robust to various image transformations, and provides higher repeatability and a greater number of correspondences for all imaging conditions [22]. A trilateral filter in the SIFT algorithm (Tri-SIFT) performs well for all geometric transformations, yet it fails to provide higher repeatability and matching scores because of halo artifact problems [23]. The Fast Feature Detector (FFD) [24] formulates the superimposition problem by setting the blurring ratio to 2 and the smoothness to 0.627 in the DoG scale-space pyramid of the SIFT algorithm; FFD takes only 5% of the computational time of SIFT while being more accurate and reliable.
From Table 1, the Harris–Laplace, Hessian–Laplace, SIFT (DoG), SURF (Fast Hessian), ORB, BRISK, STAR, KAZE, Bi-SIFT, and Tri-SIFT detectors are invariant to both scale and rotation. Hence, these detectors are considered for comparison with the detector proposed in this paper. Detectors such as Harris, Shi–Tomasi, SUSAN, FAST, and FFD do not provide invariance to both scale and rotation changes. Table 2 compares the most popular feature detectors and descriptors in terms of scale space, orientation, descriptor generation, feature size, and robustness.
Common feature descriptors like SIFT, GLOH [15], SURF, BRIEF [25], ORB, BRISK, FREAK [26], DaLI [27], Bi-SIFT, DERF [28], HiSTDO [29], Tri-SIFT, RAGIH [30], ROEWA [31], DOG–ADTCP [32], and LPSO [33] are studied in this research. GLOH and LPSO are computationally more expensive than the SIFT algorithm. From the comparison in Table 2, ROEWA performs poorly for scale change images; RAGIH is robust only to rotation and scale changes; HiSTDO only to background clutter, illumination, and noise; DERF only to scale and rotation changes; DaLI only to image deformations and illumination; and DOG–ADTCP and LPSO only to rotation and illumination changes when compared across all imaging conditions; hence, these descriptors are not considered for further analysis in this research. Binary Robust Independent Elementary Features (BRIEF) requires less computation than SIFT with the same matching performance but performs worse under rotation. The Fast Retina Keypoint (FREAK) descriptor is faster to compute with a lower memory load and is mostly used in embedded applications.
In recent times, deep learning feature extraction methods have been developed to train the images to provide feature points that are robust to distortion. Even though they have a specific degree of scale invariance by learning, they fail in cases with a significant change in scale [34,35].
The feature point detection and description methods discussed above fail to give the best results under all imaging transformation conditions. This research work aims to extract a feature detector that is robust to different imaging conditions, improve localization accuracy, and match features accurately to recognize an object in an image. Hence, the feature detectors SIFT, Bi-SIFT, SURF, STAR, BRISK, ORB, KAZE, Tri-SIFT, Harris–Laplace, and Hessian–Laplace, and the descriptors SIFT, SURF, ORB, BRISK, BRIEF, FREAK, Bi-SIFT, and Tri-SIFT are used to prove the effectiveness and robustness of the proposed WGHCFD under various image transformations.

3. Proposed Vision System for Visually Impaired People

This section presents the proposed vision system for visually impaired people using an object recognition module that utilizes the proposed WGHCFD detector to recognize an object.
The proposed system is an IoT-based machine vision system built around the proposed WGHCFD feature detector. Figure 1 shows the block diagram of the proposed vision system; its major modules are a power supply, camera, ultrasonic sensor (HC-SR04), Raspberry Pi 3, object recognition module, and headphones. The ultrasonic sensor detects an obstacle and measures its distance. The Raspberry Pi 3 acts as a small computer and has USB ports for connecting devices. The Pi camera has a fixed lens to capture high-definition images, which makes object identification easier. The captured images are processed by the object recognition module to recognize an object, and the result is communicated via headphones to improve the safe navigation of visually impaired people.
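As a concrete illustration of this sense-capture-speak loop, the sketch below shows how the ultrasonic trigger, image capture, and audio feedback could be wired together on a Raspberry Pi 3. The GPIO pin assignment, the 1 m alert distance, and the recognise_object() placeholder are illustrative assumptions, not values taken from the paper.

```python
# Hypothetical control loop for the smart-stick prototype described above.
import time
import RPi.GPIO as GPIO          # ultrasonic sensor (HC-SR04) access
from picamera import PiCamera    # Pi camera module v2
import pyttsx3                   # offline text-to-speech for the headphones

TRIG, ECHO = 23, 24              # assumed BCM pin assignment

GPIO.setmode(GPIO.BCM)
GPIO.setup(TRIG, GPIO.OUT)
GPIO.setup(ECHO, GPIO.IN)

camera = PiCamera(resolution=(720, 680))   # capture size used by the module
speech = pyttsx3.init()

def measure_distance_cm():
    """Time the HC-SR04 echo pulse and convert it to centimetres."""
    GPIO.output(TRIG, True)
    time.sleep(10e-6)            # 10 microsecond trigger pulse
    GPIO.output(TRIG, False)
    start = end = time.time()
    while GPIO.input(ECHO) == 0:
        start = time.time()
    while GPIO.input(ECHO) == 1:
        end = time.time()
    return (end - start) * 34300 / 2.0     # speed of sound, round trip

def recognise_object(image_path):
    """Placeholder for the WGHCFD-based recognition module (Section 3.1)."""
    return "obstacle"

try:
    while True:
        if measure_distance_cm() < 100:            # obstacle within ~1 m
            camera.capture('/tmp/frame.jpg')
            label = recognise_object('/tmp/frame.jpg')
            speech.say(f"{label} ahead")
            speech.runAndWait()
        time.sleep(0.2)
finally:
    GPIO.cleanup()
```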

3.1. Object Recognition Module

The object recognition module in the proposed system recognizes several types of objects such as chairs, people, and tables. The module uses the Pi camera module v2, which is attached to the front side of the prototype to capture the image from the environment. The captured image has a fixed resolution of 720 × 680 pixels and is sent to the feature extractor for recognizing an object. In the proposed vision system, WGHCFD is utilized to extract the features of the captured images, and the overall feature extraction process using WGHCFD is discussed in this section.
The working process of the object recognition module is shown in Figure 2. The captured image from the smart walking stick is given to the WGHCFD detector to extract features, which are matched using a similarity distance measurement to recognize an object even under various image transformations.
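A minimal sketch of this matching step is given below. It uses OpenCV's SIFT detector as a stand-in, since WGHCFD is not available as a library routine; the 0.75 ratio-test threshold and the minimum match count are assumptions, not parameters from the paper.

```python
# Feature-matching recognition sketch for the pipeline in Figure 2.
import cv2

sift = cv2.SIFT_create()                      # stand-in for WGHCFD
matcher = cv2.BFMatcher(cv2.NORM_L2)

def recognise(query_img, reference_db, min_matches=20):
    """Return the label of the reference image with the most good matches.

    reference_db is assumed to map label -> precomputed float32 descriptors.
    """
    _, q_desc = sift.detectAndCompute(query_img, None)
    best_label, best_count = None, 0
    for label, ref_desc in reference_db.items():
        knn = matcher.knnMatch(q_desc, ref_desc, k=2)
        # Lowe-style ratio test keeps only distinctive matches
        good = [m[0] for m in knn
                if len(m) == 2 and m[0].distance < 0.75 * m[1].distance]
        if len(good) > best_count:
            best_label, best_count = label, len(good)
    return best_label if best_count >= min_matches else None
```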

Feature Extraction Using Proposed WGHCFD Detector

This section briefly explains the four stages of the proposed WGHCFD feature extraction method for the invariant object recognition task. The four stages are shown in Figure 3, and each is explained in detail below:
  • Stage 1: The Weighted Guided scale space construction (WGss)
The input image I(p, q) is first smoothed using the Weighted Guided Filter (WGF) [36,37], which reduces halo artifacts and preserves sharp gradient information. A guide image G and the variance of G, σ_G², in the 3 × 3 window Ω are assigned. The edge-aware weight of G is defined in Equation (1):
W_G = \frac{1}{N} \sum_{i=1}^{N} \frac{\sigma_G^2(p) + \varepsilon}{\sigma_G^2(i) + \varepsilon}    (1)
where ε is a constant and N is the number of pixels. W_G is larger than 1 at an edge point and smaller than 1 in a smooth area, so more weight is given to sharp edges than to pixels in smooth areas. The weight is incorporated into the weighted guided image filter cost function in Equation (2):
E = \sum_{i \in \Omega} \left[ \left( a_i' G(i) + b_i' - I(i) \right)^2 + \frac{\lambda}{W_G} a_i'^2 \right]    (2)
where λ is a regularization parameter. The values of a_i' and b_i' are computed by Equation (3):
a_i' = \frac{\mu(G \odot I) - \mu(G)\,\mu(I)}{\sigma_G^2 + \lambda / W_G}, \qquad b_i' = \mu(I) - a_i'\,\mu(G)    (3)
The symbol ⊙ denotes the element-by-element product of G and I; μ(G), μ(I) and μ(G⊙I) represent the mean values of G, I, and G⊙I, respectively.
The scale-invariant features are determined by searching across consecutive scales. The input image I(p, q) is convolved with the weighted guided filter W_G(p, q, σ) to obtain a smoothed image L(p, q, σ) using Equation (4):
L(p, q, \sigma) = W_G(p, q, \sigma) * I(p, q)    (4)
where σ denotes the standard deviation of the weighted guided filter. Stable feature point locations are determined using scale-space extrema for various scales. The Laplacian of Weighted (LoW) guided filter detects feature point locations in scale space using Equation (5):
LoW(p, q, \sigma) = L(p, q) * W_G(p, q, \sigma)    (5)
where L(p, q) is a fixed Laplacian filter. The local extrema points are obtained using LoW.
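A minimal NumPy/SciPy sketch of this stage is given below, following Equations (1)-(5). The 3 × 3 box window, the ε and λ values, the use of the input image as its own guide, and the σ-to-radius mapping are all assumptions for illustration rather than parameters reported for WGHCFD.

```python
# Sketch of Stage 1: weighted guided smoothing and Laplacian-of-Weighted response.
import numpy as np
from scipy.ndimage import uniform_filter, laplace

def weighted_guided_filter(I, G, radius=1, eps=1e-3, lam=1e-2):
    """Smooth image I guided by G using the edge-aware weight of Eq. (1)."""
    I = np.asarray(I, dtype=float)
    G = np.asarray(G, dtype=float)
    size = 2 * radius + 1                          # 3 x 3 window for radius 1
    mean = lambda x: uniform_filter(x, size=size)  # box mean over the window

    mu_G, mu_I = mean(G), mean(I)
    var_G = mean(G * G) - mu_G ** 2                # sigma_G^2 per pixel

    # Equation (1): weight > 1 on edges, < 1 in smooth regions
    W_G = (var_G + eps) * np.mean(1.0 / (var_G + eps))

    # Equation (3): coefficients of the local linear model a'*G + b'
    a = (mean(G * I) - mu_G * mu_I) / (var_G + lam / W_G)
    b = mu_I - a * mu_G

    # Filtered output L(p, q): averaged coefficients applied to the guide
    return mean(a) * G + mean(b)

def low_responses(I, sigmas=(1.0, 1.6, 2.4)):
    """Laplacian of the weighted-guided smoothed image over a few scales,
    in the spirit of Eq. (5); the sigma-to-radius mapping is an assumption."""
    return [laplace(weighted_guided_filter(I, I, radius=max(1, round(s))))
            for s in sigmas]
```

Local extrema of these per-scale responses would then be taken as candidate feature points, as described above.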
  • Stage 2: Feature point Localization
Once the extrema points are detected, low-contrast points are eliminated to obtain stable feature points. In the original SIFT, a fixed low-contrast threshold of 0.03 eliminates such points [4]. Low-contrast points are generated by noise and smoothing filters; here, they are eliminated by calculating a blur metric value.
The blur metric value is calculated using the Haar wavelet transform [38]. By analyzing edge sharpness, the extent of blur in the given image can be identified. The following steps calculate the blur metric value:
  • Step 1: Perform level 3 decomposition of the input image using Haar wavelet transform.
  • Step 2: Construct an edge map for each scale using Equation (6):
    E_{map_i}(k, l) = \sqrt{LH_i^2 + HL_i^2 + HH_i^2}, \quad i = 1, 2, 3    (6)
  • Step 3: Partition the edge maps and find local maxima Emaxi (i = 1, 2, 3).
If Emaxi (k, l) > threshold then (k, l) is an Edge point.
  • Step 4: Calculate blur metric value.
    \text{Blur metric} = \frac{\#\{\text{Roof/Gstep structures with } E_{max1}(k, l) < \text{threshold}\}}{\#\{E_{max1}(k, l) < E_{max2}(k, l) < E_{max3}(k, l)\} + \#\{E_{max2}(k, l) > E_{max1}(k, l) \text{ and } E_{max2}(k, l) > E_{max3}(k, l)\}}    (7)
i.e., the fraction of Roof/Gstep edge structures whose E_{max1} falls below the threshold and which are therefore considered blurred [38].
The blur metric value from Equation (7) is used to eliminate low-contrast points instead of the fixed 0.03 value. The threshold value is chosen using Equation (8):
Threshold value = Blur metric(I)
A feature point below the threshold value is considered a low-contrast point and is eliminated. Edge response feature points are removed by comparing the second-order Taylor expansion of LoW at the corresponding feature point location against a certain threshold, which leaves only stable feature points.
The average numbers of feature points detected by the SIFT detector and by the proposed blur metric calculation for the Oxford datasets under different image conditions are shown in Figure 4.
The average number of feature points extracted using the blur metric value is greater than with the conventional SIFT algorithm, as low-contrast points are eliminated. Harris corner points are then detected for the input image I(p, q) and added to the extrema points to obtain stable and robust feature points [39].
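A minimal sketch of this stage is given below, assuming PyWavelets and OpenCV. The local-maxima window sizes, the edge threshold of 35, and the Harris parameters are assumptions in the spirit of the cited blur-detection scheme [38] and Harris detector [39], not values specified in the paper.

```python
# Sketch of Stage 2: Haar-wavelet blur metric (Eqs. (6)-(8)) plus Harris corners.
import numpy as np
import pywt
import cv2

def _local_max(x, k):
    """Maximum of x over non-overlapping k x k windows (k = 1 keeps x)."""
    if k == 1:
        return x
    h, w = (x.shape[0] // k) * k, (x.shape[1] // k) * k
    return x[:h, :w].reshape(h // k, k, w // k, k).max(axis=(1, 3))

def blur_metric(gray, edge_thresh=35):
    """Ratio of blurred edge structures, used as the threshold in Eq. (8)."""
    coeffs = pywt.wavedec2(np.asarray(gray, dtype=float), 'haar', level=3)
    # coeffs[1:] holds the detail bands from coarsest (level 3) to finest (level 1)
    emaps = [np.sqrt(LH ** 2 + HL ** 2 + HH ** 2) for LH, HL, HH in coeffs[1:]]
    # bring Emax1..Emax3 onto a common grid (pooling sizes are assumptions)
    e3 = _local_max(emaps[0], 1)
    e2 = _local_max(emaps[1], 2)
    e1 = _local_max(emaps[2], 4)
    h = min(e1.shape[0], e2.shape[0], e3.shape[0])
    w = min(e1.shape[1], e2.shape[1], e3.shape[1])
    e1, e2, e3 = e1[:h, :w], e2[:h, :w], e3[:h, :w]
    # Roof/Gstep structures (conditions of Eq. (7)) and their blurred subset
    roof = ((e1 < e2) & (e2 < e3)) | ((e2 > e1) & (e2 > e3))
    blurred = roof & (e1 < edge_thresh)
    return blurred.sum() / roof.sum() if roof.sum() else 0.0

def harris_points(gray, quality=0.01):
    """Harris corner locations added to the scale-space extrema (Stage 2)."""
    r = cv2.cornerHarris(np.float32(gray), blockSize=2, ksize=3, k=0.04)
    return np.argwhere(r > quality * r.max())
```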
  • Stage 3: Feature point Magnitude and Orientation Assignment
The gradient magnitude m(p, q) and orientation θ(p, q) of the smoothed image L(p, q) are calculated using Equation (9):
m(p, q) = \sqrt{\left( L(p+1, q) - L(p-1, q) \right)^2 + \left( L(p, q+1) - L(p, q-1) \right)^2}
\theta(p, q) = \tan^{-1}\!\left( \frac{L(p, q+1) - L(p, q-1)}{L(p+1, q) - L(p-1, q)} \right)    (9)
An orientation histogram is formed to find the dominant direction of local gradients.
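The snippet below sketches Equation (9) and the orientation histogram with NumPy. The 36-bin resolution and the 8-pixel window radius follow the common SIFT convention [4] and are assumptions rather than parameters reported for WGHCFD.

```python
# Sketch of Stage 3: gradient magnitude/orientation and dominant orientation.
import numpy as np

def gradient_mag_ori(L):
    """Central-difference magnitude and orientation of a smoothed image L."""
    dp = np.roll(L, -1, axis=0) - np.roll(L, 1, axis=0)   # L(p+1,q) - L(p-1,q)
    dq = np.roll(L, -1, axis=1) - np.roll(L, 1, axis=1)   # L(p,q+1) - L(p,q-1)
    m = np.sqrt(dp ** 2 + dq ** 2)
    theta = np.arctan2(dq, dp)                            # radians in (-pi, pi]
    return m, theta

def dominant_orientation(m, theta, p, q, radius=8, bins=36):
    """Dominant direction of local gradients in a window around (p, q)."""
    mw = m[p - radius:p + radius, q - radius:q + radius]
    tw = theta[p - radius:p + radius, q - radius:q + radius]
    hist, edges = np.histogram(tw, bins=bins, range=(-np.pi, np.pi), weights=mw)
    k = np.argmax(hist)
    return 0.5 * (edges[k] + edges[k + 1])                # bin centre (radians)
```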
  • Stage 4: Generate Descriptor
After the scale, location, and orientation of each feature point are assigned, the feature point descriptor is computed from its local neighborhood so that it is invariant to image transformations. As in SIFT, the descriptor is built from orientation histograms over a 4 × 4 array of subregions with 8 bins each, so the descriptor consists of 128 feature elements. The performance of the proposed WGHCFD detector is experimentally analyzed with standard datasets [29].
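A compact sketch of such a 4 × 4 × 8 descriptor is shown below; it reuses the magnitude and orientation arrays from the Stage 3 sketch and omits the Gaussian weighting and trilinear interpolation of the full SIFT descriptor, so it is an illustration rather than the exact WGHCFD implementation.

```python
# Sketch of Stage 4: 128-element descriptor around a feature point (p, q).
import numpy as np

def descriptor_128(m, theta, p, q, dominant, width=16):
    """4 x 4 cells of 8-bin orientation histograms relative to the dominant orientation."""
    half = width // 2
    mw = m[p - half:p + half, q - half:q + half]
    tw = (theta[p - half:p + half, q - half:q + half] - dominant) % (2 * np.pi)
    desc = []
    for i in range(4):                      # 4 x 4 spatial cells of 4 x 4 pixels
        for j in range(4):
            cell_m = mw[4 * i:4 * i + 4, 4 * j:4 * j + 4]
            cell_t = tw[4 * i:4 * i + 4, 4 * j:4 * j + 4]
            hist, _ = np.histogram(cell_t, bins=8, range=(0, 2 * np.pi),
                                   weights=cell_m)
            desc.extend(hist)
    desc = np.asarray(desc)
    return desc / (np.linalg.norm(desc) + 1e-12)   # normalise for illumination
```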

4. Qualitative Analysis of the Proposed WGHCFD

The datasets used in the experiments to evaluate the performance of the proposed system are explained in this section. The repeatability and matching-score performance of the proposed WGHCFD is compared with that of popular existing feature detectors and descriptors in the following subsections.
The Oxford dataset [40] consists of two classes of blur change images (structured and textured), two classes of viewpoint change images, two classes of zoom+rotation change images, one class of illumination change images, and one class of JPEG compression images. Figure 5 shows example images from the Oxford dataset. The first image in each class is of good quality, whereas the remaining images in each class have small changes from the first image. The proposed WGHCFD was compared with the popular feature detectors SIFT, Bi-SIFT, SURF, STAR, BRISK, ORB, KAZE, Tri-SIFT, Harris–Laplace, and Hessian–Laplace, and the descriptors SIFT, SURF, ORB, BRISK, BRIEF, FREAK, Bi-SIFT, and Tri-SIFT in terms of repeatability and matching scores to demonstrate its effectiveness.

4.1. Performance Measures

The repeatability score is used to measure the stability and accuracy of the feature point detector and is given in Equation (10). The repeatability criterion is calculated by the ratio of the corresponding feature points and the minimum total number of feature points detected in both images. Corresponding points between images enable the estimation of parameters describing geometric transforms between the images [4,41]:
\text{repeatability score} = \frac{\#\,\text{correspondences}}{\#\,\text{detections}}    (10)
A matching score is used to measure the matching accuracy between two images and is given in Equation (11). The matching score is calculated by the ratio of the number of correct matching points and the total number of matching points [4,41]:
\text{matching score} = \frac{\text{number of correct matches}}{\text{number of detected matches in the pair of images}}    (11)
The performance of the descriptors has been evaluated using the criteria of recall and 1-precision. The input and test images are matched when the Euclidean distance between their descriptors is lower than a threshold, and each descriptor from the input image is compared with each descriptor in the test image. The results are reported as recall and 1-precision. Recall is the ratio of the number of correctly matched regions to the number of corresponding regions between two images of the same scene and is calculated using Equation (12):
\text{recall} = \frac{\#\,\text{correct matches}}{\#\,\text{correspondences}}    (12)
The 1-precision is the ratio of the number of false matches to the total number of all matches and is represented by using Equation (13):
1\text{-precision} = \frac{\#\,\text{false matches}}{\#\,\text{correct matches} + \#\,\text{false matches}}    (13)
A perfect descriptor would give a recall of 1 for any precision. Recall increases as the matching threshold on the descriptor distance increases; the curves show how much recall is attained at a given precision [41].
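The measures in Equations (10)-(13) reduce to simple ratios of counts, as the sketch below shows; the function names are illustrative, and the counts are assumed to come from an evaluation run such as the one on the Oxford dataset.

```python
# Evaluation measures of Section 4.1 expressed as ratios of counts.
def repeatability_score(correspondences, detections_img1, detections_img2):
    """Eq. (10): correspondences over the smaller number of detections."""
    return correspondences / min(detections_img1, detections_img2)

def matching_score(correct_matches, detected_matches):
    """Eq. (11): correct matches over all detected matches in the image pair."""
    return correct_matches / detected_matches

def recall(correct_matches, correspondences):
    """Eq. (12): correctly matched regions over corresponding regions."""
    return correct_matches / correspondences

def one_minus_precision(false_matches, correct_matches):
    """Eq. (13): false matches over all matches."""
    return false_matches / (correct_matches + false_matches)
```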

4.2. Experimental Result Analysis

The following subsections discuss the experimental results for various image transformations on the Oxford dataset, namely blur change, viewpoint change, zoom+rotation change, illumination change, and JPEG compression images; the performance of the proposed system in the object recognition task on a real-time dataset is also analyzed.

4.2.1. Blur Change Images

The repeatability and matching scores are calculated for the existing detectors SIFT, Bi-SIFT, SURF, STAR, BRISK, ORB, KAZE, Tri-SIFT, Harris–Laplace, and Hessian–Laplace and the proposed WGHCFD for the blurred images in the datasets. Figure 6a,b shows the repeatability performance of SIFT, Bi-SIFT, SURF, STAR, KAZE, BRISK, ORB, Tri-SIFT, Harris–Laplace, Hessian–Laplace, and the proposed WGHCFD for the blurred bikes dataset images [38]. The average repeatability scores of the SIFT, Bi-SIFT, SURF, STAR, KAZE, BRISK, ORB, Tri-SIFT, Harris–Laplace, Hessian–Laplace, and proposed WGHCFD detectors are 32.49%, 38.75%, 61.72%, 49.46%, 63.58%, 41.2%, 52.45%, 55.05%, 51.01%, 60.05%, and 89.12%, respectively, for the bikes dataset, as shown in Figure 6a. The repeatability scores of the same detectors are 30.54%, 49.02%, 51.70%, 48.39%, 55.16%, 39.67%, 41.87%, 40.33%, 24.08%, 33.25% and 65.48%, respectively, for the trees images in Figure 6b. WGHCFD gives average repeatability scores 40.08%, 34.22%, 11.25%, 23.51%, 9.39%, 20.52%, 17.92%, 31.77%, 21.90% and 12.92% higher than SIFT, Bi-SIFT, SURF, STAR, BRISK, ORB, KAZE, Tri-SIFT, Harris–Laplace and Hessian–Laplace, respectively, for the blur change bikes dataset.
WGHCFD gives average repeatability scores 44.12%, 16.25%, 13.78%, 17.09%, 10.32%, 25.81%, 23.61%, 25.15%, 41.01% and 32.23% higher than the SIFT, Bi-SIFT, SURF, STAR, KAZE, BRISK, Hessian–Laplace and Harris–Laplace detectors, respectively, for the blur change trees dataset. It is observed from Figure 6c that the SIFT, Bi-SIFT, SURF, STAR, KAZE, BRISK, ORB, Tri-SIFT, Harris–Laplace, Hessian–Laplace, and proposed WGHCFD detectors achieve average matching scores of 46.20%, 59.21%, 62.27%, 56.26%, 67.70%, 51.51%, 49.38%, 65.78%, 41.77%, 44.00%, and 75.79%, respectively, for the blur change images of bikes. From Figure 6d, the matching scores of SIFT, Bi-SIFT, SURF, STAR, KAZE, BRISK, ORB, Tri-SIFT, Harris–Laplace, Hessian–Laplace and the proposed WGHCFD are 46.20%, 61.43%, 60.87%, 54.86%, 66.18%, 63.51%, 49.58%, 52.00%, 51.77%, 46.00% and 71.27%, respectively. WGHCFD gives matching scores 41.36%, 10.56%, 13.29%, 12.31%, 7.87%, 38.19%, 45.79%, and 43.57% higher than SIFT, Bi-SIFT, SURF, STAR, BRISK, ORB, KAZE, Tri-SIFT, Harris–Laplace, and Hessian–Laplace, respectively, for the blur change images of bikes, and 40.48%, 18.26%, 16.41%, 20.03%, 12.99%, 11.11%, 10.91%, 13% and 15.69% higher, respectively, for the blur change images of trees. The proposed WGHCFD gives higher repeatability and matching scores than the existing SIFT, Bi-SIFT, SURF, STAR, BRISK, ORB, KAZE, Tri-SIFT, Harris–Laplace, and Hessian–Laplace algorithms, as WGHCFD detects a greater number of feature points than the other detectors because, when the blur increases, edges are retained more effectively from the scene.
The performance of the descriptors is evaluated based on recall and 1-precision for the blur change images of bikes and trees using Equations (12) and (13). Figure 7 shows the evaluation of the SIFT, SURF, ORB, BRISK, BRIEF, FREAK, Bi-SIFT, Tri-SIFT, and proposed WGHCFD descriptors. The results show that all descriptors are affected by blur change. The proposed WGHCFD, Bi-SIFT, Tri-SIFT, and SIFT give higher scores than the SURF, ORB, BRIEF, BRISK, and FREAK descriptors, as SIFT-based variants are robust to blur change images. The proposed WGHCFD gives the highest score for blurred images, owing to its largest number of matches.

4.2.2. Viewpoint Change Images

From Figure 8a, WGHCFD gives average repeatability scores 39.08%, 26.69%, 18.99%, 20.72%, 15.38%, 11.18%, 15.15%, 8.89%, 20.03% and 22.75% higher than the SIFT, Bi-SIFT, SURF, STAR, ORB, KAZE, BRISK, Tri-SIFT, Harris–Laplace and Hessian–Laplace detectors, respectively, for the viewpoint change images of graffiti.
WGHCFD achieves average repeatability scores 27.27%, 8.62%, 13.98%, 15.11%, 11.37%, 19.87%, 27.28%, and 16.87% higher than the SIFT, Bi-SIFT, SURF, STAR, ORB, KAZE, BRISK, Tri-SIFT, Harris–Laplace and Hessian–Laplace detectors, respectively, for the bricks dataset images, as shown in Figure 8b. From Figure 8c, it is observed that WGHCFD gives matching scores 33.20%, 27.70%, 19.50%, 22.93%, 16.94%, 10.58%, 9.58%, 9.45%, 18.50% and 23.90% higher than the SIFT, Bi-SIFT, SURF, STAR, ORB, KAZE, BRISK, Tri-SIFT, Harris–Laplace and Hessian–Laplace detectors, respectively. WGHCFD achieves matching scores 24.12%, 22.63%, 23.84%, 19.06%, 17.47%, 15.91%, 10.11%, 9.45%, 21.45% and 24.49% higher than the SIFT, Bi-SIFT, SURF, STAR, ORB, KAZE, BRISK, Tri-SIFT, Harris–Laplace and Hessian–Laplace detectors, respectively, as shown in Figure 8d. The proposed WGHCFD performs better for viewpoint change images than the existing algorithms due to the invariance property of the Harris corner point detector.
The performance of the WGHCFD descriptor is evaluated based on recall vs. 1-precision against the SIFT, SURF, ORB, BRISK, BRIEF, FREAK, Bi-SIFT, and Tri-SIFT descriptors, and the results are shown in Figure 9. Performance gradually increases for all descriptors. The SURF descriptor performs worse than ORB, BRISK, BRIEF, and FREAK. WGHCFD performs better than Bi-SIFT, Tri-SIFT, and SIFT because the combined Harris corner detector gives robustness to viewpoint change images.

4.2.3. Zoom+Rotation Change Images

The evaluations carried out for the zoom+rotation change images of the dataset are shown in Figure 5c. Figure 10a–d shows the repeatability and matching scores of SIFT, Bi-SIFT, the proposed WGHCFD, SURF, STAR, ORB, KAZE, BRISK, Tri-SIFT, Harris–Laplace and Hessian–Laplace for the bark and boat images. WGHCFD gives average repeatability scores 34.39%, 17.45%, 21.90%, 22.23%, 19.69%, 21.57%, 19.37%, 8.34%, 21.45% and 24.79% higher for bark, and 32.45%, 17.50%, 21.89%, 19.42%, 18.08%, 9.56%, 11.76%, 10.23%, 20.34% and 21.33% higher for boat, than the SIFT, Bi-SIFT, SURF, STAR, BRISK, KAZE, ORB, Tri-SIFT, Harris–Laplace and Hessian–Laplace detectors, respectively.
WGHCFD gives average matching scores 30.56%, 24.86%, 22.87%, 24.09%, 20.90%, 21.74%, 18.74%, 13.23%, 23.45% and 24.31% higher for bark, and 33.82%, 15.32%, 10.13%, 11.35%, 10.16%, 11.00%, 14.60%, 22.67%, 11.78%, 20.45% and 25.38% higher for boat, than the SIFT, Bi-SIFT, SURF, STAR, ORB, KAZE, BRISK, Tri-SIFT, Harris–Laplace and Hessian–Laplace detectors, respectively, as shown in Figure 10c,d. The proposed WGHCFD gives higher repeatability and matching scores than the other detectors since a greater number of feature points are extracted and matched correctly for the zoom+rotation change images.
The performance of the proposed WGHCFD descriptor is evaluated based on recall vs. 1-precision against the SIFT, SURF, ORB, BRISK, BRIEF, FREAK, Bi-SIFT, and Tri-SIFT descriptors, and the results are shown in Figure 11. The proposed WGHCFD gives a higher recall than the other descriptors. WGHCFD performs well for zoom+rotation change images due to its rotation invariance property.

4.2.4. Illumination Change Images

Figure 12a,b shows the repeatability and matching scores for the illumination change images in the datasets [38]. WGHCFD gives repeatability scores 37.94%, 16.96%, 20.85%, 18.07%, 13.24%, 18.92%, 15.52%, 14.23%, 20.34% and 25.69% higher and matching scores 32.80%, 24.50%, 18.31%, 21.53%, 14.14%, 16.18%, 10.18%, 8.32%, 23.54% and 24.35% higher than the SIFT, Bi-SIFT, SURF, STAR, ORB, KAZE, BRISK, Tri-SIFT, Harris–Laplace and Hessian–Laplace detectors, respectively, for the cars dataset images. The proposed WGHCFD performs better than the existing algorithms due to the low-contrast elimination based on the blur metric calculation.
The descriptor evaluation for illumination change images is shown in Figure 13. SIFT, Bi-SIFT, Tri-SIFT, and the proposed WGHCFD give the best performance compared to SURF, ORB, BRISK, BRIEF, and FREAK due to their robustness against illumination conditions.

4.2.5. JPEG Compression Images

Figure 14a,b shows the repeatability and matching scores for the various test images in the datasets [40]. WGHCFD gives repeatability scores 35.37%, 18.45%, 20.68%, 24.10%, 25.51%, 13.73%, 15.55%, 8.34%, 24.67% and 26.72% higher and matching scores 30.96%, 19.06%, 20.07%, 23.69%, 24.10%, 13.72%, 15.94%, 10.34%, 24.12% and 29.11% higher than the SIFT, Bi-SIFT, SURF, STAR, ORB, KAZE, BRISK, Tri-SIFT, Harris–Laplace and Hessian–Laplace detectors, respectively. The proposed WGHCFD provides good results compared to the existing detectors as it extracts more stable points.
Figure 15 shows the evaluation of the SIFT, SURF, ORB, BRISK, BRIEF, FREAK, Bi-SIFT, Tri-SIFT, and proposed WGHCFD descriptors for JPEG compression images. The performance gradually increases, as all descriptors are affected by JPEG artifacts. The proposed WGHCFD obtains better results than SIFT, Bi-SIFT, SURF, Tri-SIFT, ORB, BRISK, BRIEF, and FREAK since it is more stable against compression artifacts.
From the experimental analysis of the feature detectors, the Hessian–Laplace and Harris–Laplace detectors provide lower repeatability and matching scores for blurred images, since these feature point detectors are less scale-invariant than the other detectors. The SURF detector offers results close to the SIFT detector at a higher speed, but achieves lower repeatability and matching scores for all types of image conditions. The STAR and BRISK feature detectors perform worse than the proposed WGHCFD for all types of image transformations. KAZE gives better scores than SIFT; however, KAZE features are computationally more expensive. The Bi-SIFT and Tri-SIFT detectors give moderate results for all image transformations.
From the descriptor evaluation analysis, the proposed WGHCFD performs well compared to SIFT, Bi-SIFT, SURF, Tri-SIFT, ORB, BRISK, BRIEF, and FREAK for all image transformations. SURF performs worse than the SIFT descriptor. ORB provides good performance for rotation change images but not for the other transformations. BRISK, BRIEF, and FREAK perform poorly for blur change images. The experimental results confirm that WGHCFD provides improved repeatability and matching scores and is robust to various image transformations such as scale, viewpoint angle, zoom+rotation, illumination, and compression.

4.3. WGHCFD for Object Recognition Applications

The performance of the proposed system is analyzed for object recognition on a real-time dataset that has 101 object categories, where each object has 10 significant variations in scale, viewpoint, color, pose, and illumination. Figure 16a shows some of the real-time dataset objects. The WGHCFD features are extracted from the input dataset image. For example, Figure 16b shows an image to be recognized in the dataset; the objects in the image are recognized correctly, as shown in Figure 16c.
The recognition rate for this dataset is calculated as the ratio between the number of correctly recognized images and the total number of images. Figure 17 compares the recognition rate of the proposed WGHCFD method with the existing descriptors on the real-time dataset of 1010 images. From the results shown in Table 3, the proposed WGHCFD has the best recognition rate in this experiment (99.8%), followed by Tri-SIFT (96%), DOG–ADTCP (95%), Bi-SIFT (95%), FREAK (92%), SURF (92%), BRIEF (89%), BRISK (88%), SIFT (83%) and ORB (76.9%). Table 3 also lists the number of images recognized by each method out of the 1010 images in the 101-category dataset.
From the experimental analysis, WGHCFD achieves a high recognition rate and is invariant to various image transformations compared with the existing feature detectors and descriptors. With a recognition rate of 99.8%, it is well suited to the object recognition task and supports safe navigation for visually impaired people.

5. Conclusions and Future Work

This paper proposes the WGHCFD feature extraction method for different real-world conditions such as scale, rotation, illumination, and viewpoint changes. WGHCFD uses a weighted guided image filter to smooth an image and reduce halo artifacts. The Haar wavelet transform provides blur metric values to eliminate unstable feature points, and the Harris corner points in WGHCFD give more robust feature points. WGHCFD achieves higher repeatability and matching scores than existing feature extraction methods. To compensate for the vision problems of visually impaired persons, the proposed work can be used to recognize objects under various transformations such as scale, rotation, viewpoint, and illumination. A robust WGHCFD feature detector with high recognition accuracy supports visually impaired people in navigating safely. In the future, the proposed work can be extended to mobile phone systems for better navigation of visually impaired persons.

Author Contributions

Conceptualization, M.R., P.S., T.S., S.A. and H.K.; methodology, M.R. and P.S.; software, M.R., P.S. and T.S.; validation, M.R., P.S., T.S., S.A. and H.K.; formal analysis, M.R. and P.S.; investigation, M.R., P.S. and T.S.; resources, H.K.; data curation, M.R. and S.A.; writing—original draft preparation, M.R. and P.S.; writing—review and editing, M.R., P.S., T.S., S.A. and H.K.; visualization, M.R., P.S. and T.S.; supervision, T.S., S.A. and H.K.; project administration, H.K. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2017R1D1A1B04032598).

Institutional Review Board Statement

Not Applicable.

Informed Consent Statement

Not Applicable.

Data Availability Statement

The datasets used in this paper are publicly available and their links are provided in the reference section.

Acknowledgments

We thank the anonymous reviewers for their valuable suggestions that improved the quality of this article.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Shanthi, K.G. Smart Vision using Machine learning for Blind. Int. J. Adv. Sci. Technol. 2020, 29, 12458–12463.
  2. Rahman, M.A.; Sadi, M.S. IoT Enabled Automated Object Recognition for the Visually Impaired. Comput. Methods Programs Biomed. Update 2021, 1, 100015.
  3. Afif, M.; Ayachi, R.; Said, Y.; Pissaloux, E.; Atri, M. An Evaluation of RetinaNet on Indoor Object Detection for Blind and Visually Impaired Persons Assistance Navigation. Neural Process. Lett. 2020, 51, 2265–2279.
  4. Lowe, D.G. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 2004, 60, 91–110.
  5. Zhang, J.; Chen, G.; Jia, Z. An image stitching algorithm based on histogram matching and SIFT algorithm. Int. J. Pattern Recognit. Artif. Intell. 2017, 31, 1754006.
  6. Arth, C.; Bischof, H. Real-time object recognition using local features on a DSP-based embedded system. J. Real-Time Image Process. 2008, 3, 233–253.
  7. Zhou, H.; Yuan, Y.; Shi, C. Object tracking using SIFT features and mean shift. Comput. Vis. Image Underst. 2009, 113, 345–352.
  8. Sirmacek, B.; Unsalan, C. Urban area and building detection using SIFT keypoints and graph theory. IEEE Trans. Geosci. Remote Sens. 2009, 47, 1156–1167.
  9. Chang, L.; Duarte, M.M.; Sucar, L.E.; Morales, E.F. Object class recognition using SIFT and Bayesian networks. Adv. Soft Comput. 2010, 6438, 56–66.
  10. Soni, B.; Das, P.K.; Thounaojam, D.M. Keypoints based enhanced multiple copy-move forgeries detection system using density-based spatial clustering of application with noise clustering algorithm. IET Image Process. 2018, 12, 2092–2099.
  11. Lodha, S.K.; Xiao, Y. GSIFT: Geometric scale invariant feature transform for terrain data. Int. Soc. Opt. Photonics 2006, 6066, 60660L.
  12. Ke, Y.; Sukthankar, R. PCA-SIFT: A more distinctive representation for local image descriptors. CVPR 2004, 4, 506–513.
  13. Abdel-Hakim, A.E.; Farag, A.A. CSIFT: A SIFT descriptor with color invariant characteristics. Comput. Vis. Pattern Recognit. 2006, 2, 1978–1983.
  14. Morel, J.M.; Yu, G. ASIFT: A new framework for fully affine invariant image comparison. SIAM J. Imaging Sci. 2009, 2, 438–469.
  15. Mikolajczyk, K.; Schmid, C. A performance evaluation of local descriptors. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 1615–1630.
  16. Rosten, E.; Drummond, T. Machine learning for high-speed corner detection. In Proceedings of the European Conference on Computer Vision (ECCV), Graz, Austria, 7–13 May 2006; pp. 430–443.
  17. Agrawal, M.; Konolige, K.; Blas, M.R. CenSurE: Center surround extremas for realtime feature detection and matching. In Proceedings of the European Conference on Computer Vision, Marseille, France, 12–18 October 2008; pp. 102–115.
  18. Bay, H.; Ess, A.; Tuytelaars, T.; Van Gool, L. Speeded-Up Robust Features (SURF). Comput. Vis. Image Underst. 2008, 110, 346–359.
  19. Alcantarilla, P.F.; Bartoli, A.; Davison, A.J. KAZE features. In Proceedings of the European Conference on Computer Vision, Florence, Italy, 7–13 October 2012; pp. 214–227.
  20. Leutenegger, S.; Chli, M.; Siegwart, R.Y. BRISK: Binary robust invariant scalable keypoints. In Proceedings of the IEEE International Conference on Computer Vision (ICCV 2011), Barcelona, Spain, 6–13 November 2011; pp. 2548–2555.
  21. Rublee, E.; Rabaud, V.; Konolige, K.; Bradski, G.R. ORB: An efficient alternative to SIFT or SURF. ICCV 2011, 11, 2.
  22. Huang, M.; Mu, Z.; Zeng, H.; Huang, H. A novel approach for interest point detection via Laplacian-of-bilateral filter. J. Sens. 2015, 2015, 685154.
  23. Şekeroglu, K.; Soysal, O.M. Comparison of SIFT, Bi-SIFT, and Tri-SIFT and their frequency spectrum analysis. Mach. Vis. Appl. 2017, 28, 875–902.
  24. Ghahremani, M.; Liu, Y.; Tiddeman, B. FFD: Fast Feature Detector. IEEE Trans. Image Process. 2021, 30, 1153–1168.
  25. Calonder, M.; Lepetit, V.; Strecha, C.; Fua, P. BRIEF: Binary robust independent elementary features. In Proceedings of the European Conference on Computer Vision, Heraklion, Crete, Greece, 5–11 September 2010.
  26. Alahi, A.; Ortiz, R.; Vandergheynst, P. FREAK: Fast retina keypoint. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2012, Providence, RI, USA, 16–21 June 2012; pp. 510–517.
  27. Simo-Serra, E.; Torras, C.; Moreno-Noguer, F. DaLI: Deformation and Light Invariant Descriptor. Int. J. Comput. Vis. 2015, 115, 115–136.
  28. Weng, D.W.; Wang, Y.H.; Gong, M.M.; Tao, D.C.; Wei, H.; Huang, D. DERF: Distinctive efficient robust features from the biological modeling of the P ganglion cells. IEEE Trans. Image Process. 2015, 24, 2287–2302.
  29. Kim, W.; Han, J. Directional coherence-based spatiotemporal descriptor for object detection in static and dynamic scenes. Mach. Vis. Appl. 2017, 28, 49–59.
  30. Sadeghi, B.; Jamshidi, K.; Vafaei, A.; Monadjemi, S.A. A local image descriptor based on radial and angular gradient intensity histogram for blurred image matching. Vis. Comput. 2019, 35, 1373–1391.
  31. Yu, Q.; Zhou, S.; Jiang, Y.; Wu, P.; Xu, Y. High-Performance SAR Image Matching Using Improved SIFT Framework Based on Rolling Guidance Filter and ROEWA-Powered Feature. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 12, 920–933.
  32. DeTone, D.; Malisiewicz, T.; Rabinovich, A. SuperPoint: Self-supervised interest point detection and description. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Salt Lake City, UT, USA, 18–22 June 2018; pp. 224–236.
  33. Jingade, R.R.; Kunte, R.S. DOG-ADTCP: A new feature descriptor for protection of face identification system. Expert Syst. Appl. 2022, 201, 117207.
  34. Yang, W.; Xu, C.; Mei, L.; Yao, Y.; Liu, C. LPSO: Multi-Source Image Matching Considering the Description of Local Phase Sharpness Orientation. IEEE Photonics J. 2022, 14, 7811109.
  35. Dusmanu, M. D2-Net: A trainable CNN for joint description and detection of local features. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 8092–8101.
  36. Li, Z.; Zheng, J.; Zhu, Z.; Yao, W.; Wu, S. Weighted guided image filtering. IEEE Trans. Image Process. 2015, 24, 120–129.
  37. He, K.; Sun, J.; Tang, X. Guided image filtering. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 1397–1409.
  38. Tong, H.; Li, M.; Zhang, H.; Zhang, C. Blur detection for digital images using wavelet transform. In Proceedings of the IEEE International Conference on Multimedia and Expo (ICME 2004), Taipei, Taiwan, 27–30 June 2004; pp. 17–20.
  39. Harris, C.; Stephens, M. A combined corner and edge detector. In Proceedings of the Fourth Alvey Vision Conference, Manchester, UK, 31 August–2 September 1988; pp. 147–151.
  40. Mikolajczyk, K. Oxford Data Set. Available online: http://www.robots.ox.ac.uk/~vgg/research/affine (accessed on 24 May 2022).
  41. Schmid, C.; Mohr, R.; Bauckhage, C. Evaluation of interest point detectors. Int. J. Comput. Vis. 2000, 37, 151–172.
Figure 1. Block diagram of the proposed vision system.
Figure 2. Working flow of the object recognition module.
Figure 3. Block diagram of 4 stages of the proposed WGHCFD method.
Figure 4. Average feature point detection using blur metric value.
Figure 5. Oxford data sets: (a) blur change image; (b) viewpoint change image; (c) zoom+rotation change image; (d) illumination change image; (e) jpeg compression image.
Figure 6. Blur change images: (a) Repeatability score of bikes; (b) Repeatability score of trees; (c) Matching score of bikes; (d) Matching score of trees.
Figure 7. Blur change images—recall vs. 1-precision.
Figure 8. Viewpoint change images: (a) Repeatability score of graffiti; (b) Repeatability score of bricks; (c) Matching score of graffiti; (d) Matching score of bricks.
Figure 9. Viewpoint change images—recall vs. 1-precision.
Figure 10. Zoom+rotation change images: (a) Repeatability score of bark; (b) Repeatability score of boat; (c) Matching score of bark; (d) Matching score of boat.
Figure 11. Zoom+rotation change images—recall vs. 1-precision.
Figure 12. Illumination change images: (a) Repeatability score of cars; (b) Matching score of cars.
Figure 13. Illumination change images—recall vs. 1-precision.
Figure 14. JPEG compression images: (a) Repeatability score of ubc; (b) Matching score of ubc.
Figure 15. JPEG compression image—recall vs. 1-precision.
Figure 16. (a) Sample objects in real-time dataset; (b) Input image for object recognition; (c) Recognized image.
Figure 17. Comparison of recognition rate of proposed WGHCFD with the existing descriptors.
Table 1. Overview of Feature Detectors and Invariance Properties.

Detectors | Feature Detector Type | Scale-Invariant | Rotation-Invariant
Harris–Laplace (2004) | Corner + Partial blob | Yes | Yes
Hessian–Laplace (2004) | Partial corner + Blob | Yes | Yes
SIFT (2004) | Partial corner + Blob | Yes | Yes
FAST (2006) | Corner | No | Yes
STAR (2008) | Blob | Yes | Yes
ORB (2011) | Corner | Yes | Yes
BRISK (2011) | Corner | Yes | Yes
Bi-SIFT (2015) | Blob | Yes | Yes
Tri-SIFT (2017) | Blob | Yes | Yes
FFD (2021) | Blob | Yes | No
Table 2. Comparison of Popular Feature Descriptors.

Methods | Scale-Space | Orientation | Descriptor Generation | Feature Size | Robustness
SIFT (2004) | DoG | Local gradient histogram | Local gradient histogram | 128 | Brightness, contrast, rotation, scale, affine transforms, noise
GLOH (2005) | No | No | Log-polar location grid in radial direction | 128 | Brightness, contrast, rotation, scale, affine transforms, noise
SURF (2006) | Box filter | Haar wavelet responses | Haar wavelet responses | 64 | Scale, rotation, illumination, noise
BRIEF (2010) | No | No | Intensity comparison of pixel pairs in the random sampling pattern using a Gaussian distribution | 256 | Brightness, contrast
ORB (2011) | Gaussian image pyramid | Intensity centroid calculation | Oriented BRIEF descriptor | 256 | Brightness, contrast, rotation, limited scale
BRISK (2011) | Gaussian image pyramid | Average of the local gradient | Intensity comparison of pixels in concentric circles pattern | 512 | Brightness, contrast, rotation, scale
FREAK (2012) | No | Average of the local gradient | Intensity comparison of pixels in the retina sampling pattern | 256 | Brightness, contrast, rotation, scale, viewpoint
DaLI (2015) | No | No | Heat Kernel Signature | 128 | Non-rigid deformations and photometric changes
Bi-SIFT (2015) | LoB | Local gradient histogram | Local gradient histogram | 128 | Brightness, contrast, rotation, scale, affine transforms, noise
DERF (2015) | Convolve gradient maps of grid points using DoG | DoG convoluted gradient orientation maps | Sampling the gradient orientation maps | 536 | Scale and rotation
HiSTDO (2016) | No | The dominant orientation and coherence based on the space-time local gradient field | Normalized gradient-based histogram | Depends on image/video | Background clutter, brightness change and camera noise
Tri-SIFT (2017) | DoT | Local gradient histogram | Local gradient histogram | 128 | Brightness, contrast, rotation, scale, affine transforms, noise
RAGIH (2018) | No | No | Radial and angular gradient intensity histogram | 120 | Rotation and scale changes in a blurred image
ROEWA (2019) | Nonlinear multiscale space | Gradient histogram | GLOH descriptor | 128 | Speckle noise and complex deformation
DOG–ADTCP (2022) | Modified DoG and Angle Directional Ternary Co-relation Pattern (ADTCP) | Local gradient histogram | Local gradient histogram | 128 | Illumination and rotation
LPSO (2022) | Gaussian spatial maximum moment map | Phase orientation feature calculation | Histogram of phase sharpness orientation features | 200 | Illumination and rotation
Table 3. Number of recognized images and recognition rate from a total of 1010 images for 101 object categories.

Method | Number of Recognized Images | Recognition Rate (%)
SIFT | 838/1010 | 83
Bi-SIFT | 959/1010 | 95
SURF | 929/1010 | 92
Tri-SIFT | 969/1010 | 96
ORB | 776/1010 | 76.9
BRISK | 888/1010 | 88
BRIEF | 898/1010 | 89
FREAK | 929/1010 | 92
DOG–ADTCP | 959/1010 | 95
Proposed WGHCFD | 1007/1010 | 99.8
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
