Article

Key-Point-Descriptor-Based Image Quality Evaluation in Photogrammetry Workflows

by Dalius Matuzevičius *, Vytautas Urbanavičius, Darius Miniotas, Šarūnas Mikučionis, Raimond Laptik and Andrius Ušinskas
Department of Electronic Systems, Vilnius Gediminas Technical University (VILNIUS TECH), 10105 Vilnius, Lithuania
* Author to whom correspondence should be addressed.
Electronics 2024, 13(11), 2112; https://doi.org/10.3390/electronics13112112
Submission received: 30 April 2024 / Revised: 26 May 2024 / Accepted: 28 May 2024 / Published: 29 May 2024
(This article belongs to the Special Issue IoT-Enabled Smart Devices and Systems in Smart Environments)

Abstract

Photogrammetry depends critically on the quality of the images used to reconstruct accurate and detailed 3D models. Selection of high-quality images not only improves the accuracy and resolution of the resulting 3D models, but also contributes to the efficiency of the photogrammetric process by reducing data redundancy and computational demands. This study presents a novel approach to image quality evaluation tailored for photogrammetric applications that uses the key point descriptors typically encountered in image matching. Using a LightGBM ranker model, this research evaluates the effectiveness of key point descriptors such as SIFT, SURF, BRISK, ORB, KAZE, FREAK, and SuperPoint in predicting image quality. These descriptors are evaluated for their ability to indicate image quality based on the image patterns they capture. Experiments conducted on various publicly available image datasets show that descriptor-based methods outperform traditional no-reference image quality metrics such as BRISQUE, NIQE, PIQE, and BIQAA and a simple sharpness-based image quality evaluation method. The experimental results highlight the potential of using key-point-descriptor-based image quality evaluation methods to improve the photogrammetric workflow by selecting high-quality images for 3D modeling.

1. Introduction

Photogrammetry offers a sophisticated method to extract precise geometric information from photographs. This technology enables the creation of detailed three-dimensional (3D) models of objects [1,2,3,4] or scenes [5,6,7,8], transforming the way industries and sciences approach visualization and analysis [9,10,11,12,13,14,15]. Utilization of photogrammetry spans a diverse range of fields, underlining its significance in contemporary applications that demand accuracy and detail [16,17,18,19,20,21,22].
The process of photogrammetry involves several stages, beginning with the collection of images and followed by detailed image processing [23,24,25]. Techniques such as feature detection, matching, and stereo correspondence are fundamental in transforming two-dimensional data into three-dimensional forms [26,27,28]. This transformation is critical for structure from motion (SfM) algorithms, which depend heavily on the ability to identify corresponding points across multiple images to effectively reconstruct 3D structures [29,30,31,32]. Any degradation in image quality, such as blurring, poor exposure, or noise, can lead to increased errors in feature matching, subsequently affecting the entire reconstruction process and impacting the accuracy and precision of the resultant 3D models [33,34,35]. Therefore, at the heart of photogrammetry lies the crucial role of image quality.
Recent advancements in image pre-processing techniques have significantly enhanced the accuracy and quality of the 3D reconstructions in various photogrammetric applications. Studies have shown that efficient pipelines involving color enhancement, image denoising, and RGB-to-gray conversion can improve the performance of automated image processing tools, particularly in cases with poor texture or sub-optimal datasets [36,37,38]. These pre-processing methods not only enhance the automated orientation procedures but also contribute to the production of dense 3D point clouds and texture models, which are crucial for high-quality 3D reconstructions [37,39]. Additionally, research has indicated that JPEG compression ratios below 10 maintain near-lossless image quality, thus minimally affecting the accuracy of photogrammetric point determination [40]. Similarly, the application of SfM photogrammetry has underscored the importance of image quality, network configuration, and sensor optimization in generating accurate geospatial data [41].
Comparative analyses of photogrammetric workflows, particularly those employing UAV-based methods, have highlighted the need for optimized procedures to enhance the accuracy and repeatability of geospatial data products [42,43]. In challenging environments like alpine valleys and underwater settings, terrestrial photogrammetry and low-cost underwater photogrammetry have been validated as viable alternatives to more expensive methods such as terrestrial laser scanning [36,44]. This is achieved through careful image acquisition strategies and robust pre-processing steps that mitigate errors and improve data reliability. Furthermore, quality assessment techniques, including modulation transfer function analysis and reproducibility studies of UAS orthomosaics, have demonstrated that high degrees of accuracy and reproducibility are attainable, reinforcing the utility of photogrammetric methods in diverse environmental and research contexts [45,46]. These studies collectively underscore the critical role of image quality and pre-processing in the advancement of photogrammetric technologies.
These days, the ease of capturing images using smartphones, often through video recording, offers a convenient method for collecting data for photogrammetry [47,48,49,50]. However, videos typically contain numerous similar frames that may not contribute to increasing the model’s quality but could instead slow down the reconstruction process. Collecting images for 3D reconstruction can result in redundant images if the object or scene is photographed extensively (Figure 1). Therefore, it becomes essential to discern and filter out these unnecessary and lower-quality frames from the collected image set. This selection process requires robust image quality evaluation methods that do not depend on reference images given the impracticality of having perfect reference images in real-world scenarios.
Traditionally, image quality evaluation approaches are categorized into reference-based and reference-free assessments. While reference-based evaluations compare images against a perfect-quality reference, they are often not feasible in practical, real-world scenarios. Conversely, no-reference assessments are invaluable for photogrammetry as they judge the quality of an image based on its inherent characteristics, such as texture sharpness, exposure, and noise levels [51,52,53]. Technologies such as BRISQUE, NIQE, PIQE, and BIQAA are well-known examples of reference-free quality evaluation methods.
This study proposes a novel approach to image quality evaluation in photogrammetric workflows through the use of key point descriptors. Key point descriptors like SIFT (Scale-Invariant Feature Transform) [54,55], SURF (Speeded-Up Robust Features) [56], BRISK (Binary Robust Invariant Scalable Keypoints) [57], ORB (Oriented FAST and Rotated BRIEF) [58], KAZE [59], FREAK (Fast Retina Keypoint) [60], and SuperPoint [61] encapsulate robust and distinctive information about local features in images [62,63,64]. These descriptors are not only pivotal for object recognition and image matching, but also have the potential to indicate image quality based on the characteristics of the features they extract. By leveraging these descriptors, this research aims to enhance the process of image selection, ensuring that photogrammetry is performed with only the highest-quality images, thereby optimizing the accuracy and resolution of the final 3D models.
The principal objective of this research is to explore whether key point descriptors can autonomously predict image quality. We employ a machine learning approach, specifically, a LightGBM [65] ranker model, which is renowned for its efficiency in ranking tasks. This model is expected to provide a robust framework for predicting image quality using features derived from key point descriptors.
In this research, key point descriptors will be extracted from a diverse dataset of images, each characterized by varying levels of known distortions like motion and defocus blur and noise. These descriptors will then be utilized as features in a LightGBM ranker model trained to predict the quality of images based on these features. Moreover, the study will examine the influence of different types of key point descriptors on the model’s performance, assessing whether certain descriptors provide more predictive power than others.
The novelty and contributions of this work can be summarized as follows:
  • Proposing a method for image quality evaluation based on key point descriptors. The proposed method reuses the descriptors extracted for the purpose of feature matching, which is performed in the photogrammetric sparse reconstruction stage, thus minimizing the computational overhead of image quality evaluation;
  • Presenting a dataset construction guide for the development of a descriptor-based image quality evaluation method;
  • Performing comparative evaluation of seven descriptor types (SURF, SIFT, BRISK, ORB, KAZE, FREAK, SuperPoint) as feature sources used for image quality evaluation;
  • Presenting comparative results of image quality evaluation methods using five publicly available image datasets. Results show that the created method is effective and performs better than simple sharpness-based, BRISQUE, NIQE, PIQE, and BIQAA blind image quality metrics in choosing better-quality images and reducing image redundancy for photogrammetric reconstructions.
The motivation for choosing the highest-quality images for photogrammetric reconstruction, and the problem this selection poses, are summarized in Figure 1.
The structure of the paper is organized as follows. Section 2 describes the proposed method for evaluating image quality using feature descriptors, offers a brief overview of other standard image quality evaluation methods, and details the data preparation and software tools employed. Section 3 presents a comparative analysis of image quality evaluation methods, interprets the findings, and discusses their practical implications. Section 4 concludes the paper by summarizing the outcomes of this research.

2. Materials and Methods

This section describes why evaluating the quality of images used in the photogrammetric reconstruction of 3D models is helpful. We propose a method that exploits feature descriptors extracted during a typical photogrammetry workflow. Additionally, we describe the process of experimental data preparation, as well as creation of the training and testing datasets, and then outline the evaluation process for image quality methods.

2.1. Motivation and Method for Finding Higher-Quality Images

To achieve precise and reliable 3D models, it is essential to select the highest-quality images for photogrammetric object reconstruction. High-quality images ensure superior detection of features, which is crucial for accurate image matching and alignment, thereby minimizing errors during model construction. Image quality degradation simulations show that key point localization errors increase as motion or defocus blur strengthens (see Figure 2 and Figure 3). Because photogrammetric reconstruction depends on key point detection, reconstruction accuracy depends on key point localization accuracy. Therefore, selection of high-quality images improves the overall accuracy of reconstructed models, reduces computational requirements, and ensures the usability of the reconstructed object or scene in various applications, such as the preservation of cultural heritage, virtual reality, and industrial design. Thus, careful selection of images not only simplifies the workflow, but also significantly impacts the fidelity and utility of the resulting models.
Collecting images for 3D reconstruction often accumulates redundant images, especially when the object of interest is extensively photographed or filmed (see Figure 1). For optimal results, it is crucial that the images uniformly cover the area surrounding the object and maintain the highest quality possible. While using a smartphone to capture images by filming the object is convenient, this approach can lead to over-sampling if the camera moves slowly; conversely, fast camera movement increases the risk of capturing frames with motion blur, especially in low-light conditions. Additionally, accidental hand tremors or sudden movements can worsen the motion blur in certain frames. It is vital to identify and eliminate such degraded frames.
To address this issue, a method is needed that ranks frames based on factors affecting image quality, such as motion blur, defocus blur, and sensor noise. This method should prioritize images that show the least degradation, ensuring that the highest-quality images are selected for the photogrammetric reconstruction process.

2.1.1. Proposed Method

A suitable image selection method for photogrammetry would identify the highest-quality images by evaluating the images or specific regions within them. It would be practical to repurpose feature point descriptors, which are already extracted during the photogrammetric reconstruction process (Figure 4), to derive an image quality score. This score could then be used to rank a group of images based on their quality.
The process of photogrammetry typically involves several critical steps, of which the detection and description of key points (feature points) are fundamental components (see Figure 4) [66]. Key points are distinctive locations in the image, such as corners, edges, or blobs, where the underlying image structure changes significantly. Robust detection and accurate description of these key points are critical for aligning and stitching multiple images to create a coherent 3D model [67].
The identification of corresponding feature points between different images forms the backbone of the structure from motion (SfM) algorithm. SfM uses these corresponding points to infer the three-dimensional spatial relationships and orientations of the photographed object or scene. The primary output of the SfM process is a sparse reconstructed 3D point cloud that represents the salient features of the object or environment captured across the set of images.
The task of finding corresponding feature points essentially involves identifying similar image patches across different images. This is facilitated by image descriptors, which provide a compact representation of local image patches. A good descriptor encapsulates the crucial visual information surrounding a feature point, potentially including aspects of image quality such as sharpness, blurriness, and noise levels. By capturing these characteristics, the descriptors not only help to match corresponding points across images, but also implicitly convey information about the quality of the images used, which can significantly influence the accuracy of the 3D reconstruction.
There are a variety of key point detectors and descriptors, each with unique strengths and suitable for different types of images and applications [68,69,70,71]. In photogrammetry, these descriptors can be used individually or in combination depending on the specific requirements of the pipeline. Well-known feature point detectors and descriptors include SURF (Speeded-Up Robust Features) [56], SIFT (Scale-Invariant Feature Transform) [54,55], BRISK (Binary Robust Invariant Scalable Keypoints) [57], ORB (Oriented FAST and Rotated BRIEF) [58], KAZE [59], FREAK (Fast Retina Keypoint) [60], and SuperPoint [61]. These methods differ primarily in their computational efficiency, robustness to image transformations, and the types of image content they are best suited to handle [70,72,73,74]. MATLAB Image Processing Toolbox implementations of these key point detectors and descriptors, except for SuperPoint, are used in the experiments. The Python implementation of the SuperPoint key point detector and descriptor is used.
SURF [56] extracts features by first using a Hessian-matrix-based approach to detect interest points across multiple scales, facilitated by integral images for speed. The determinant of the Hessian is computed at various scales to identify potential key points. Once key points are identified, SURF assigns an orientation based on the sum of the Haar wavelet responses around the key point, ensuring rotation invariance. The descriptor is then formed from the sum of the Haar wavelet response in the cardinal directions within a circular neighborhood around the key point, resulting in a feature set that is robust against scale and orientation changes.
SIFT [54,55] feature extraction begins with the identification of scale-space extrema using the Difference-of-Gaussians (DoG) method applied across multiple scales. Key locations are refined to subpixel accuracy, and low-contrast or poorly localized points are discarded to enhance robustness. Orientation assignment to each key point is based on local image gradients, ensuring rotation invariance. The SIFT descriptor is then created from gradient histograms of image regions around each key point, providing distinctive features that are invariant to scale and rotation and partially to changes in illumination and viewpoint.
BRISK [57] first applies a fast scale-space detection of key points using a space of octaves (similar to SIFT) but with an accelerated and simplified scale determination method. Key points are identified based on a FAST-like algorithm, adjusted for scale and orientation. The BRISK descriptor itself is generated by forming a binary string based on intensity comparisons of pixel pairs around each key point, within a sampled pattern defined by concentric circles. This process results in a highly efficient binary descriptor that is both rotation and scale invariant.
ORB [58] combines the FAST key point detector and the BRIEF descriptor, but with significant modifications. ORB detects key points using a modified FAST detector that is tuned to be more robust to noise, and it scores key points based on the Harris corner measure to ensure that the points chosen are well distributed and stable. For the descriptor, ORB steers BRIEF descriptors based on the orientation of key points, making them rotation invariant. The resultant binary descriptor is efficient to compute and compare, making ORB suitable for real-time applications.
KAZE [59] features are extracted using a method that differs fundamentally from other descriptors by relying on a nonlinear scale space. Instead of the Gaussian blurring used in linear approaches, KAZE uses nonlinear diffusion filtering to build the scale space. This approach helps to preserve edge sharpness and detail better. Key points are detected through a Hessian determinant measure in this space, and each is assigned an orientation based on local gradients. The descriptor is computed using gradient responses around each key point, similar to in SIFT but benefiting from the enhanced edge definition provided by the nonlinear method.
FREAK [60] (Fast Retina Keypoint) is a binary key point descriptor inspired by the human retina’s structure, designed for efficiency and robustness. It operates by first detecting key points using other methods like FAST. Each detected key point is assigned an orientation based on local gradient information, ensuring rotational invariance. FREAK employs a distinctive retinal sampling pattern around each key point, where points are sampled in concentric circles with varying densities, mimicking the human eye’s distribution of photoreceptors. The descriptor is then formed by performing intensity comparisons between pairs of sampled points, resulting in a binary string. This structure allows FREAK to be both computationally efficient and robust, making it suitable for real-time applications.
SuperPoint [61] is a deep-learning-based approach to key point detection and description, leveraging a convolutional neural network (CNN) trained in a self-supervised manner. The method integrates both key point detection and descriptor extraction into a single network, which processes the input image to produce key point locations and their corresponding descriptors simultaneously. The network consists of an encoder–decoder architecture; the encoder extracts feature maps from the image, while the decoder generates key point probability maps and dense descriptors. SuperPoint is known for its high accuracy, robustness to various image transformations, and suitability for tasks requiring advanced feature extraction, such as visual SLAM and image matching.
The specifications of the feature descriptors utilized in the current research are summarized in Table 1.
After extracting the key point descriptors, these features can be utilized to estimate an image quality score. To simplify the process and reduce computational demands, up to the 100 strongest key points and their corresponding descriptors are selected to represent the image. These descriptors are sampled from various regions across the image. While these descriptors primarily encode distinctive image information, the repetitive elements within them can capture quality-related image patterns.
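This sampling step can be illustrated by the following minimal sketch in Python, using OpenCV as a stand-in for the MATLAB and SuperPoint implementations used in the paper: key points are detected, up to the 100 strongest by detector response are kept, and their descriptors are returned. The file name and the particular detectors are hypothetical choices for illustration only.

```python
import cv2
import numpy as np

def strongest_descriptors(gray_image, detector, max_keypoints=100):
    """Detect key points and return descriptors of up to the strongest ones."""
    keypoints, descriptors = detector.detectAndCompute(gray_image, None)
    if descriptors is None:
        return np.empty((0, 0))
    # Sort by detector response (key point strength) and keep the top ones
    order = np.argsort([-kp.response for kp in keypoints])[:max_keypoints]
    return descriptors[order]

# Hypothetical usage with two of the descriptor types compared in the paper
img = cv2.imread("example.jpg", cv2.IMREAD_GRAYSCALE)
sift_descriptors = strongest_descriptors(img, cv2.SIFT_create())   # 128-D float descriptors
orb_descriptors = strongest_descriptors(img, cv2.ORB_create())     # 32-byte binary descriptors
```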
The goal of this method is to identify the highest-quality image from a set, focusing on rank-one performance (ensuring the right image is ranked as the top prediction) rather than the absolute score of the image quality. To address this, a ranking algorithm is employed. The LightGBM [65] ranker model, specifically, the LGBMRanker class from Python’s lightgbm package, is selected for its ability to score images based on degradation signs using a traditional gradient-boosting decision tree approach.
When selecting an algorithm for ranking tasks, LGBMRanker is a superior choice due to its efficiency, scalability, and high predictive accuracy. Its fast training and prediction capabilities make it particularly suitable for large datasets, a common requirement in real-world ranking applications. The algorithm’s inherent ability to handle missing values and automatically perform feature selection further enhances its practicality. In addition, LGBMRanker’s robustness to outliers and overfitting, combined with its integration with popular data science ecosystems, ensures that it is not only high performing, but also easy to implement and deploy.
During the development of the gradient-boosting model, LambdaRank is used as an objective function. Several other parameters are configured as follows: the maximum tree leaves for the base learners (‘num_leaves’) is set to 51, the boosting learning rate (‘learning_rate’) is set to 0.01, and the number of boosted trees to fit (‘n_estimators’) is set to 10,000. Details on the collection and preparation of the training dataset are provided in Section 2.2.
Once trained, the ranker model assesses the quality of images by processing a batch of sampled features (Figure 5). A higher score from the ranker indicates superior image quality, guiding the selection of the highest-quality image.
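A minimal sketch of this configuration is given below, assuming that each row of the feature matrix represents one image, that group sizes delimit sets of differently degraded versions of the same original, and that higher labels denote better quality; the synthetic arrays are placeholders for the dataset described in Section 2.2.

```python
import numpy as np
from lightgbm import LGBMRanker

rng = np.random.default_rng(0)

# Placeholder data: 10 groups of 19 images (1 original + 18 degraded versions),
# each represented by a 64-dimensional descriptor-derived feature vector.
X_train = rng.normal(size=(190, 64))
y_train = np.tile(np.arange(19)[::-1], 10)   # higher label = better quality within a group
group_sizes = [19] * 10                      # number of images per group (per original image)

ranker = LGBMRanker(
    objective="lambdarank",   # LambdaRank objective, as used in the paper
    num_leaves=51,            # maximum tree leaves for the base learners
    learning_rate=0.01,       # boosting learning rate
    n_estimators=10000,       # number of boosted trees to fit
)
ranker.fit(X_train, y_train, group=group_sizes)

# Scoring a batch of candidate images: a higher score indicates better quality.
X_test = rng.normal(size=(5, 64))
scores = ranker.predict(X_test)
best_image_index = int(np.argmax(scores))
```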

2.1.2. Baseline Methods

The evaluation of the proposed image quality assessment method’s variants is conducted through a comparison with baseline methods. These include a fast image quality evaluation utilizing image sharpness as a proxy [75], which was used in previous studies [76,77]. Additionally, several no-reference image quality metrics, also known as blind methods, BRISQUE [51], NIQE [52,78,79], PIQE [53,80], and BIQAA [81], are selected for comparison. MATLAB Image Processing Toolbox implementations of these blind methods are used in the experiments. Each of these approaches calculates a quality value without comparing an image to a reference. This is beneficial when the highest-quality image must be selected from a set of slightly varying images, such as those with overlapping camera positions or consecutive video frames.
Here are short summaries of the baseline image quality evaluation methods:
  • Sharpness uses measured image sharpness as an estimate of image quality. The method applies a Laplacian of Gaussian (LoG) filter with a 3 × 3 filter size and σ = 0.5. The variance of the filtered images is calculated, where a higher value indicates greater image sharpness, thus implying lower blurriness (a minimal code sketch of this baseline is given after this list);
  • BRISQUE (Blind/Referenceless Image Spatial Quality Evaluator) is a no-reference image quality assessment model that operates in the spatial domain. It evaluates images based on natural scene statistics, specifically, the locally normalized luminance coefficients. This metric does not require a transformation to another coordinate frame, which differentiates it from many other no-reference IQA approaches. BRISQUE is noted for its simplicity and low computational complexity, making it suitable for real-time applications;
  • NIQE (Natural Image Quality Evaluator) is a completely blind image quality assessor that uses a statistical model of image features that are perceived as natural based on a Gaussian scale mixture model. It does not require any subjective training using human-rated distorted images, which distinguishes it from other methods that rely on human-rated training sets. NIQE is designed to assess the naturalness of images, making it useful for a variety of applications without the need for comparison to a reference image;
  • PIQE (Perception-Based Image Quality Evaluator) is a no-reference metric that quantifies the perceptual quality of compressed images by measuring the visibility of artifacts and the loss of natural scene statistics caused by compression. It is particularly useful for evaluating JPEG images as it specifically measures the blockiness and blurriness introduced by JPEG compression. PIQE computes a quality score based on how much an image deviates from these expected statistical parameters, providing a measure of perceptual degradation without reference to the original;
  • BIQAA (Blind Image Quality Assessment through Anisotropy) is a no-reference image quality assessment metric that evaluates image quality by analyzing the anisotropic properties of natural images. It operates on the premise that high-quality images exhibit certain statistical regularities and directional patterns. BIQAA measures the deviation from these expected anisotropic properties to quantify the degree of distortion present in the image.
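Below is a minimal sketch of the sharpness baseline from the first bullet above, assuming scipy’s Gaussian-Laplace filter as an analogue of the 3 × 3 Laplacian-of-Gaussian kernel with σ = 0.5; the variance of the filter response serves as the sharpness score, with higher values indicating sharper images.

```python
import numpy as np
from scipy import ndimage

def sharpness_score(image):
    """Return a sharpness score for an image given as a 2D (or RGB) array."""
    gray = np.asarray(image, dtype=np.float64)
    if gray.ndim == 3:                        # crude RGB-to-gray conversion
        gray = gray.mean(axis=2)
    log_response = ndimage.gaussian_laplace(gray, sigma=0.5)
    return float(np.var(log_response))        # higher variance -> sharper image
```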

2.2. Data Preparation

Training and test datasets were prepared for the experimental evaluation of the proposed method for image quality estimation based on feature descriptors.
The construction of training and test datasets by augmenting the original image is explained in Figure 6.
The objective evaluation of image quality levels can be performed via the evaluation of key point localization errors E_i:
E_i = ‖K_0 − K_i‖,
where K_0 and K_i are the sets of matched key points detected in the original image I_0 and in the augmented (degraded) image I_i, respectively. Matched key points are characterized by being the closest in both directions and by the distance between key points being less than 3. The rationale for the distance threshold comes from key point matching using the RANSAC algorithm.
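The matching rule above can be sketched as follows, assuming key point coordinates are given as (N, 2) arrays; the localization error is taken here as the mean displacement of the mutually nearest key points within the distance threshold, which is one plausible reading of E_i.

```python
import numpy as np
from scipy.spatial import cKDTree

def localization_error(k0, ki, max_dist=3.0):
    """k0, ki: (N, 2) arrays of key point coordinates detected in I_0 and I_i."""
    d_0i, idx_0i = cKDTree(ki).query(k0)    # nearest point in K_i for each point in K_0
    _, idx_i0 = cKDTree(k0).query(ki)       # nearest point in K_0 for each point in K_i
    mutual = idx_i0[idx_0i] == np.arange(len(k0))   # closest in both directions
    matched = mutual & (d_0i < max_dist)            # within the distance threshold
    return float(d_0i[matched].mean()) if matched.any() else float("nan")
```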
Image degradation leads to instability in key point localization. This effect is demonstrated by an experiment that shows how feature localization deteriorates as the amount of motion or defocus blur increases (Figure 2 and Figure 3). Given a set of differently degraded images, key point localization errors E i can be used to rank all degraded images accordingly. Ranked degraded images correspond to gradually changing image quality, which correlates with the reliability of key point matching.
The training set construction (Figure 6a) involves the use of motion blur and defocus blur followed by the addition of Gaussian noise. Defocus blur is simulated by a disk-shaped moving average filter with the radius parameter selected from the set {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. Motion blur is simulated by a filter that approximates the camera’s linear motion in a random direction, with the motion length parameter selected from the set {2, 3, 4, 5, 10, 15, 20, 25}. The addition of Gaussian noise (zero mean, 0.005 variance) crudely simulates camera sensor noise. The full set of possible degradations is computed, resulting in a set of 18 degraded images for each original image.
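These training-set degradations can be sketched as follows, assuming OpenCV filtering on images scaled to [0, 1]; the disk-shaped averaging kernel stands in for MATLAB’s defocus filter, a line kernel in a random direction approximates linear camera motion, and Gaussian noise with 0.005 variance is added last.

```python
import cv2
import numpy as np

DEFOCUS_RADII = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
MOTION_LENGTHS = [2, 3, 4, 5, 10, 15, 20, 25]

def disk_kernel(radius):
    """Disk-shaped moving average kernel simulating defocus blur."""
    yy, xx = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    k = (xx ** 2 + yy ** 2 <= radius ** 2).astype(np.float32)
    return k / k.sum()

def motion_kernel(length, angle_deg):
    """Line kernel approximating linear camera motion in a given direction."""
    k = np.zeros((length, length), np.float32)
    c = (length - 1) / 2.0
    dx, dy = np.cos(np.deg2rad(angle_deg)) * c, np.sin(np.deg2rad(angle_deg)) * c
    cv2.line(k, (int(round(c - dx)), int(round(c - dy))),
             (int(round(c + dx)), int(round(c + dy))), 1.0, 1)
    return k / k.sum()

def degrade(image01, kernel, noise_var=0.005, rng=np.random.default_rng()):
    """Blur an image in [0, 1] with the given kernel and add Gaussian noise."""
    blurred = cv2.filter2D(image01.astype(np.float32), -1, kernel)
    noisy = blurred + rng.normal(0.0, np.sqrt(noise_var), blurred.shape)
    return np.clip(noisy, 0.0, 1.0)
```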
The construction of the test set (Figure 6b) is similar in the diversity of simulated blur and noise. The difference is that, for each original image, two randomly selected degradations are generated from all possible degradation functions. In addition, a small affine transformation is applied to simulate object capture from slightly different angles.
The image datasets used as image sources in the current research are summarized in Table 2. Only images whose longest side exceeded 500 px were used in the experiments.
The images were obtained from the following databases:
  • Indoor Scene Recognition (ISR) (https://web.mit.edu/torralba/www/indoor.html) (accessed on 18 March 2024) [82]. The database contains 67 indoor categories and a total of 15,620 images. The number of images varies across categories, but there are at least 100 images per category. All images are in jpg format;
  • DIV2K dataset (https://data.vision.ee.ethz.ch/cvl/DIV2K/) (accessed on 18 March 2024) [83,84]. Diverse 2K-resolution high-quality images;
  • KonIQ-10k IQA Database (https://database.mmsp-kn.de/koniq-10k-database.html) (accessed on 18 March 2024) [85]. KonIQ-10k was, at the time of its publication, the largest IQA dataset, consisting of 10,073 quality-scored images. It is the first in-the-wild database aiming for ecological validity with regard to the authenticity of distortions, the diversity of content, and quality-related indicators;
  • Common Objects in Context (COCO) val2017 (https://cocodataset.org/#download) (accessed on 18 March 2024) [86]. COCO is a large-scale object detection, segmentation, and captioning dataset;
  • Flickr2K dataset (https://github.com/limbee/NTIRE2017) (accessed on 15 May 2024) [87]. A dataset of 2650 images collected by the SNU_CVLab team for the NTIRE 2017 super-resolution challenge.

2.3. Evaluation of Methods for Finding Higher-Quality Images

The evaluation of the proposed and baseline methods for identifying higher-quality images was performed using test data. The preparation of the testing dataset is detailed in Section 2.2. The test set is constructed from augmented versions I_1 and I_2 of the original image I_0. All augmented versions are degraded by randomly selected image transforms: defocus blur or motion blur, noise, and a small geometric transform.
At the beginning of the photogrammetry pipeline, the goal is to select only the necessary images of the highest quality for object reconstruction (refer to Figure 1). Selecting the highest-quality image from a collection involves arranging these images by quality and choosing the top one. The test dataset is designed so that each method must select one image from a pair. Each method assigns ranking scores to the images, based on which the more suitable image is selected. Our proposed method assigns a higher score to better-quality images, as does the sharpness-based method. No-reference image quality assessment methods such as BRISQUE, NIQE, and PIQE assign lower scores to denote superior perceptual quality.
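The pairwise comparison described above can be sketched as follows, assuming `pairs` is a list of (better_image, worse_image) tuples from the test set and `score_fn` is any of the compared quality scorers; for BRISQUE, NIQE, and PIQE the comparison is inverted, since lower scores denote better quality.

```python
def pairwise_selection_accuracy(pairs, score_fn, higher_is_better=True):
    """Fraction of pairs in which the method picks the truly better image."""
    correct = 0
    for better_img, worse_img in pairs:
        s_better, s_worse = score_fn(better_img), score_fn(worse_img)
        picked_better = (s_better > s_worse) if higher_is_better else (s_better < s_worse)
        correct += int(picked_better)
    return correct / len(pairs)
```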

2.4. Software Used

The software tools and programming languages used in this research are as follows:
  • MATLAB programming and numeric computing platform (version R2023b, The Mathworks Inc., Natick, MA, USA) for the implementation of the proposed algorithm (except for training LGBMRanker model and SuperPoint key point detection and description) and for data analysis and visualization;
  • Python (version 3.11.8) (https://www.python.org) (accessed on 18 March 2024), an interpreted, high-level, general-purpose programming language with NumPy, SciPy, and LightGBM packages. Used for the machine learning applications (LGBMRanker tool) and SuperPoint key point detection and description;
  • SuperPoint research project at Magic Leap (https://github.com/magicleap/SuperPointPretrainedNetwork, accessed on 15 May 2024) [61]. The repository contains the pretrained SuperPoint network. This is a Python implementation of the SuperPoint feature point detector and descriptor that was used in the descriptor comparison experiments.

3. Results and Discussion

This research focuses on the evaluation of various methods for assessing image quality in the context of photogrammetry. The aim is to determine the most effective technique for selecting the highest-quality images, which is essential for accurate and efficient 3D model reconstruction. To this end, a method for image quality evaluation based on key point descriptors is proposed. The method reuses the descriptors extracted for feature matching, which is performed in the photogrammetric sparse reconstruction stage, thus minimizing the computational overhead of image quality evaluation.
An experimental comparison of key-point-descriptor-based methods against traditional no-reference image quality metrics, as well as a sharpness-based approach, was conducted across multiple image datasets. The results of the experimental comparison of the methods for selecting the higher-quality image from a pair of images are summarized in Table 3 and in Figure 7.
The findings show that the proposed descriptor-based methods, particularly those utilizing SURF (64- and 128-dimensionality variants), SIFT, and KAZE (64- and 128-dimensionality variants) descriptors, consistently outperform the other evaluated techniques, suggesting that they are advantageous for photogrammetric applications. This performance is likely due to the descriptors’ ability to encapsulate more detailed information about image characteristics such as texture, sharpness, and noise, which are crucial for photogrammetry. Additionally, both SURF and KAZE descriptors share a common characteristic, the highest numerical precision, as shown in Table 1. This attribute likely contributes significantly to their superior performance. Although the BRISK-, ORB-, and FREAK-based rankers perform adequately, they do not reach the effectiveness of SURF, SIFT, or KAZE, possibly due to the binary nature of their descriptors, which may not capture subtle quality variations as effectively.
The SuperPoint ranker method demonstrates a relatively strong performance among the key-point-descriptor-based approaches. Despite not achieving the highest performance, SuperPoint’s strength lies in its deep-learning-based architecture, which allows it to capture more complex and abstract image features. A drawback of this method is its limited availability in photogrammetric reconstruction software. The rationale for using descriptor-based image quality evaluation is to reuse already computed data, thus minimizing the computational overhead of image quality evaluation. Photogrammetry software mainly utilizes handcrafted key point detectors and descriptors; therefore, the availability of deep-learning-based descriptors would be limited in the proposed setup.
Traditional no-reference metrics, such as BRISQUE, NIQE, PIQE, and BIQAA, fall behind in this specific application. These methods do not perform as well, likely because they are less sensitive to subtle degradations or because they are designed as general-purpose measures of perceptual image quality. Similarly, the sharpness-based method, although relatively effective, does not match the performance of the best descriptor-based approaches. This highlights that sharpness alone may not be a sufficient indicator of image quality in the context of photogrammetry, where other factors like texture, noise, and detail integrity play crucial roles.
Practical applications of this method extend across various fields where high-quality 3D reconstructions are critical. In the field of archaeology, accurate 3D models are essential for preserving and studying artifacts and sites. The proposed method can enhance the quality of these models, ensuring that minute details are captured accurately. In construction and civil engineering, high-quality images are vital for creating precise models of buildings and infrastructure. The method can improve the accuracy of these models, facilitating better planning, monitoring, and maintenance processes. Additionally, in the field of cultural heritage preservation, the method can aid in creating detailed and accurate digital replicas of historical monuments and artworks, which is crucial for restoration and virtual tourism.
One area of future research involves investigating the potential of combining feature descriptors with various ranking algorithms. Additionally, exploring ensemble methods that combine multiple ranking algorithms could provide a more robust and generalized image quality assessment framework.
Another future research direction could be integrating these image quality evaluation techniques into existing photogrammetry software. This integration will allow for comprehensive testing of the methods on real-world objects and the subsequent evaluation of their impact on 3D reconstruction accuracy. Comparing the reconstructed 3D models with ground truth models will provide insights into the effectiveness of the proposed methods in practical applications.
Another direction for future research could be integration of descriptor-based evaluation methods with advanced machine-learning-based image quality metrics. By combining the information captured by feature descriptors with the predictive power of machine learning models, it is possible to achieve higher accuracy and efficiency in image quality assessment.
While the proposed method shows promise, it also has limitations that need to be addressed. One major limitation is its dependence on the key point descriptors. The availability of certain descriptors in photogrammetric software packages could limit the immediate adoption of this method. The rationale of reusing already extracted descriptors restricts the usage of descriptors to those selected for use in the 3D reconstruction pipeline.
By integrating descriptor-based image quality assessment into photogrammetric workflows, users can ensure that only the highest-quality images are used. This not only enhances the accuracy and reliability of the reconstructed models but also optimizes the entire reconstruction process by reducing unnecessary computational overhead.

4. Conclusions

The efficiency of photogrammetry workflows significantly depends on the integrity and quality of the initial images. As such, employing robust image quality evaluation mechanisms is imperative. Automated image quality evaluation promises an improvement in the processing speed and reliability of photogrammetric methods, facilitating more accurate and detailed 3D models.
This research presents a novel approach to the evaluation of image quality specifically tailored for photogrammetric applications leveraging the capabilities of key point descriptors to capture image quality. Through a comprehensive analysis employing the LightGBM ranker model and various key point descriptors, this study demonstrates that the proposed method can enhance the selection process of high-quality images essential for accurate 3D model reconstruction.
The study’s core findings indicate that descriptor-based methods, particularly those utilizing SURF, SIFT, and KAZE descriptors, are superior in selecting high-quality images compared to traditional no-reference quality metrics such as BRISQUE, NIQE, PIQE, and BIQAA, or the simple sharpness evaluation method. These methods, which capitalize on the inherent properties of the key point descriptors for feature matching in photogrammetry, not only provide a reliable means of assessing image quality, but also offer a pragmatic solution to one of the challenging steps in photogrammetry, the selection of high-quality images for 3D modeling.
Future research should explore combining feature descriptors with various ranking algorithms and ensemble methods to enhance image quality assessment. Integrating these evaluation techniques into existing photogrammetry software will allow for comprehensive testing and validation against ground truth models. Additionally, combining descriptor-based methods with advanced machine-learning-based quality metrics could improve accuracy and efficiency. However, limitations include dependency on available key point descriptors in photogrammetry software, which may restrict immediate adoption and use.

Author Contributions

Conceptualization, D.M. (Dalius Matuzevičius), V.U., D.M. (Darius Miniotas), Š.M., R.L. and A.U.; Data curation, D.M. (Dalius Matuzevičius) and A.U.; Formal analysis, Š.M.; Investigation, D.M. (Dalius Matuzevičius), V.U., D.M. (Darius Miniotas), R.L. and A.U.; Methodology, D.M. (Dalius Matuzevičius), V.U., D.M. (Darius Miniotas), Š.M., R.L. and A.U.; Resources, D.M. (Dalius Matuzevičius) and V.U.; Software, D.M. (Dalius Matuzevičius), Š.M. and R.L.; Supervision, D.M. (Dalius Matuzevičius) and V.U.; Validation, D.M. (Dalius Matuzevičius), V.U., D.M. (Darius Miniotas), Š.M., R.L. and A.U.; Visualization, D.M. (Dalius Matuzevičius), Š.M. and R.L.; Writing—original draft, D.M. (Darius Miniotas), R.L. and A.U.; Writing—review and editing, D.M. (Dalius Matuzevičius), V.U., D.M. (Darius Miniotas), Š.M., R.L. and A.U. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data underlying the results are available as part of the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
SfM     Structure from motion
ML      Machine learning
CNN     Convolutional neural network
LoG     Laplacian of Gaussian
DoG     Difference of Gaussians

References

  1. Li, Z.; Zhang, Z.; Luo, S.; Cai, Y.; Guo, S. An Improved Matting-SfM Algorithm for 3D Reconstruction of Self-Rotating Objects. Mathematics 2022, 10, 2892. [Google Scholar] [CrossRef]
  2. Matuzevičius, D.; Serackis, A.; Navakauskas, D. Mathematical models of oversaturated protein spots. Elektron. Elektrotechnika 2007, 73, 63–68. [Google Scholar]
  3. Gabara, G.; Sawicki, P. CRBeDaSet: A benchmark dataset for high accuracy close range 3D object reconstruction. Remote. Sens. 2023, 15, 1116. [Google Scholar] [CrossRef]
  4. Matuzevičius, D. Synthetic Data Generation for the Development of 2D Gel Electrophoresis Protein Spot Models. Appl. Sci. 2022, 12, 4393. [Google Scholar] [CrossRef]
  5. Eldefrawy, M.; King, S.A.; Starek, M. Partial scene reconstruction for close range photogrammetry using deep learning pipeline for region masking. Remote. Sens. 2022, 14, 3199. [Google Scholar] [CrossRef]
  6. Caradonna, G.; Tarantino, E.; Scaioni, M.; Figorito, B. Multi-image 3D reconstruction: A photogrammetric and structure from motion comparative analysis. In Proceedings of the International Conference on Computational Science and Its Applications, Melbourne, VIC, Australia, 2–5 July 2018; pp. 305–316. [Google Scholar]
  7. Žuraulis, V.; Matuzevičius, D.; Serackis, A. A method for automatic image rectification and stitching for vehicle yaw marks trajectory estimation. Promet-Traffic Transp. 2016, 28, 23–30. [Google Scholar] [CrossRef]
  8. Sledevič, T.; Serackis, A.; Plonis, D. FPGA Implementation of a Convolutional Neural Network and Its Application for Pollen Detection upon Entrance to the Beehive. Agriculture 2022, 12, 1849. [Google Scholar] [CrossRef]
  9. Ban, K.; Jung, E.S. Ear shape categorization for ergonomic product design. Int. J. Ind. Ergon. 2020, 80, 102962. [Google Scholar] [CrossRef]
  10. Mistretta, F.; Sanna, G.; Stochino, F.; Vacca, G. Structure from motion point clouds for structural monitoring. Remote. Sens. 2019, 11, 1940. [Google Scholar] [CrossRef]
  11. Varna, D.; Abromavičius, V. A System for a Real-Time Electronic Component Detection and Classification on a Conveyor Belt. Appl. Sci. 2022, 12, 5608. [Google Scholar] [CrossRef]
  12. Vacca, G. Overview of open source software for close range photogrammetry. In Proceedings of the 2019 Free and Open Source Software for Geospatial, FOSS4G 2019, Bucharest, Romania, 26–30 August 2019; Volume 42, pp. 239–245. [Google Scholar]
  13. Pang, T.Y.; Lo, T.S.T.; Ellena, T.; Mustafa, H.; Babalija, J.; Subic, A. Fit, stability and comfort assessment of custom-fitted bicycle helmet inner liner designs, based on 3D anthropometric data. Appl. Ergon. 2018, 68, 240–248. [Google Scholar] [CrossRef] [PubMed]
  14. Matuzevicius, D.; Navakauskas, D. Feature selection for segmentation of 2-D electrophoresis gel images. In Proceedings of the 2008 11th International Biennial Baltic Electronics Conference, Tallinn, Estonia, 14 April 2008; pp. 341–344. [Google Scholar]
  15. Luhmann, T. Close range photogrammetry for industrial applications. Isprs J. Photogramm. Remote. Sens. 2010, 65, 558–569. [Google Scholar] [CrossRef]
  16. Trojnacki, M.; Dąbek, P.; Jaroszek, P. Analysis of the Influence of the Geometrical Parameters of the Body Scanner on the Accuracy of Reconstruction of the Human Figure Using the Photogrammetry Technique. Sensors 2022, 22, 9181. [Google Scholar] [CrossRef]
  17. Barbero-García, I.; Pierdicca, R.; Paolanti, M.; Felicetti, A.; Lerma, J.L. Combining machine learning and close-range photogrammetry for infant’s head 3D measurement: A smartphone-based solution. Measurement 2021, 182, 109686. [Google Scholar] [CrossRef]
  18. Leipner, A.; Obertová, Z.; Wermuth, M.; Thali, M.; Ottiker, T.; Sieberth, T. 3D mug shot—3D head models from photogrammetry for forensic identification. Forensic Sci. Int. 2019, 300, 6–12. [Google Scholar] [CrossRef]
  19. Abromavičius, V.; Serackis, A. Eye and EEG activity markers for visual comfort level of images. Biocybern. Biomed. Eng. 2018, 38, 810–818. [Google Scholar] [CrossRef]
  20. Battistoni, G.; Cassi, D.; Magnifico, M.; Pedrazzi, G.; Di Blasio, M.; Vaienti, B.; Di Blasio, A. Does Head Orientation Influence 3D Facial Imaging? A Study on Accuracy and Precision of Stereophotogrammetric Acquisition. Int. J. Environ. Res. Public Health 2021, 18, 4276. [Google Scholar] [CrossRef] [PubMed]
  21. Abromavicius, V.; Serackis, A.; Katkevicius, A.; Plonis, D. Evaluation of EEG-based Complementary Features for Assessment of Visual Discomfort based on Stable Depth Perception Time. Radioengineering 2018, 27, 1138–1146. [Google Scholar] [CrossRef]
  22. Trujillo-Jiménez, M.A.; Navarro, P.; Pazos, B.; Morales, L.; Ramallo, V.; Paschetta, C.; De Azevedo, S.; Ruderman, A.; Pérez, O.; Delrieux, C.; et al. body2vec: 3D Point Cloud Reconstruction for Precise Anthropometry with Handheld Devices. J. Imaging 2020, 6, 94. [Google Scholar] [CrossRef]
  23. Zeraatkar, M.; Khalili, K. A Fast and Low-Cost Human Body 3D Scanner Using 100 Cameras. J. Imaging 2020, 6, 21. [Google Scholar] [CrossRef]
  24. Verwulgen, S.; Lacko, D.; Vleugels, J.; Vaes, K.; Danckaers, F.; De Bruyne, G.; Huysmans, T. A new data structure and workflow for using 3D anthropometry in the design of wearable products. Int. J. Ind. Ergon. 2018, 64, 108–117. [Google Scholar] [CrossRef]
  25. Barbero-García, I.; Lerma, J.L.; Mora-Navarro, G. Fully automatic smartphone-based photogrammetric 3D modelling of infant’s heads for cranial deformation analysis. Isprs J. Photogramm. Remote. Sens. 2020, 166, 268–277. [Google Scholar] [CrossRef]
  26. Kuo, C.C.; Wang, M.J.; Lu, J.M. Developing sizing systems using 3D scanning head anthropometric data. Measurement 2020, 152, 107264. [Google Scholar] [CrossRef]
  27. Zhao, Y.; Mo, Y.; Sun, M.; Zhu, Y.; Yang, C. Comparison of three-dimensional reconstruction approaches for anthropometry in apparel design. J. Text. Inst. 2019, 110, 1635–1643. [Google Scholar] [CrossRef]
  28. Galantucci, L.M.; Lavecchia, F.; Percoco, G. 3D Face measurement and scanning using digital close range photogrammetry: Evaluation of different solutions and experimental approaches. In Proceedings of the International Conference on 3D Body Scanning Technologies, Lugano, Switzerland, 19–20 October 2010; p. 52. [Google Scholar]
  29. Özyeşil, O.; Voroninski, V.; Basri, R.; Singer, A. A survey of structure from motion. Acta Numer. 2017, 26, 305–364. [Google Scholar] [CrossRef]
  30. Iglhaut, J.; Cabo, C.; Puliti, S.; Piermattei, L.; O’Connor, J.; Rosette, J. Structure from motion photogrammetry in forestry: A review. Curr. For. Rep. 2019, 5, 155–168. [Google Scholar] [CrossRef]
  31. Wei, Y.M.; Kang, L.; Yang, B.; Wu, L.D. Applications of structure from motion: A survey. J. Zhejiang Univ. Sci. 2013, 14, 486–494. [Google Scholar] [CrossRef]
  32. Westoby, M.J.; Brasington, J.; Glasser, N.F.; Hambrey, M.J.; Reynolds, J.M. ‘Structure-from-Motion’ photogrammetry: A low-cost, effective tool for geoscience applications. Geomorphology 2012, 179, 300–314. [Google Scholar] [CrossRef]
  33. Yao, G.; Huang, P.; Ai, H.; Zhang, C.; Zhang, J.; Zhang, C.; Wang, F. Matching wide-baseline stereo images with weak texture using the perspective invariant local feature transformer. J. Appl. Remote. Sens. 2022, 16, 036502. [Google Scholar] [CrossRef]
  34. Wei, L.; Huo, J. A Global fundamental matrix estimation method of planar motion based on inlier updating. Sensors 2022, 22, 4624. [Google Scholar] [CrossRef]
  35. Heymsfield, S.B.; Bourgeois, B.; Ng, B.K.; Sommer, M.J.; Li, X.; Shepherd, J.A. Digital anthropometry: A critical review. Eur. J. Clin. Nutr. 2018, 72, 680–687. [Google Scholar] [CrossRef]
  36. Calantropio, A.; Chiabrando, F.; Seymour, B.; Kovacs, E.; Lo, E.; Rissolo, D. Image pre-processing strategies for enhancing photogrammetric 3D reconstruction of underwater shipwreck datasets. Int. Arch. Photogramm. Remote. Sens. Spat. Inf. Sci. 2020, 43, 941–948. [Google Scholar] [CrossRef]
  37. Ballabeni, A.; Apollonio, F.I.; Gaiani, M.; Remondino, F. Advances in image pre-processing to improve automated 3D reconstruction. Int. Arch. Photogramm. Remote. Sens. Spat. Inf. Sci. 2015, 40, 315–323. [Google Scholar] [CrossRef]
  38. Gaiani, M.; Remondino, F.; Apollonio, F.I.; Ballabeni, A. An advanced pre-processing pipeline to improve automated photogrammetric reconstructions of architectural scenes. Remote. Sens. 2016, 8, 178. [Google Scholar] [CrossRef]
  39. Neyer, F.; Nocerino, E.; Grün, A. Image quality improvements in low-cost underwater photogrammetry. Int. Arch. Photogramm. Remote. Sens. Spat. Inf. Sci. 2019, 42, 135–142. [Google Scholar] [CrossRef]
  40. Li, Z.; Yuan, X.; Lam, K.W. Effects of JPEG compression on the accuracy of photogrammetric point determination. Photogramm. Eng. Remote. Sens. 2002, 68, 847–853. [Google Scholar]
  41. O’Connor, J. Impact of Image Quality on SfM Photogrammetry: Colour, Compression and Noise. Ph.D. Thesis, Kingston University, London, UK, September 2018. [Google Scholar]
  42. Song, F. Analysis of Image Quality Evaluation Technology of Photogrammetry and Remote Sensing Fusion. In Innovative Computing: Proceedings of the 4th International Conference on Innovative Computing (IC 2021); Springer: Berlin/Heidelberg, Germany, 2022; pp. 141–146. [Google Scholar]
  43. Saponaro, M.; Capolupo, A.; Tarantino, E.; Fratino, U. Comparative analysis of different UAV-based photogrammetric processes to improve product accuracies. In Computational Science and Its Applications – ICCSA 2019. Lecture Notes in Computer Science, Proceedings of the Computational Science and Its Applications–ICCSA 2019: 19th International Conference, Saint Petersburg, Russia, 1–4 July 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 225–238. [Google Scholar]
  44. Karantanellis, E.; Arav, R.; Dille, A.; Lippl, S.; Marsy, G.; Torresani, L.; Oude Elberink, S. Evaluating the quality of photogrammetric point-clouds in challenging geo-environments–a case study in an Alpine Valley. Int. Arch. Photogramm. Remote. Sens. Spat. Inf. Sci. 2020, 43, 1099–1105. [Google Scholar] [CrossRef]
  45. Ludwig, M.; Runge, C.M.; Friess, N.; Koch, T.L.; Richter, S.; Seyfried, S.; Wraase, L.; Lobo, A.; Sebastià, M.T.; Reudenbach, C.; et al. Quality assessment of photogrammetric methods—A workflow for reproducible UAS orthomosaics. Remote. Sens. 2020, 12, 3831. [Google Scholar] [CrossRef]
  46. Welch, R. Photogrammetric image evaluation techniques. Photogrammetria 1975, 31, 161–190. [Google Scholar] [CrossRef]
  47. Barbero-García, I.; Cabrelles, M.; Lerma, J.L.; Marqués-Mateu, Á. Smartphone-based close-range photogrammetric assessment of spherical objects. Photogramm. Rec. 2018, 33, 283–299. [Google Scholar] [CrossRef]
  48. Fawzy, H.E.D. The accuracy of mobile phone camera instead of high resolution camera in digital close range photogrammetry. Int. J. Civ. Eng. Technol. 2015, 6, 76–85. [Google Scholar]
  49. Barbero-García, I.; Lerma, J.L.; Marqués-Mateu, Á.; Miranda, P. Low-cost smartphone-based photogrammetry for the analysis of cranial deformation in infants. World Neurosurg. 2017, 102, 545–554. [Google Scholar] [CrossRef]
  50. Lerma, J.L.; Barbero-García, I.; Marqués-Mateu, Á.; Miranda, P. Smartphone-based video for 3D modelling: Application to infant’s cranial deformation analysis. Measurement 2018, 116, 299–306. [Google Scholar] [CrossRef]
  51. Mittal, A.; Moorthy, A.K.; Bovik, A.C. No-reference image quality assessment in the spatial domain. IEEE Trans. Image Process. 2012, 21, 4695–4708. [Google Scholar] [CrossRef] [PubMed]
  52. Mittal, A.; Soundararajan, R.; Bovik, A.C. Making a “completely blind” image quality analyzer. IEEE Signal Process. Lett. 2012, 20, 209–212. [Google Scholar] [CrossRef]
  53. Venkatanath, N.; Praneeth, D.; Bh, M.C.; Channappayya, S.S.; Medasani, S.S. Blind image quality evaluation using perception based features. In Proceedings of the 2015 Twenty First National Conference on Communications (NCC), Mumbai, India, 27 February–1 March 2015; pp. 1–6. [Google Scholar]
  54. Lowe, G. Sift-the scale invariant feature transform. Int. J 2004, 2, 2. [Google Scholar]
  55. Lingua, A.; Marenchino, D.; Nex, F. Performance analysis of the SIFT operator for automatic feature extraction and matching in photogrammetric applications. Sensors 2009, 9, 3745–3766. [Google Scholar] [CrossRef]
  56. Bay, H.; Ess, A.; Tuytelaars, T.; Van Gool, L. Speeded-up robust features (SURF). Comput. Vis. Image Underst. 2008, 110, 346–359. [Google Scholar] [CrossRef]
  57. Leutenegger, S.; Chli, M.; Siegwart, R.Y. BRISK: Binary robust invariant scalable keypoints. In Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2548–2555. [Google Scholar]
  58. Rublee, E.; Rabaud, V.; Konolige, K.; Bradski, G. ORB: An efficient alternative to SIFT or SURF. In Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2564–2571. [Google Scholar]
  59. Alcantarilla, P.F.; Bartoli, A.; Davison, A.J. KAZE features. In Computer Vision – ECCV 2012. Lecture Notes in Computer Science, Proceedings of the Computer Vision–ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy, 7–13 October 2012; Springer: Berlin/Heidelberg, Germany, 2012; pp. 214–227. [Google Scholar]
  60. Alahi, A.; Ortiz, R.; Vandergheynst, P. Freak: Fast retina keypoint. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 510–517. [Google Scholar]
  61. DeTone, D.; Malisiewicz, T.; Rabinovich, A. SuperPoint: Self-supervised interest point detection and description. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA, 18–23 June 2018; pp. 224–236.
  62. Petrakis, G.; Partsinevelos, P. Keypoint Detection and Description through Deep Learning in Unstructured Environments. Robotics 2023, 12, 137. [Google Scholar] [CrossRef]
  63. Georgiou, T.; Liu, Y.; Chen, W.; Lew, M. A survey of traditional and deep learning-based feature descriptors for high dimensional data in computer vision. Int. J. Multimed. Inf. Retr. 2020, 9, 135–170. [Google Scholar] [CrossRef]
  64. Fan, Y.; Mao, S.; Li, M.; Kang, J.; Li, B. LMFD: Lightweight multi-feature descriptors for image stitching. Sci. Rep. 2023, 13, 21162. [Google Scholar] [CrossRef]
  65. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.Y. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. Adv. Neural Inf. Process. Syst. 2017, 30, 3149–3157. [Google Scholar]
  66. Eltner, A.; Sofia, G. Structure from motion photogrammetric technique. In Developments in Earth Surface Processes; Elsevier: Amsterdam, The Netherlands, 2020; Volume 23, pp. 1–24. [Google Scholar]
  67. Urban, S.; Weinmann, M. Finding a good feature detector-descriptor combination for the 2D keypoint-based registration of TLS point clouds. ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci. 2015, 2, 121–128. [Google Scholar] [CrossRef]
  68. Wu, S.; Oerlemans, A.; Bakker, E.M.; Lew, M.S. A comprehensive evaluation of local detectors and descriptors. Signal Process. Image Commun. 2017, 59, 150–167. [Google Scholar] [CrossRef]
  69. Krig, S. Interest point detector and feature descriptor survey. Comput. Vis. Metrics Textb. Ed. 2016, 187–246. [Google Scholar]
  70. Marmol, A.; Peynot, T.; Eriksson, A.; Jaiprakash, A.; Roberts, J.; Crawford, R. Evaluation of keypoint detectors and descriptors in arthroscopic images for feature-based matching applications. IEEE Robot. Autom. Lett. 2017, 2, 2135–2142. [Google Scholar] [CrossRef]
  71. Kelman, A.; Sofka, M.; Stewart, C.V. Keypoint descriptors for matching across multiple image modalities and non-linear intensity variations. In Proceedings of the 2007 IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, MN, USA, 17–22 June 2007; pp. 1–7. [Google Scholar]
  72. Sharma, S.K.; Jain, K.; Shukla, A.K. A comparative analysis of feature detectors and descriptors for image stitching. Appl. Sci. 2023, 13, 6015. [Google Scholar] [CrossRef]
  73. Bojanić, D.; Bartol, K.; Pribanić, T.; Petković, T.; Donoso, Y.D.; Mas, J.S. On the comparison of classic and deep keypoint detector and descriptor methods. In Proceedings of the 2019 11th International Symposium on Image and Signal Processing and Analysis (ISPA), Dubrovnik, Croatia, 23–25 September 2019; pp. 64–69. [Google Scholar]
  74. Işık, Ş. A comparative evaluation of well-known feature detectors and descriptors. Int. J. Appl. Math. Electron. Comput. 2014, 3, 1–6. [Google Scholar] [CrossRef]
  75. Santos, A.; Ortiz de Solórzano, C.; Vaquero, J.J.; Pena, J.M.; Malpica, N.; del Pozo, F. Evaluation of autofocus functions in molecular cytogenetic analysis. J. Microsc. 1997, 188, 264–272. [Google Scholar] [CrossRef]
  76. Matuzevičius, D.; Serackis, A. Three-Dimensional Human Head Reconstruction Using Smartphone-Based Close-Range Video Photogrammetry. Appl. Sci. 2021, 12, 229. [Google Scholar] [CrossRef]
  77. Tamulionis, M.; Sledevič, T.; Abromavičius, V.; Kurpytė-Lipnickė, D.; Navakauskas, D.; Serackis, A.; Matuzevičius, D. Finding the least motion-blurred image by reusing early features of object detection network. Appl. Sci. 2023, 13, 1264. [Google Scholar] [CrossRef]
  78. Mittal, A.; Soundararajan, R.; Muralidhar, G.S.; Bovik, A.C.; Ghosh, J. Blind image quality assessment without training on human opinion scores. In Proceedings of the Human Vision and Electronic Imaging XVIII, Burlingame, CA, USA, 4–8 February 2013; Volume 8651, pp. 177–183. [Google Scholar]
  79. Mittal, A.; Soundararajan, R.; Bovik, A. Prediction of image naturalness and quality. J. Vis. 2013, 13, 1056. [Google Scholar] [CrossRef]
  80. Chan, R.W.; Goldsmith, P.B. A psychovisually-based image quality evaluator for JPEG images. In Proceedings of the SMC 2000 Conference Proceedings, 2000 IEEE International Conference on Systems, Man and Cybernetics: Cybernetics Evolving to Systems, Humans, Organizations, and Their Complex Interactions, Anchorage, AK, USA, 27 September 2000; Volume 2, pp. 1541–1546. [Google Scholar]
  81. Gabarda, S.; Cristóbal, G. Blind image quality assessment through anisotropy. J. Opt. Soc. Am. A Opt. Image Sci. Vis. 2007, 24, B42–B51. [Google Scholar] [CrossRef] [PubMed]
  82. Quattoni, A.; Torralba, A. Recognizing indoor scenes. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 413–420. [Google Scholar]
  83. Agustsson, E.; Timofte, R. NTIRE 2017 Challenge on Single Image Super-Resolution: Dataset and Study. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Honolulu, HI, USA, 21–26 July 2017. [Google Scholar]
  84. Ignatov, A.; Timofte, R.; Van Vu, T.; Minh Luu, T.; X Pham, T.; Van Nguyen, C.; Kim, Y.; Choi, J.S.; Kim, M.; Huang, J.; et al. PIRM challenge on perceptual image enhancement on smartphones: Report. In Proceedings of the European Conference on Computer Vision (ECCV) Workshops, Munich, Germany, 8–14 September 2018. [Google Scholar]
  85. Hosu, V.; Lin, H.; Sziranyi, T.; Saupe, D. KonIQ-10k: An Ecologically Valid Database for Deep Learning of Blind Image Quality Assessment. IEEE Trans. Image Process. 2020, 29, 4041–4056. [Google Scholar] [CrossRef] [PubMed]
  86. Lin, T.; Maire, M.; Belongie, S.J.; Bourdev, L.D.; Girshick, R.B.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. arXiv 2014, arXiv:1405.0312. [Google Scholar]
  87. Lim, B.; Son, S.; Kim, H.; Nah, S.; Lee, K.M. Enhanced Deep Residual Networks for Single Image Super-Resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 136–144. [Google Scholar]
Figure 1. Identifying the highest-quality images in a collection is both a necessity and a challenge in photogrammetry-based 3D object reconstruction. Collecting images for 3D reconstruction can produce redundant images, particularly when the object is photographed extensively or filmed. Selecting a subset of images reduces data redundancy and eliminates lower-quality images, thereby speeding up the reconstruction process and improving the accuracy of the resulting model.
Figure 2. Dependence of key point localization error on the image degradation level. Image degradation is simulated using motion blur followed by the addition of noise. All key point detectors (SURF (a), SIFT (b), BRISK (c), ORB (d), KAZE (e), FAST (f), and SuperPoint (g)) show increasing localization error as the motion blur strengthens. Differently colored lines (150 in total) represent the localization error dynamics of individual images. This simulation demonstrates the importance of finding the highest-quality images for photogrammetric reconstruction and lays the foundation for objectively evaluating image quality through key point localization errors. Therefore, key point localization error can be used to rank augmented images according to their degradation level.
Figure 3. Dependence of key point localization error on the image degradation level. Image degradation is simulated using defocus blur followed by the addition of noise. All key point detectors (SURF (a), SIFT (b), BRISK (c), ORB (d), KAZE (e), FAST (f), and SuperPoint (g)) show increasing localization error as the defocus blur strengthens. Differently colored lines (150 in total) represent the localization error dynamics of individual images.
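The degradation-and-error simulation summarized in Figures 2 and 3 can be sketched in a few lines. The snippet below is a minimal illustration rather than the authors' implementation: it assumes OpenCV's ORB detector, a horizontal motion blur kernel, additive Gaussian noise, nearest-neighbour matching of key point coordinates, and a hypothetical input file sample.jpg.

```python
# Minimal sketch: key point localization error under simulated degradation (assumptions noted above).
import cv2
import numpy as np

def degrade(img, blur_len=9, noise_sigma=5.0):
    # Horizontal motion blur followed by additive Gaussian noise.
    kernel = np.zeros((blur_len, blur_len), np.float32)
    kernel[blur_len // 2, :] = 1.0 / blur_len
    blurred = cv2.filter2D(img, -1, kernel)
    noisy = blurred.astype(np.float32) + np.random.normal(0.0, noise_sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def localization_error(reference, degraded, detector):
    # Mean distance from each reference key point to its nearest key point in the degraded image.
    kp_ref = detector.detect(reference, None)
    kp_deg = detector.detect(degraded, None)
    if not kp_ref or not kp_deg:
        return float("nan")
    pts_ref = np.array([k.pt for k in kp_ref], np.float32)
    pts_deg = np.array([k.pt for k in kp_deg], np.float32)
    dists = np.linalg.norm(pts_ref[:, None, :] - pts_deg[None, :, :], axis=2)
    return float(dists.min(axis=1).mean())

img = cv2.imread("sample.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical file
orb = cv2.ORB_create(nfeatures=500)
for blur_len in (3, 9, 15, 21):
    err = localization_error(img, degrade(img, blur_len), orb)
    print(f"motion blur {blur_len} px -> mean localization error {err:.2f} px")
```

Curves of this error over increasing blur strength (the colored lines in Figures 2 and 3) provide the objective ranking of the degraded variants that is later used as ground truth.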
Figure 4. Overview of the generalized sparse 3D reconstruction workflow, including the introduced image quality evaluation step (5th block). Evaluating image quality and ranking the closest images according to their quality allow selecting a subset of images, reducing data redundancy, eliminating low-quality images, accelerating the reconstruction process, and improving the accuracy of the reconstruction result. Descriptor-based image quality evaluation is performed after feature descriptors are extracted. The quality information can be used to select the highest-quality images from clusters of similar images or from consecutive video frames (6th block), or after approximate camera locations are computed in the iterative structure-from-motion step (8th block).
Figure 5. Workflow chart for selecting the highest-quality image from a subset of images. This workflow elaborates on the descriptor-based image quality evaluation (5th step) of the workflow presented in Figure 4.
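The selection step itself is simple once every image has a quality score: keep the top-scoring image from each cluster of similar images (or from each run of consecutive video frames). A minimal sketch with hypothetical file names and scores:

```python
# Minimal sketch: keep the highest-scoring image per cluster (file names and scores are illustrative).
def select_best(cluster_scores):
    # cluster_scores: {cluster_id: [(image_path, quality_score), ...]}
    return {cid: max(items, key=lambda item: item[1])[0] for cid, items in cluster_scores.items()}

clusters = {
    0: [("frame_001.jpg", 0.42), ("frame_002.jpg", 0.61), ("frame_003.jpg", 0.55)],
    1: [("frame_010.jpg", 0.71), ("frame_011.jpg", 0.64)],
}
print(select_best(clusters))  # {0: 'frame_002.jpg', 1: 'frame_010.jpg'}
```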
Figure 6. Construction of the training (a) and test (b) datasets by augmenting original images. Here, I_0 is the original image; I_1, I_2, I_3, ..., I_N are images augmented using N different image augmentation transforms f_i, leading to the degraded images I_i, respectively (the transforms include motion blur or defocus blur of various strengths followed by corruption with Gaussian noise); K_i are the sets of extracted key points; E_i are the key point localization errors caused by image quality degradation. The errors are used to derive an objective ranking r_i of the image degradations. A fixed number of the strongest key points is selected to extract descriptors that serve as input features for the ranker model. The training set (a) is constructed by degrading images with a set of predefined degrading transforms f_i. The test set (b) is constructed by degrading each original image with two randomly selected transforms f_1 and f_2.
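The ranker training implied by Figure 6 can be illustrated with LightGBM's ranking interface [65]. The sketch below uses synthetic stand-in data and one plausible arrangement of the inputs (descriptors stacked row-wise, relevance labels derived from the degradation ranking, and groups covering the degraded variants of one original image); the paper's exact feature construction may differ.

```python
# Minimal sketch: LightGBM ranker over key point descriptors (synthetic stand-in data).
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 128))      # stand-in for stacked 128-D descriptors (e.g., SIFT)
y = np.repeat(np.arange(6), 100)     # relevance label per descriptor: higher = less degraded image
group = [100] * 6                    # six degraded variants of one original image, 100 descriptors each

ranker = lgb.LGBMRanker(objective="lambdarank", n_estimators=200)
ranker.fit(X, y, group=group)

# At inference, average the per-descriptor scores to obtain an image-level quality score.
image_score = ranker.predict(X[:100]).mean()
print(f"quality score of first variant: {image_score:.3f}")
```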
Figure 7. Graphical summary of the experimental results of the method comparison. The results are the percentages of correct selections of the higher-quality image from a pair of images. The color of the bars represents the dataset from which the test results came. A solid line above the bars indicates the overall accuracy of the method. The names of the methods are shown on the horizontal axis. The suffixes “64” and “128” after a descriptor name denote the dimensionality of the descriptor. The proposed key-point-descriptor-based image quality evaluation methods are named by the descriptor name and the “ranker” suffix.
Table 1. Summary of the feature descriptors investigated in this research for their suitability to provide information useful for predicting image quality.
Descriptor | Dimensionality | Precision
SURF | 64/128 | single/float
SIFT | 128 | 8-bit unsigned integer (uint8)
BRISK | 64 | binary, stored in uint8 container
ORB (Rotated BRIEF) | 32 | binary, stored in uint8 container
KAZE | 64/128 | single/float
FREAK | 64 | binary, stored in uint8 container
SuperPoint | 256 | single/float
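Most of the dimensionalities listed in Table 1 can be checked directly with OpenCV (SURF and FREAK need the opencv-contrib build, and SuperPoint is a separate network, so they are omitted below). Reported precision depends on the implementation; OpenCV, for instance, returns SIFT descriptors as float32 even though their values fit into uint8. A minimal sketch, assuming a hypothetical input image sample.jpg:

```python
# Minimal sketch: inspect descriptor dimensionality and storage type with OpenCV.
import cv2

img = cv2.imread("sample.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical file
extractors = {
    "SIFT": cv2.SIFT_create(),
    "ORB": cv2.ORB_create(),
    "BRISK": cv2.BRISK_create(),
    "KAZE-64": cv2.KAZE_create(extended=False),
    "KAZE-128": cv2.KAZE_create(extended=True),
}
for name, extractor in extractors.items():
    keypoints, descriptors = extractor.detectAndCompute(img, None)
    print(f"{name}: {descriptors.shape[1]}-D, dtype {descriptors.dtype}")
```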
Table 2. Summary of the datasets used in this research for training and testing. The number of samples indicates the number of images selected for use in the experiments.
Dataset | No. of Samples | Resolution
– For training:
ISR [82] | 3318 | (501–4288) × (209–3176)
– For testing:
ISR | 1106 × 10 | (501–4288) × (209–3176)
DIV2K [83,84] | 800 | 2024 × (648–2040)
KonIQ-10k IQA [85] | 10,373 | 1024 × 768
COCO val2017 [86] | 4231 | (520–640) × (159–520)
Flickr2K [87] | 2650 | (1536–2040) × (816–2040)
Table 3. Results of the method comparison for selecting the higher-quality image from a pair of images. Values in the table are the percentages of correct selections. The first five methods (1–5) are baseline image quality evaluation methods; the remaining methods (6–14) are the proposed key-point-descriptor-based image quality evaluation methods. The suffixes “64” and “128” after a descriptor name denote the dimensionality of the descriptor.
All values are % of correct selections of the better-quality image from a pair.
Method | ISR (Test) | DIV2K | KonIQ-10k IQA | COCO | Flickr2K | Overall
1. Sharpness | 78.0 | 78.4 | 78.2 | 77.5 | 76.0 | 77.6
2. BRISQUE | 73.8 | 73.6 | 72.2 | 73.1 | 72.6 | 73.1
3. NIQE | 76.4 | 77.7 | 76.5 | 77.1 | 79.6 | 77.5
4. PIQE | 73.0 | 74.5 | 73.7 | 73.7 | 78.9 | 74.8
5. BIQAA | 65.8 | 64.8 | 65.1 | 63.2 | 60.5 | 63.9
6. SURF64-ranker | 84.1 | 83.5 | 84.7 | 84.8 | 82.4 | 83.9
7. SURF128-ranker | 82.9 | 82.9 | 83.1 | 83.1 | 81.5 | 82.7
8. SIFT-ranker | 86.0 | 85.3 | 84.4 | 87.3 | 85.1 | 85.6
9. BRISK-ranker | 79.1 | 79.3 | 78.3 | 81.2 | 76.3 | 78.8
10. ORB-ranker | 77.1 | 78.9 | 78.6 | 79.1 | 76.1 | 78.2
11. KAZE64-ranker | 85.1 | 85.6 | 85.6 | 85.5 | 83.4 | 85.0
12. KAZE128-ranker | 85.4 | 85.8 | 87.2 | 84.7 | 84.7 | 85.6
13. FREAK-ranker | 69.8 | 73.2 | 70.7 | 69.4 | 67.8 | 70.2
14. SuperPoint-ranker | 82.2 | 81.8 | 80.4 | 77.4 | 79.4 | 80.2
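The values in Table 3 (and Figure 7) are pairwise selection accuracies: for each pair of differently degraded versions of the same image, a method is credited when it assigns the higher quality score to the less degraded version. A minimal sketch of the metric on synthetic scores:

```python
# Minimal sketch: pairwise selection accuracy (synthetic scores).
import numpy as np

def pairwise_accuracy(scores_a, scores_b, a_is_better):
    # scores_a/scores_b: predicted quality scores of the two images in each pair;
    # a_is_better: ground-truth flags derived from the degradation ranking.
    correct = (np.asarray(scores_a) > np.asarray(scores_b)) == np.asarray(a_is_better)
    return 100.0 * correct.mean()

rng = np.random.default_rng(1)
scores_a, scores_b = rng.normal(size=200), rng.normal(size=200)
truth = rng.integers(0, 2, size=200).astype(bool)
print(f"{pairwise_accuracy(scores_a, scores_b, truth):.1f}% correct selections")
```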