Article

An Open Image Resizing Framework for Remote Sensing Applications and Beyond

by Donatella Occorsio 1,†, Giuliana Ramella 2,*,† and Woula Themistoclakis 2,†
1
Department of Mathematics and Computer Science, University of Basilicata, Viale dell’Ateneo Lucano 10, 85100 Potenza, Italy
2
National Research Council (C.N.R.), Institute for Applied Computing “Mauro Picone”, Via P. Castellino, 111, 80131 Naples, Italy
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Remote Sens. 2023, 15(16), 4039; https://doi.org/10.3390/rs15164039
Submission received: 2 August 2023 / Revised: 8 August 2023 / Accepted: 8 August 2023 / Published: 15 August 2023

Abstract

Image resizing (IR) has a crucial role in remote sensing (RS), since an image’s level of detail depends on the spatial resolution of the acquisition sensor, its design limitations, and other factors such as (a) the weather conditions, (b) the lighting, and (c) the distance between the satellite platform and the ground targets. In this paper, we assessed some recent IR methods for RS applications (RSAs) by proposing a useful open framework to study, develop, and compare them. The proposed framework can manage any kind of color image and was instantiated as a Matlab package made freely available on GitHub. Here, we employed it to perform extensive experiments across multiple public RS image datasets and two new datasets included in the framework to evaluate, qualitatively and quantitatively, the performance of each method in terms of image quality and statistical measures.

1. Introduction

Remote sensing (RS) technology plays a crucial role in many fields, since it provides an efficient way to acquire, detect, analyze, and monitor, in real time, a wide variety of information on the physical characteristics of an object or area without any physical contact with it. Specifically, geoscience is one of the major fields in which RS technology is used to quantitatively and qualitatively study weather, forestry, agriculture, surface changes, biodiversity, and so on. Geoscience applications extend far beyond mere data collection, aiming to inform international policies through, for instance, environmental monitoring, catastrophe prediction, and resource investigation.
The source of RS data is the electromagnetic radiation reflected/emitted by an object. This radiation is received by a sensor on an RS platform (towers/cranes at the ground level, helicopters/aircraft at the aerial level, and space shuttles/satellites at the space-borne level) and is converted into a signal that can be recorded and displayed in different formats (optical, infrared, radar, microwave, acoustic, or visual) according to the processing purpose. Different RS systems have been proposed, corresponding to each data source type [1,2,3,4].
In this paper, as a data source, we considered the RS visual images employed in RS applications (RSAs), referred to in the following simply as RS images. Specifically, we were interested in processing visual information in the same way as the human visual system (HVS), i.e., elaborating the information carried by visible light, a part of the electromagnetic spectrum, according to human perceptual laws and capabilities. In this framework, we were concerned with the notion of scale representing, as reported in [5,6,7,8,9], “the window of visual perception and the observation capability reflecting knowledge limitation through which a visible phenomenon can be viewed and analyzed”. Indeed, since the objects in RS images usually have different scales, one of the most critical tasks is effectively resizing remotely sensed images while preserving the visual information. This task, usually termed image resizing (IR) for RSAs, has attracted huge interest and become a hot research topic, since IR can introduce scale effects that constrain the accuracy and efficiency of RSAs.
Over the past few decades, the IR problem has been extensively studied [10,11,12,13]. It is still a popular research field distinguished by many applications in various domains, including RSAs [14,15,16,17,18]. IR can be carried out in an up or down direction, respectively denoted as upscaling and downscaling. Upscaling is a refinement process in which the size of the low-resolution (LR) input image is increased to regain the high-resolution (HR) target image. Conversely, downscaling is a compression process by which the size of the HR input image is reduced to recover the LR target image. In the literature, downscaling and upscaling are often considered separately, so most existing methods specialize in only one direction, sometimes for a limited range of scaling factors. IR methods can be evaluated in supervised or unsupervised mode, depending on whether a target image is available.
Traditionally, IR methods are classified into two categories: non-adaptive and adaptive [19,20,21]. In the first category, which includes the well-known interpolation methods, all image pixels are processed equally. In the second category, which includes machine learning (ML) methods, suitable pixel changes are selectively arranged to optimize the resized image quality. Usually, non-adaptive methods [22,23] introduce blurring or artifacts, while adaptive methods [24,25] are more expensive but provide superior visual quality and preserve high-frequency components. In particular, ML methods ensure high-quality results but require extensive training based on many labeled training images and parameters.
From a methodological point of view, the above considerations also hold for IR methods specifically designed for RSAs [26,27,28,29,30,31,32,33,34,35]. However, most of these methods concern either upscaling [26,27,28,29,30,31] or, to a lesser extent, downscaling [32,33,34,35], despite both being necessary and having equal levels of applicability (see Section 2.1). In addition, the number of IR methods for RSAs that can perform both upscaling and downscaling is very low.
Overall, researchers have developed a significant number of IR methods over the years, although a fair comparison of competing methods promoting reproducible research is missing. To fill this gap and promote the development of new IR methods and their experimental analysis, we proposed a useful framework for studying, developing, and comparing such methods. The framework was conceived to apply this analysis to multiple datasets and to evaluate quantitatively and qualitatively the performance of each method in terms of image quality and statistical measures. In its current form, the framework was designed to consider some IR methods not specifically proposed for RSAs (see Section 2.1), with the intent of evaluating them for this specific application area. However, the framework was made open-access, so that all authors who wish to make their code available can have their methods included and evaluated. Beyond being useful in ranking the considered benchmark methods, the framework is a valuable tool for making design choices and comparing the performance of IR methods. We are confident that this framework holds the potential to bring significant benefits to research endeavors in IR/RSAs.
The framework was instantiated as a Matlab package made freely available on GitHub to support experimental work within the field. A peculiarity of the proposed framework is that it can be used for any type of color image and can be generalized by including IR methods for 3D images. For more rationale and technical background details, see Section 2.2.
To our knowledge, the proposed framework represents a novelty within IR methods for RSAs. Precisely, the main contributions of this paper can be summarized as follows:
  • A platform for testing and evaluating IR methods for RSAs applied to multiple RS image datasets was made publicly available. It provided a framework where the performance of each method could be evaluated in terms of image quality and statistical measures.
  • Two RS image datasets with suitable features for testing are provided (see below and Section 2.2.2).
  • The ranking of the benchmark methods and the evaluation of their performance for RSAs were carried out.
  • Openness and robustness were guaranteed, since it was possible to include other IR methods and evaluate their performance qualitatively and quantitatively.
Using the framework, we analyzed six IR methods, briefly denoted as BIC [36], DPID [37], L0 [38], LCI [20], VPI [21], and SCN [39] (see Section 2.1). These methods were selected to provide a set of methodologically representative methods. According to the above remarks, an extensive review of all IR methods was outside of this paper’s scope.
Experiments were carried out on six datasets in total (see Section 2.2.2). Four datasets are extensively employed in RSAs, namely AID_cat [40], NWPU VHR-10 [41], UCAS_AOD [42], and UCMerced_LandUse [43]. The remaining two datasets, available on GitHub, comprise images that we extracted employing Google Earth, namely GE100-DVD and GE100-HDTV, referred to below as GDVD and GHDTV (see Section 2.2.2). We quantitatively evaluated the performance of each method in terms of the full-reference quality assessment (FRQA) and no-reference quality assessment (NRQA) measures, respectively, in supervised and unsupervised mode (see Section 2.2.1).
The proposed open framework provided the possibility of tuning and evaluating IR methods to obtain relevant results in terms of image quality and statistical measures. The experimental results confirmed the performance trends already highlighted in [21] for RSAs, showing significant statistical differences among the various IR benchmark methods, as well as in the visual quality they could attain in RS image processing. A deep analysis of the quality measures and CPU times showed that, for RSAs, VPI and LCI presented, on average, adequate and competitive performances, with experimental values generally better and/or more stable than those of the benchmark methods. Moreover, VPI and LCI, besides being much faster than the methods performing only downscaling or upscaling, had no implementation limitations and could be run in an acceptable CPU time on large images and for large scale factors (see Section 4).
The paper is organized as follows: Section 2 describes the benchmark methods (see Section 2.1), the rationale and technical background (see Section 2.2), and the proposed framework (see Section 2.3). Section 3 reports a comprehensive comparison in quantitative and qualitative terms over multiple datasets. Section 4 provides a discussion and draws some conclusions.

2. Materials and Methods

This section is organized into several subsections, as follows: First, a short review of several methods available in the literature, which serve as the benchmark methods, is presented (see Section 2.1). This subsection will help the reader to fully understand the main guidelines followed in the literature and the benchmark methods’ main features. Following this, the rationale and technical background necessary to realize the proposed framework are provided (see Section 2.2). This section also contains a description of the quality measures (see Section 2.2.1) and benchmark datasets (see Section 2.2.2) employed herein. Finally, the proposed framework is described, outlining its peculiarities, in Section 2.3. All materials, including the new benchmark datasets and the framework, are publicly available to the reader (refer to the GitHub link in the Data Availability Statement).

2.1. Benchmark Methods

In this subsection, we outline the benchmark methods considered in the validation phase (see Section 3), denoted as BIC [36], DPID [37], L0 [38], LCI [20], VPI [21], and SCN [39]. Except for BIC, LCI, and VPI, these methods were developed and tested by considering the IR problem in one scaling mode, i.e., the downscaling (DPID and L0) or upscaling mode (SCN). In the following, when necessary (for LCI and VPI), we distinguish the upscaling and downscaling modes using the notations u-LCI/u-VPI and d-LCI/d-VPI, respectively.
  • BIC (downscaling/upscaling)
    Bicubic interpolation (BIC), the most popular IR method, employs piecewise bicubic interpolation. The value of each final pixel is a weighted average of the pixel values in the nearest 4 × 4 neighborhood. BIC produces sharper images than other non-adaptive classical methods, such as bilinear and nearest-neighbor interpolation, offering a favorable trade-off between image quality and processing time. BIC was designed to perform scaling by selecting either the size of the resized image or the scale factor (see the first sketch following this list).
  • DPID (downscaling)
    The detail-preserving image downscaling (DPID) method employs adaptive low-pass filtering and a Laplacian edge detector to approximate the HVS behavior. The idea is to preserve details in the downscaling process by assigning larger filter weights to pixels that differ more from their local neighborhood. DPID was designed to perform scaling by selecting the size of the resized image.
  • L0 (downscaling)
    The L0-regularized image downscaling method (L0) is an optimization framework for image downscaling. It focuses on two critical issues: salient feature preservation and downscaled image construction. For this purpose, it introduces two L0-regularized priors. The first, based on the gradient ratio, allows for preserving the most salient edges and the visual perceptual properties of the original image. The second optimizes the downscaled image with the guidance of the original image, avoiding undesirable artifacts. The two L0-regularized priors are applied iteratively until the objective function converges. L0 was designed to perform scaling by selecting the scale factor.
  • LCI (downscaling/upscaling)
    The Lagrange–Chebyshev interpolation (LCI) method falls into the class of interpolation methods. Usually, interpolation methods are based on the piecewise interpolation of the initial pixels, and they traditionally use uniform grids of nodes. On the contrary, in LCI, the input image is globally approximated by employing the bivariate Lagrange interpolating polynomial at a suitable grid of first-kind Chebyshev zeros (see the second sketch following this list). LCI was designed to perform scaling by selecting the size of the resized image or the scale factor.
  • VPI (downscaling/upscaling)
    The VPI method generalizes, to some extent, the LCI method. It employs an interpolation polynomial [44] based on an adjustable de la Vallée–Poussin (VP)-type filter. The resized image is suitably selected by modulating a free parameter and fixing the number of interpolation nodes. VPI was designed to perform scaling by selecting the size of the resized image or the scale factor.
  • SCN (upscaling)
    The sparse-coding-based network (SCN) method adopts a neural network based on sparse coding, trained in a cascaded structure from end to end. It introduces some improvements in terms of both recovery accuracy and human perception by employing a CNN (convolutional neural network) model. SCN was designed to perform scaling by selecting the scale factor.
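As a first illustration, the following minimal Matlab sketch performs BIC resizing both by scale factor and by target size via the built-in imresize function (the same function used for BIC in Section 3); the file name is a placeholder.

```matlab
I = imread('rs_scene.png');                  % hypothetical input image

% BIC by scale factor: upscaling (x2) and downscaling (:2)
up   = imresize(I, 2,   'bicubic');
down = imresize(I, 0.5, 'bicubic');

% BIC by target size, as used in supervised mode (size taken from the ground truth)
target = imresize(I, [1080 1920], 'bicubic');
```

As a second illustration, a one-line sketch of the non-uniform grid underlying LCI: the zeros of the n-th first-kind Chebyshev polynomial on [−1, 1], which LCI uses (in bivariate form) in place of a uniform pixel grid.

```matlab
n = 512;  k = 1:n;
x = cos((2*k - 1) * pi / (2*n));   % first-kind Chebyshev zeros on [-1, 1]
```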
In Table 1, the main features of the benchmark methods are reported. Note that various other datasets, often employed in other image analysis tasks (e.g., color quantization [45,46,47,48,49,50,51] and image segmentation [52,53,54,55]) were employed in [21] to evaluate the performance of the benchmark methods. Moreover, some of the benchmark methods had limitations that were either overcome without altering the method itself too much or resulted in their inability to be used in some experiments (see Section 2.3 and Section 3).

2.2. Rationale and Technical Background

This section aims to present the rationale and the technical background on which the study was based, highlighting the evaluation process’s main constraints, challenges, and open problems. Further, we introduce the mandatory problem of the method’s effectiveness for RSAs. These aspects are listed below.
  1. An adequate benchmark dataset suitable for testing IR methods is lacking in general. Indeed, experiments have generally been conducted on public datasets not designed for IR assessment, since the employed datasets, although freely available, do not contain both input and target images. This has significantly limited the quantitative evaluation process in supervised mode and prevented a fair comparison. To our knowledge, DIV2k (DIVerse 2k) is the only dataset containing both kinds of images [56]. For a performance evaluation of the benchmark methods on DIV2k, see [21]. This research gap is even more prominent for RS images, due to the nature of their application.
  2. Since the performance of a method on a single dataset reflects its bias in relation to that dataset, running the same method on different datasets usually produces remarkably different experimental results. Thus, an adequate evaluation process should be based on multiple datasets.
  3. Performance assessments performed on an empirical basis do not provide a fair comparison. In addition, a correct experimental analysis should be statistically sound and reliable. Thus, an in-depth statistical and numerical evaluation is essential.
  4. A benchmark framework for the IR assessment of real-world and RS images is missing in the literature. In particular, as mentioned in Section 1, this research gap has a greater impact in the case of RSAs due to the crucial role of IR in relation to the acquisition sensor and factors connected to weather conditions, lighting, and the distance between the satellite platform and the ground targets.
As mentioned in Section 1, due to the importance of a fair method comparison and promoting reproducible research, we proposed a useful open framework for studying, developing, and comparing benchmark methods. This framework allowed us to address issues 1–3, considering the IR problem in relation to RSAs and extending the analysis performed in [21]. To assess the specific case of RS, in the validation process, we used some of the RS image datasets that are commonly employed in the literature, and we also generated a specific RS image dataset with features more suitable to quantitative analysis (see Section 2.2.2). The framework was employed here to assess IR methods for RS images, but it could be used for any type of color image [57] and generalized for 3D images [58].

2.2.1. Quality Metrics

  • Supervised quality measures
    As usual, when a target image was available, we quantitatively evaluated the performance of each method in terms of the following full-reference quality assessment (FRQA) measures, which provide a “dissimilarity rate” between the target resized image and the output resized image: the peak signal-to-noise ratio (PSNR) and the structural similarity index measure (SSIM). The definition of the PSNR is based on the mean squared error (MSE) between two images and is extended to color digital images [59] following two different methods [20,21,60]. The first method is based on the properties of the human eye, which is very sensitive to luma information. Consequently, the PSNR for color images is computed by converting the image to the YCbCr color space; separating the intensity Y (luma) channel, which represents a weighted average of the components R, G, and B; and computing the PSNR only for the luma component according to its definition for a single component. In the second method, the PSNR is the average of the PSNR values computed for each image channel. In our experiments, the PSNR was calculated using both methods. However, since the two methods did not produce significantly different results, for brevity, we report only the values obtained by the first method in this paper. A greater PSNR value (in decibels) indicates better image quality.
    For an RGB color image, the SSIM is computed by converting it to the YCbCr color space and applying its definition to the intensity Y channel [61]. The resultant SSIM index is a decimal value between −1 and 1, where 0 indicates no similarity, 1 indicates perfect similarity, and −1 indicates perfect anti-correlation. More details can be found in [20,21,60].
  • Unsupervised quality measures
    When the target image was not available, we quantitatively evaluated the performance of each method in terms of the following no-reference quality assessment (NRQA) measures: the Natural Image Quality Evaluator (NIQE) [62], Blind/Referenceless Image Spatial QUality Evaluator (BRISQUE) [63,64], and Perception-based Image Quality Evaluator (PIQE) [65,66], using a default model computed from images of natural scenes.
    The NIQE involves constructing a quality-aware collection of statistical features based on a simple and successful space-domain natural scene statistic (NSS) model. These features are derived from a corpus of natural and undistorted images and are modeled as multidimensional Gaussian distributions. The NIQE measures the distance between the NSS-based features calculated from the image under consideration to the features obtained from an image database used to train the model.
    The BRISQUE does not calculate the distortion-specific features (e.g., ringing, blur, or blocking). It uses the scene statistics of locally normalized luminance coefficients to quantify possible losses of “naturalness” in the image due to distortions.
    The PIQE assesses distortion for blocks and determines the local variance of perceptibly distorted blocks to compute the image quality.
    The output scores of the three functions all fall within the range [0, 100]: the lower the score, the higher the perceptual quality (see the sketch after this list).
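A minimal Matlab sketch of both kinds of measures, using the corresponding Image Processing Toolbox functions (psnr, ssim, niqe, brisque, piqe); the framework’s own wrappers may differ, and the file names are placeholders.

```matlab
out = imread('resized.png');            % output resized image (hypothetical file)
gt  = imread('target.png');             % target image, available in supervised mode only

% FRQA (supervised): PSNR and SSIM on the luma (Y) channel, per the first method above
Yout = rgb2ycbcr(out);  Ygt = rgb2ycbcr(gt);
p = psnr(Yout(:,:,1), Ygt(:,:,1));      % higher is better (dB)
s = ssim(Yout(:,:,1), Ygt(:,:,1));      % in [-1, 1], 1 = perfect similarity

% NRQA (unsupervised): lower scores indicate higher perceptual quality
n = niqe(out);  b = brisque(out);  q = piqe(out);
```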

2.2.2. Benchmark Datasets

The multidataset analysis included four datasets widely utilized in RSAs and possessing different features, namely AID [40], NWPU VHR-10 [41], UCAS_AOD [42], and UCMerced_LandUse [43]. Moreover, we employed two datasets comprising images we extracted from Google Earth, namely GDVD and GHDTV. All datasets, representing 6850 color images in total, are briefly described in the following list.
  • AID
    The Aerial Image Dataset (AID), proposed in [67] and available at [40], was designed for method performance evaluation using aerial scene images. It contains 30 different scene classes, or categories (“airport”, “bare land”, “baseball field”, “beach”, “bridge”, “center”, “church”, “commercial”, “dense residential”, “desert”, “farmland”, “forest”, “industrial”, “meadow”, “medium residential”, “mountain”, “park”, “parking”, “playground”, “pond”, “port”, “railway station”, “resort”, “river”, “school”, “sparse residential”, “square”, “stadium”, “storage tanks”, and “viaduct”), with about 200 to 400 samples of size 600 × 600 in each class. The images were collected from Google Earth and post-processed as RGB renderings from the original aerial images. The images are multisource, since, in Google Earth, they were acquired from different remote imaging sensors. Moreover, each class’s sample images were carefully chosen from several countries and regions worldwide, mainly in the United States, China, England, France, Italy, Japan, and Germany. These images were captured at different times and seasons under disparate imaging conditions, with the aim of increasing the data’s intra-class diversity. The images of the categories “beach”, “forest”, “parking”, and “sparse residential” were considered altogether and are denoted as AID_cat in this paper. Note that these images and the images belonging to the same categories in UCAS_AOD were also considered altogether in Section 3.2.3.
  • GDVD and GHDTV
    Google Earth 100 Images—DVD (GDVD) and Google Earth 100 Images—HDTV (GHDTV) are datasets included with the proposed framework and publicly available for method performance evaluation using Google Earth aerial scene images. The GDVD and GHDTV datasets each contain 100 images generated by collecting the same scene in the two formats: 852 × 480 (DVD) and 1920 × 1080 (HDTV). These two size formats were chosen based on the following considerations: Firstly, these are standard formats that are widely used in practice, and, in particular, the HDTV format was large enough to allow scaling operations to be performed at high scaling factors. Secondly, each image dimension was a multiple of 2, 3, or 4, so resizing operations could be performed with all benchmark methods. Thirdly, since the corresponding images of the two datasets were acquired from the same scene with different resolutions for a specific, non-integer scale factor, these datasets could be considered interchangeably as containing the target and input images. Each image in the two datasets presents a diversity of objects and reliable quality.
  • NWPU VHR-10
    The image dataset NWPU VHR-10 (NWV) proposed in [68,69,70] is publicly available at [41] for research purposes only. It contains images with geospatial objects belonging to the following ten classes: airplanes, ships, storage tanks, baseball diamonds, tennis courts, basketball courts, ground track fields, harbors, bridges, and vehicles. This dataset contains in total 800 very-high-resolution (VHR) remote sensing color images, with 715 color images acquired from Google Earth and 85 pan-sharpened color infrared images from Vaihingen data. The images of this dataset were originally divided into four different sets: a “positive image set” containing 150 images, a “negative image set” containing 150 images, a “testing set” containing 350 images, and an “optimizing set” containing 150 images. The images of this dataset were considered altogether in this paper.
  • UCAS_AOD
    The image dataset UCAS_AOD (UCA) proposed in [71] (available at [42]) contains RS aerial color images collected from Google Earth, including two kinds of targets, automobile and aircraft, and negative background samples. The images of this dataset were considered both altogether and divided into certain categories (see Section 3.2.3) in this paper.
  • UCMerced_LandUse
    The image dataset UCMerced_LandUse (UCML), proposed in [72] and available at [43] for research purposes, contains 21 classes of land use images. Each class contains 100 images with a size of 256 × 256, manually extracted from larger images of the USGS National Map Urban Area Imagery collection, framing various urban areas around the United States. The images of this dataset were considered altogether in this paper.
In Table 2, the main features of all datasets are reported. Each stand-alone dataset contained images that could be considered the target image starting from a given input image.
In order to test the benchmark methods in supervised mode, we needed to generate the input image to apply the chosen resizing method. To this end, we followed a practice established in the literature and often adopted by other authors, i.e., we rescaled the target image by BIC and used it as an input image in most cases in the framework validation (see Section 3.1). However, in Section 3.1.3, we also used the benchmark methods to generate the input image with the aim of studying the input image dependency. To discriminate how the input images were generated, we include the acronym of the resizing method used for their generation when referring to the images. For instance, “BIC input image” indicates the input image generated by BIC.
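To make this practice concrete, the following hedged sketch generates a “BIC input image” from a ground-truth image and then evaluates a resizing step against the ground truth; the file name is a placeholder, and imresize here stands in for any benchmark method under test.

```matlab
gt = imread('GT_image/scene.png');       % ground-truth target image (hypothetical file)
s  = 2;                                  % scale factor

% Generate the input by rescaling the target with BIC ("BIC input image")
in = imresize(gt, s, 'bicubic');

% Apply a resizing method back to the ground-truth size and measure luma PSNR
out = imresize(in, [size(gt,1) size(gt,2)], 'bicubic');   % stand-in for a benchmark method
Yo = rgb2ycbcr(out);  Yg = rgb2ycbcr(gt);
fprintf('Luma PSNR: %.2f dB\n', psnr(Yo(:,:,1), Yg(:,:,1)));
```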
In unsupervised mode, besides the above datasets, we also tested the benchmark methods according to four categories: beach, forest, parking, and sparse residential. To this end, we fused the corresponding category images of AID and UCA to generate the subsets, indicated in the following as AU_Beach (500 images), AU_Forest (350 images), AU_Parking (390 images), and AU_SparseRes (400 images) (see Section 3.2.3).

2.3. Proposed Framework

As stated above, the proposed framework allowed us to test each benchmark method on any set of input images in two modes: “supervised” and “unsupervised”, depending on the availability of a target image. Three image folders were used: the folder of input images (mandatory), named “input_image”; the folder of output IR images (optional), named “output_image”; and the folder of ground-truth images (mandatory in supervised mode but not required in unsupervised mode), named “GT_image”. Note that the images in each folder should have the same graphic format. Moreover, in supervised mode, the GT_image folder should include ground-truth images whose file names are the same as those of the input images in the input_image folder.
Preliminarily, the user has to set the scale factor (Scale) to a real value not equal to 1. Then, the user has to complete an initialization step consisting of the following settings:
  • Supervised ‖ unsupervised;
  • Upscaling ‖ downscaling;
  • Input image format (.png ‖ .tif ‖ .jpg ‖ .bmp);
  • Benchmark method (BIC ‖ SCN ‖ LCI ‖ VPI for upscaling; BIC ‖ DPID ‖ L0 ‖ LCI ‖ VPI for downscaling);
  • Ground-truth image format (.png ‖ .tif ‖ .jpg ‖ .bmp);
  • Image saving option (Y ‖ N);
  • Image showing option (Y ‖ N).
Note that if the downscaling option is selected, Scale is automatically updated to 1/Scale. The initialization step is managed through dialog boxes, as shown in Figure 1.
The dialog box for selecting the graphic format of the input and the GT image remains on hold until the user selects the correct file extension for the files included in their respective folders. A comprehensive table is generated at the end of the initialization step, according to the selected mode (supervised or unsupervised). The selected benchmark method is applied to each image in the input_image folder during a run using the default parameter values. Subsequently, the computed quality measures and CPU time are stored in the table. If the image saving option is selected (i.e., ‘Y’ is chosen), the corresponding resized image is stored in the output_image folder with the same file name. Similarly, if the image showing option is selected (i.e., ‘Y’ is chosen), the corresponding resized image is shown on the screen. In the end, the average CPU time and image quality measures are also computed and stored in the table. Then, the table is saved as an .xls file in the directory “output_image”. For more details, see the description reported as pseudo-code in Algorithm 1.
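A hedged Matlab sketch of the initialization step using standard dialog boxes (the identifiers are illustrative, not the package’s actual ones):

```matlab
% Scale factor: a real value not equal to 1
scale = str2double(inputdlg('Scale factor (real value, ~= 1):'));

% Mode and direction
mode = questdlg('Evaluation mode?', 'Init', 'Supervised', 'Unsupervised', 'Supervised');
dirn = questdlg('Scaling direction?', 'Init', 'Upscaling', 'Downscaling', 'Upscaling');

% Graphic format of the input images
fmts = {'.png', '.tif', '.jpg', '.bmp'};
[idx, ok] = listdlg('PromptString', 'Input image format:', ...
                    'SelectionMode', 'single', 'ListString', fmts);

% As in the framework: if downscaling is selected, Scale is updated to 1/Scale
if strcmp(dirn, 'Downscaling')
    scale = 1/scale;
end
```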
In unsupervised mode, each benchmark method, DPID excluded, was implemented by selecting the scale factor, and the resulting resized image was consequently computed. Since DPID was designed to perform scaling by selecting the size of the resized image, based on the selected scale factor, we computed the size of the resized image, which was used as a parameter for DPID.
In supervised mode, almost all benchmark methods were implemented by selecting the size of the resized image equal to the size of the ground-truth image taken from the folder GT_image. Indeed, in supervised mode this was possible for BIC, DPID, LCI, and VPI, since they were designed with the possibility of performing scaling by selecting the size of the resized image or the scale factor.
Since this did not apply to the SCN method and L0, we had to introduce some minor variations to make them compliant with our framework. Specifically, for the SCN method, to perform the supervised resizing by indicating the size, we introduced minor algorithmic changes to the original code and modified the type of input parameters without significantly affecting the nature and the core of the SCN. The changes consisted of computing the scale factor corresponding to the size of the ground-truth image and then running the SCN with this scale factor as a parameter. However, these minor changes were not sufficient to remove three computational limitations of the SCN, whose removal remained impractical, since it would have required a complete rewriting of the SCN method (outside our study’s scope). The first limitation involved the size of the resized image being incorrectly computed for some or all of the input images; for example, for the UCA dataset, if the desired resized image was equal to 1280 × 659, starting from an input image generated by BIC with a size of 640 × 330 and considering a scale factor equal to 2, the computed size of the resized image would be 1280 × 660 (see the sketch below). This computational limitation is indicated in the following as “not computable” using the notation “–”. The second SCN computational limitation was related to the resizing percentage, which could not correspond to a non-integer scale factor (for instance, in the case of supervised upscaling with input images from GDVD for the ground truths of GHDTV, corresponding to a scale factor of about 2.253). This is indicated in the following as “not available” using the abbreviation “n.a.”. The third SCN computational limitation pertained to scale factors greater than or equal to 3 for larger or more numerous input images, as the available demo code of the SCN caused Matlab to run out of memory. This occurrence is indicated in the following as “out of memory” using the notation “OOM”.
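A hypothetical check reproducing the first limitation above (identifiers and file names are illustrative): given an input image and its ground truth, the wrapper derives an integer scale factor and flags the case in which the output size SCN would produce differs from the ground-truth size.

```matlab
in = imread('input_image/img001.png');     % e.g., 330 x 640  (rows x cols)
gt = imread('GT_image/img001.png');        % e.g., 659 x 1280 (rows x cols)

s = round(size(gt, 1) / size(in, 1));      % derived scale factor (here, 2)
outSize = s * [size(in, 1), size(in, 2)];  % size SCN would produce: 660 x 1280

if ~isequal(outSize, [size(gt, 1), size(gt, 2)])
    % reported in the result tables as "not computable" (--)
    warning('Not computable: SCN output %d x %d differs from GT %d x %d.', ...
            outSize(1), outSize(2), size(gt, 1), size(gt, 2));
end
```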
Algorithm 1 Framework

Insert Scale value
if Scale ≠ 1 then
    Initialize:
        Supervised ‖ Unsupervised
        Upscaling ‖ Downscaling
        Input image format
        Image saving option
        Image showing option
    if Supervised then
        GT_format ← Select ground-truth image format
    end if
    Generate an appropriate Table
    METHOD ← Select upscaling/downscaling resizing method
    N ← Compute the number of input images
    for i = 1 to N do
        Read input image I_i
        [T_i, R_i] ← METHOD(I_i)    {where R_i = resized image, T_i = CPU time}
        if Supervised then
            Read the corresponding GT image GT_i with GT_format
        end if
        Compute quality measures for R_i; then store them and T_i in the Table
        if Image saving option = Y then
            Save(R_i)
        end if
        if Image showing option = Y then
            Show(R_i)
        end if
    end for
    Compute average quality measures and CPU time; then store them in the Table
    Save Table as .xls file
else
    Warning: no scaling method has been applied!
end if
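A compact Matlab sketch of Algorithm 1’s main loop; METHOD, the option flags (supervised, saveImages, showImages), the input format inFmt, and the quality-measure calls are illustrative stand-ins for the package’s actual code, assumed to be set during the initialization step.

```matlab
files = dir(fullfile('input_image', ['*', inFmt]));     % inFmt from the init step
N = numel(files);
T = table('Size', [N+1, 3], ...
          'VariableTypes', {'string', 'double', 'double'}, ...
          'VariableNames', {'Image', 'Quality', 'CPUtime'});

for i = 1:N
    I = imread(fullfile('input_image', files(i).name));
    tic;  R = METHOD(I, scale);  Ti = toc;              % resized image and CPU time
    if supervised
        GT = imread(fullfile('GT_image', files(i).name));
        Qi = psnr(R, GT);                               % FRQA measure (or SSIM)
    else
        Qi = niqe(R);                                   % NRQA measure (or BRISQUE/PIQE)
    end
    T(i, :) = {string(files(i).name), Qi, Ti};
    if saveImages, imwrite(R, fullfile('output_image', files(i).name)); end
    if showImages, imshow(R); end
end

T(N+1, :) = {"average", mean(T.Quality(1:N)), mean(T.CPUtime(1:N))};
writetable(T, fullfile('output_image', 'results.xls'));
```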
A different workaround was needed for L0, which could not be modified directly because it is only available as Matlab p-code. To use L0 to perform resizing by selecting the size of the resized image, we computed the scale factor corresponding to the size of the ground-truth image and passed it as a parameter to L0. However, in this way, it was not possible to avoid an error in the size calculation whenever a dimension of the resized image was not equal to the product of the scale factor and the corresponding dimension of the input image, or when different scale factors had to be considered for each dimension. In these cases, the framework simply notifies the user that the calculation is impossible. Note that this computational limitation of L0 did not affect the experimental results presented herein (see Section 3), since L0 was used only for downscaling, with an equal scale factor for each dimension, on images resized with the same scale as the other benchmark methods, so we did not have any problem in performing the resizing.
The framework was made open-access so that other benchmark methods can be added. Thus, we invite other authors to make their method code available to expand the framework’s capabilities for a fair comparison. The proposed framework was instantiated as a Matlab package that is freely available on GitHub. It was run on a computer with an Intel Core i7 3770K CPU @3.50 GHz and Matlab version 2022b.

3. Experimental Results and Discussion

This section reports the comprehensive performance assessment of the benchmark methods outlined in Table 1 over multiple datasets. We considered BIC, d-LCI, L0, DPID, and d-VPI as downscaling benchmark methods, while BIC, SCN, u-LCI, and u-VPI were considered as upscaling benchmark methods. Although DPID and L0 (SCN) could also be applied in upscaling (downscaling) mode, we did not perform this comparison, since these methods were not designed for that direction and the evaluation would be misleading. Note that BIC was implemented using the built-in Matlab function imresize with the bicubic option. For the remaining methods, we employed the publicly available source codes in a common language (Matlab). These codes were run with the default parameter settings.
We tested the benchmark methods for scale factors varying from 2 to large values in both supervised and unsupervised mode for upscaling/downscaling. However, for brevity, in this paper, we limit ourselves to showing the results for the scale factors of 2, 3, and 4.
In the following, Section 3.1 and Section 3.2 are devoted to evaluating the quantitative results for the supervised and unsupervised modes, respectively. Moreover, for each type of quantitative evaluation, we distinguish the upscaling and downscaling cases. In addition, in Section 3.3, the trend of CPU time is investigated. Section 3.4 concerns the qualitative results for both the supervised/unsupervised modes and upscaling/downscaling cases. In particular, performance examples are given for different scale factors and modes. Finally, in Section 3.5, conclusive global assessments are presented.

3.1. Supervised Quantitative Evaluation

In supervised mode, for the quantitative evaluation of both upscaling and downscaling methods, we show the full-reference visual quality measures PSNR and SSIM (see Section 2.2.1) for all benchmark methods where the ground-truth image had a size corresponding to the scale factors s = 2, 3, 4.
Since the target image was necessary to estimate these quality measures, we ran the benchmark methods for both upscaling and downscaling, modifying their input parameters where needed so that the output image size was computed automatically from the dimensions of the ground-truth image. For LCI, VPI, and BIC, we used the version of each method whose input parameters are the dimensions of the resized image.
As mentioned above, we took the target images from the datasets and applied BIC in upscaling (or downscaling) mode to them in order to generate the input images for the downscaling (or upscaling) benchmark method. We refer to these images as “BIC input images”. This was performed for all datasets.
In Section 3.1.1 and Section 3.1.2, we interpret the results for supervised downscaling and upscaling, respectively. In addition, as presented in Section 3.1.3, we studied the impact of how the input images were generated. For this purpose, we repeated the quantitative analysis and computed the average PSNR and SSIM values for the supervised benchmark methods while varying the unsupervised scaling method used to generate the input images from the target images in the dataset.

3.1.1. Supervised Downscaling

Table 3 and Figure 2 show the average performance of supervised downscaling methods with BIC input images for a ground-truth image with a size corresponding to the scale factors s = 2, 3, 4. The results confirmed the trend already observed in [21]. Specifically, for even scale factors (s = 2, 4), d-VPI always supplied much higher performance values than DPID and L0, which always presented the lowest quality measures. For odd scale factors, d-VPI performed similarly to d-LCI, reaching the optimal quality measures for input images generated by BIC [21]. This trend was also detectable from the boxplots in Figure 2. The trend could also be observed for higher scale factors. To provide further insights, the results obtained for scale factors s = 6, 8, 10 on the GHDTV dataset with BIC input images are reported in Table 4.
Table 3. Average performance of supervised downscaling methods with BIC input images.

| Dataset | Method | MSE (:2) | PSNR (:2) | SSIM (:2) | TIME (:2) | MSE (:3) | PSNR (:3) | SSIM (:3) | TIME (:3) | MSE (:4) | PSNR (:4) | SSIM (:4) | TIME (:4) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AID_cat | BIC | 4.012 | 44.844 | 0.994 | 0.011 | 3.680 | 45.669 | 0.995 | 0.018 | 3.743 | 45.530 | 0.994 | 0.033 |
| AID_cat | DPID | 2.760 | 45.602 | 0.997 | 22.952 | 2.685 | 45.684 | 0.997 | 35.686 | 3.082 | 45.175 | 0.996 | 56.004 |
| AID_cat | L0 | 9.585 | 39.458 | 0.989 | 3.402 | 10.894 | 39.093 | 0.987 | 6.955 | 1.515 | 47.158 | 0.998 | 12.930 |
| AID_cat | d-LCI | 0.205 | 55.121 | 0.999 | 0.112 | 0.00 | Inf | 1.000 | 0.163 | 0.141 | 56.903 | 1.000 | 0.276 |
| AID_cat | d-VPI | 0.152 | 56.676 | 1.000 | 1.904 | 0.00 | Inf | 1.000 | 0.163 | 0.053 | 61.787 | 1.000 | 5.017 |
| GDVD | BIC | 10.416 | 38.636 | 0.989 | 0.012 | 9.711 | 38.955 | 0.990 | 0.018 | 9.843 | 38.895 | 0.990 | 0.036 |
| GDVD | DPID | 5.055 | 41.591 | 0.996 | 20.071 | 5.105 | 41.565 | 0.996 | 32.671 | 5.913 | 40.935 | 0.995 | 49.898 |
| GDVD | L0 | 17.573 | 36.169 | 0.988 | 3.552 | 170.353 | 26.453 | 0.882 | 7.945 | 2.625 | 44.346 | 0.998 | 14.063 |
| GDVD | d-LCI | 0.212 | 54.903 | 0.999 | 0.107 | 0.00 | Inf | 1.000 | 0.177 | 0.149 | 56.513 | 1.000 | 0.277 |
| GDVD | d-VPI | 0.164 | 56.065 | 1.000 | 1.959 | 0.00 | Inf | 1.000 | 0.177 | 0.059 | 60.544 | 1.000 | 4.912 |
| GHDTV | BIC | 3.167 | 44.322 | 0.996 | 0.034 | 2.900 | 44.829 | 0.997 | 0.044 | 2.945 | 44.746 | 0.996 | 0.067 |
| GHDTV | DPID | 2.681 | 44.449 | 0.997 | 101.297 | 2.697 | 44.411 | 0.997 | 163.577 | 3.107 | 43.806 | 0.997 | 253.606 |
| GHDTV | L0 | 10.956 | 38.351 | 0.992 | 19.607 | 1.482 | 46.854 | 0.999 | 79.947 | 1.482 | 46.854 | 0.999 | 79.640 |
| GHDTV | d-LCI | 0.185 | 55.489 | 0.999 | 0.481 | 0.00 | Inf | 1.000 | 0.863 | 0.124 | 57.361 | 1.000 | 1.323 |
| GHDTV | d-VPI | 0.127 | 57.160 | 1.000 | 9.808 | 0.00 | Inf | 1.000 | 0.863 | 0.041 | 62.262 | 1.000 | 26.009 |
| NWV | BIC | 5.015 | 43.218 | 0.992 | 0.018 | 4.641 | 43.695 | 0.993 | 0.031 | 4.709 | 43.618 | 0.993 | 0.050 |
| NWV | DPID | 2.463 | 45.576 | 0.996 | 35.596 | 2.359 | 45.763 | 0.997 | 57.479 | 2.694 | 45.214 | 0.996 | 89.015 |
| NWV | L0 | 7.989 | 40.058 | 0.990 | 5.255 | 64.247 | 21.445 | 0.641 | 12.826 | 1.304 | 47.870 | 0.998 | 22.503 |
| NWV | d-LCI | 0.189 | 55.452 | 0.999 | 0.308 | 0.00 | Inf | 1.000 | 0.302 | 0.125 | 57.387 | 1.000 | 0.443 |
| NWV | d-VPI | 0.134 | 57.065 | 1.000 | 3.423 | 0.00 | Inf | 1.000 | 0.302 | 0.043 | 62.225 | 1.000 | 7.852 |
| UCA | BIC | 7.394 | 40.912 | 0.993 | 0.019 | 6.825 | 41.330 | 0.994 | 0.026 | 6.943 | 41.246 | 0.994 | 0.042 |
| UCA | DPID | 3.559 | 43.488 | 0.996 | 47.762 | 3.257 | 43.887 | 0.997 | 77.220 | 3.452 | 43.366 | 1.000 | 10.966 |
| UCA | L0 | 9.547 | 39.055 | 0.990 | 10.743 | 12.595 | 37.911 | 0.987 | 22.979 | 1.694 | 46.435 | 0.998 | 43.939 |
| UCA | d-LCI | 0.234 | 54.548 | 0.999 | 0.244 | 0.00 | Inf | 1.000 | 0.451 | 0.133 | 57.019 | 1.000 | 0.729 |
| UCA | d-VPI | 0.180 | 55.796 | 0.999 | 5.307 | 0.00 | Inf | 1.000 | 0.025 | 0.057 | 60.936 | 1.000 | 14.452 |
| UCML | BIC | 6.977 | 42.931 | 0.992 | 0.003 | 6.430 | 43.422 | 0.993 | 0.003 | 6.552 | 43.339 | 0.992 | 0.006 |
| UCML | DPID | 3.481 | 43.685 | 0.980 | 4.293 | 3.317 | 43.807 | 0.980 | 6.672 | 3.865 | 43.235 | 0.980 | 10.553 |
| UCML | L0 | 13.755 | 37.832 | 0.987 | 0.565 | 49.057 | 32.747 | 0.958 | 1.245 | 2.606 | 45.108 | 0.997 | 2.194 |
| UCML | d-LCI | 0.251 | 54.838 | 0.999 | 0.014 | 0.00 | Inf | 1.000 | 0.024 | 0.159 | 56.384 | 1.000 | 0.035 |
| UCML | d-VPI | 0.201 | 56.148 | 1.000 | 0.258 | 0.00 | Inf | 1.000 | 0.024 | 0.068 | 61.027 | 1.000 | 0.612 |
Figure 2. Boxplots derived from Table 3.
Table 4. Average performance of supervised downscaling methods on the GHDTV dataset with BIC input images for other scale factors (OOM indicates “out of memory”).

| Method | MSE (:6) | PSNR (:6) | SSIM (:6) | TIME (:6) | MSE (:8) | PSNR (:8) | SSIM (:8) | TIME (:8) | MSE (:10) | PSNR (:10) | SSIM (:10) | TIME (:10) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BIC | 2.932 | 44.777 | 0.996 | 0.268 | 2.928 | 44.791 | 0.996 | 0.454 | 2.925 | 44.802 | 0.997 | 0.662 |
| DPID | 3.665 | 43.094 | 0.997 | 675.213 | 3.979 | 42.740 | 0.997 | 1057.820 | 4.180 | 42.527 | 0.996 | 1617.292 |
| L0 | 2.008 | 45.737 | 0.998 | 172.889 | 2.536 | 44.881 | 0.997 | 359.239 | OOM | OOM | OOM | OOM |
| d-LCI | 0.084 | 59.137 | 1.000 | 3.746 | 0.058 | 60.892 | 1.000 | 6.382 | 0.039 | 62.620 | 1.000 | 10.062 |
| d-VPI | 0.018 | 66.018 | 1.000 | 71.330 | 0.009 | 69.115 | 1.000 | 118.630 | 0.005 | 72.098 | 1.000 | 172.154 |

3.1.2. Supervised Upscaling

Table 5 and Figure 3 show the average performance of supervised upscaling methods with BIC input images for a ground-truth image with a size corresponding to the scale factors s = 2, 3, and 4. The results confirmed the trend already observed in [21]. Specifically, SCN produced the highest quality values in the case of BIC input images, followed by u-VPI, u-LCI, and BIC. This trend was also detectable from the boxplots in Figure 3. The trend could also be observed for higher scale factors. To provide further insights, the results obtained for scale factors s = 6, 8, and 10 on the GHDTV dataset with BIC input images are reported in Table 6.

3.1.3. Supervised Input Image Dependency

We conducted the following two experiments on GHDTV and GDVD to test the supervised input image dependency. We selected these datasets due to their features and size.
Supervised Input Image Dependency–Experiment 1
We changed the input image for both downscaling and upscaling, generating it using the other methods in unsupervised mode. Indeed, for supervised downscaling, the input HR images were generated by the unsupervised upscaling methods BIC, SCN, u-LCI, and u-VPI. In contrast, for supervised upscaling, the input LR images were created using the unsupervised BIC, L0, DPID, d-LCI, and d-VPI methods. The free input parameter of the unsupervised u-VPI and d-VPI methods, employed to generate the input images, was set to a prefixed value of 0.5.
In Table 7 and Table 8 (Table 9 and Table 10), the average performance of supervised downscaling (upscaling) methods with different input images is shown for scale factors s = 2, 3, and 4.
Table 5. Average performance of supervised upscaling methods with BIC input images (– indicates “not computable”).

| Dataset | Method | MSE (×2) | PSNR (×2) | SSIM (×2) | TIME (×2) | MSE (×3) | PSNR (×3) | SSIM (×3) | TIME (×3) | MSE (×4) | PSNR (×4) | SSIM (×4) | TIME (×4) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AID_cat | BIC | 52.421 | 33.730 | 0.920 | 0.004 | 125.176 | 29.533 | 0.823 | 0.004 | 173.186 | 28.091 | 0.759 | 0.004 |
| AID_cat | SCN | 34.781 | 35.403 | 0.943 | 1.995 | 104.941 | 30.235 | 0.851 | 4.210 | 141.371 | 28.988 | 0.794 | 2.415 |
| AID_cat | u-LCI | 42.336 | 34.866 | 0.932 | 0.021 | 115.093 | 29.965 | 0.833 | 0.014 | 162.040 | 28.462 | 0.766 | 0.013 |
| AID_cat | u-VPI | 41.791 | 34.918 | 0.934 | 0.046 | 113.838 | 30.007 | 0.836 | 0.349 | 160.206 | 28.507 | 0.770 | 0.348 |
| GDVD | BIC | 110.341 | 28.323 | 0.885 | 0.005 | 226.175 | 25.149 | 0.779 | 0.005 | 287.487 | 24.110 | 0.718 | 0.005 |
| GDVD | SCN | 70.700 | 30.327 | 0.922 | 2.001 | 188.604 | 25.959 | 0.818 | 4.160 | 230.674 | 25.086 | 0.765 | 2.457 |
| GDVD | u-LCI | 94.945 | 29.028 | 0.897 | 0.027 | 213.260 | 25.414 | 0.787 | 0.019 | 273.375 | 24.335 | 0.724 | 0.015 |
| GDVD | u-VPI | 93.865 | 29.075 | 0.900 | 0.545 | 211.541 | 25.449 | 0.791 | 0.438 | 271.386 | 24.364 | 0.728 | 0.375 |
| GHDTV | BIC | 49.487 | 32.310 | 0.941 | 0.016 | 198.067 | 25.855 | 0.821 | 0.014 | 186.544 | 26.191 | 0.800 | 0.014 |
| GHDTV | SCN | 28.120 | 34.671 | 0.963 | 7.654 | 205.939 | 25.713 | 0.832 | 22.342 | 136.649 | 27.624 | 0.842 | 9.415 |
| GHDTV | u-LCI | 38.194 | 33.629 | 0.951 | 0.146 | 194.723 | 25.942 | 0.824 | 0.121 | 171.400 | 26.631 | 0.806 | 0.109 |
| GHDTV | u-VPI | 37.690 | 33.682 | 0.953 | 3.211 | 192.858 | 25.981 | 0.828 | 2.342 | 169.586 | 26.676 | 0.810 | 2.541 |
| NWV | BIC | 66.156 | 31.461 | 0.895 | 0.006 | 121.121 | 28.665 | 0.816 | 0.006 | 161.265 | 27.290 | 0.764 | 0.006 |
| NWV | SCN | – | – | – | – | – | – | – | – | – | – | – | – |
| NWV | u-LCI | 62.809 | 31.767 | 0.901 | 0.040 | 118.937 | 28.799 | 0.819 | 0.028 | 159.109 | 27.382 | 0.765 | 0.021 |
| NWV | u-VPI | 62.220 | 31.807 | 0.903 | 0.795 | 117.727 | 28.842 | 0.822 | 0.613 | 157.707 | 27.420 | 0.768 | 0.532 |
| UCA | BIC | 96.844 | 29.229 | 0.897 | 0.007 | 210.907 | 25.608 | 0.779 | 0.006 | 213.253 | 25.703 | 0.755 | 0.007 |
| UCA | SCN | – | – | – | – | – | – | – | – | – | – | – | – |
| UCA | u-LCI | 92.463 | 29.437 | 0.903 | 0.074 | 212.377 | 25.582 | 0.779 | 0.061 | 208.064 | 25.844 | 0.757 | 0.051 |
| UCA | u-VPI | 91.758 | 29.468 | 0.905 | 1.731 | 209.280 | 25.644 | 0.782 | 1.377 | 206.237 | 25.882 | 0.762 | 1.288 |
| UCML | BIC | 75.233 | 31.873 | 0.910 | 0.001 | 319.719 | 24.854 | 0.729 | 0.002 | 222.251 | 26.625 | 0.754 | 0.001 |
| UCML | SCN | 44.938 | 33.598 | 0.922 | 0.422 | – | – | – | – | 170.748 | 27.314 | 0.777 | 0.480 |
| UCML | u-LCI | 63.395 | 32.881 | 0.921 | 0.003 | 327.294 | 24.766 | 0.727 | 0.003 | 207.548 | 27.001 | 0.761 | 0.002 |
| UCML | u-VPI | 62.573 | 32.934 | 0.923 | 0.088 | 319.763 | 24.865 | 0.728 | 0.084 | 205.230 | 27.044 | 0.765 | 0.074 |
Table 6. Average performance of supervised upscaling methods on the GHDTV dataset with BIC input images for other scale factors.

| Method | MSE (×6) | PSNR (×6) | SSIM (×6) | TIME (×6) | MSE (×8) | PSNR (×8) | SSIM (×8) | TIME (×8) | MSE (×10) | PSNR (×10) | SSIM (×10) | TIME (×10) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BIC | 299.353 | 24.037 | 0.717 | 0.021 | 383.714 | 22.924 | 0.674 | 0.021 | 448.808 | 22.229 | 0.651 | 0.020 |
| SCN | 240.168 | 25.044 | 0.754 | 21.233 | 323.354 | 23.694 | 0.702 | 12.154 | 389.037 | 22.867 | 0.672 | 30.237 |
| u-LCI | 284.941 | 24.274 | 0.719 | 0.083 | 369.640 | 23.096 | 0.675 | 0.074 | 435.666 | 22.365 | 0.651 | 0.069 |
| u-VPI | 282.678 | 24.309 | 0.722 | 1.829 | 367.357 | 23.124 | 0.678 | 1.673 | 433.169 | 22.390 | 0.653 | 1.597 |
From these tables, the following can be observed:
(a)
For supervised downscaling (see Table 7 and Table 8) starting from HR input images created by upscaling methods other than BIC, for even scale factors (s = 2, 4), d-VPI always produced much higher quality values than DPID and L0, which always presented the lowest performance in qualitative terms. The d-VPI method, followed by d-LCI, obtained the best performance, apart from the case of SCN input images, where BIC showed the best performance, in agreement with the limitations reported in Section 2.3, followed by d-VPI, DPID, d-LCI, and L0. For odd scale factors, since d-VPI coincided with d-LCI, these methods attained better quality values in the case of input images created by BIC, u-LCI, or u-VPI. However, for SCN input images, the rating of the methods was as expected, i.e., the best performance was attained by BIC, followed by d-LCI = d-VPI, DPID, and L0, respectively.
(b)
For supervised upscaling (see Table 9 and Table 10), starting from LR input images created by downscaling methods other than BIC, SCN always produced the lowest quality values. The best performance was accomplished by u-VPI, apart from in the upscaling ×2 case with L0 input images, where BIC had slightly higher performance values than u-VPI. Analogously to BIC, u-VPI had a more stable trend with regard to variations in the input image. The quality values obtained by u-VPI were always higher than those obtained by u-LCI, which performed better than BIC only for BIC input images.
Figure 3. Boxplots derived from Table 5.
Table 7. Average performance results of supervised downscaling methods on the GHDTV dataset with input images generated by the BIC and SCN methods.

| Scale | Method | MSE (BIC input) | PSNR (BIC input) | SSIM (BIC input) | TIME (BIC input) | MSE (SCN input) | PSNR (SCN input) | SSIM (SCN input) | TIME (SCN input) |
|---|---|---|---|---|---|---|---|---|---|
| :2 | BIC | 3.167 | 44.322 | 0.996 | 0.034 | 0.844 | 49.440 | 0.999 | 0.036 |
| :2 | DPID | 2.681 | 44.449 | 0.997 | 101.297 | 7.546 | 40.316 | 0.993 | 105.20 |
| :2 | L0 | 10.956 | 38.351 | 0.992 | 19.607 | 23.228 | 35.423 | 0.981 | 19.150 |
| :2 | d-LCI | 0.185 | 55.489 | 0.999 | 0.481 | 6.275 | 41.512 | 0.993 | 0.499 |
| :2 | d-VPI | 0.127 | 57.160 | 1.000 | 9.808 | 0.936 | 49.277 | 0.999 | 10.591 |
| :3 | BIC | 2.900 | 44.829 | 0.997 | 0.018 | 0.942 | 48.946 | 0.999 | 0.046 |
| :3 | DPID | 2.697 | 44.411 | 0.997 | 32.671 | 6.091 | 41.164 | 0.994 | 165.18 |
| :3 | L0 | 1.482 | 46.854 | 0.999 | 7.945 | 19.016 | 36.190 | 0.983 | 44.421 |
| :3 | d-LCI | 0.00 | Inf | 1.000 | 0.177 | 6.759 | 41.278 | 0.993 | 1.274 |
| :3 | d-VPI | 0.00 | Inf | 1.000 | 0.177 | 6.659 | 41.278 | 0.993 | 1.274 |
| :4 | BIC | 2.945 | 44.746 | 0.996 | 0.067 | 0.964 | 48.951 | 0.999 | 0.065 |
| :4 | DPID | 3.107 | 43.806 | 0.997 | 253.606 | 6.315 | 40.969 | 0.994 | 263.86 |
| :4 | L0 | 1.482 | 46.854 | 0.999 | 79.640 | 4.343 | 42.529 | 0.996 | 68.453 |
| :4 | d-LCI | 0.124 | 57.361 | 1.000 | 1.323 | 8.055 | 40.507 | 0.992 | 1.457 |
| :4 | d-VPI | 0.041 | 62.262 | 1.000 | 26.009 | 3.263 | 44.208 | 0.997 | 27.397 |
Table 8. Average performance results of supervised downscaling methods on the GHDTV dataset with input images generated by u-LCI and u-VPI.

| Scale | Method | MSE (u-LCI input) | PSNR (u-LCI input) | SSIM (u-LCI input) | TIME (u-LCI input) | MSE (u-VPI input) | PSNR (u-VPI input) | SSIM (u-VPI input) | TIME (u-VPI input) |
|---|---|---|---|---|---|---|---|---|---|
| :2 | BIC | 1.562 | 47.262 | 0.998 | 0.038 | 2.658 | 45.092 | 0.996 | 0.038 |
| :2 | DPID | 3.782 | 43.079 | 0.997 | 100.833 | 3.110 | 43.873 | 0.997 | 105.12 |
| :2 | L0 | 12.004 | 37.985 | 0.991 | 19.777 | 11.545 | 38.145 | 0.991 | 19.326 |
| :2 | d-LCI | 0.073 | 59.615 | 1.000 | 0.510 | 0.070 | 59.712 | 1.000 | 0.517 |
| :2 | d-VPI | 0.012 | 68.774 | 1.000 | 10.613 | 0.024 | 64.575 | 1.000 | 10.625 |
| :3 | BIC | 1.412 | 47.948 | 0.998 | 0.048 | 2.502 | 45.529 | 0.997 | 0.048 |
| :3 | DPID | 3.637 | 43.195 | 0.997 | 168.89 | 3.110 | 43.840 | 0.997 | 207.796 |
| :3 | L0 | 12.278 | 37.895 | 0.990 | 44.82 | 11.744 | 38.079 | 0.990 | 45.137 |
| :3 | d-LCI | 0.00 | Inf | 1.000 | 0.863 | 0.000 | Inf | 1.000 | 0.865 |
| :3 | d-VPI | 0.00 | Inf | 1.000 | 0.863 | 0.000 | Inf | 1.000 | 0.865 |
| :4 | BIC | 1.448 | 47.742 | 0.998 | 0.069 | 2.534 | 45.416 | 0.997 | 0.065 |
| :4 | DPID | 4.073 | 42.697 | 0.996 | 258.79 | 3.524 | 43.295 | 0.997 | 316.138 |
| :4 | L0 | 1.325 | 47.380 | 0.999 | 79.695 | 1.423 | 47.058 | 0.999 | 86.846 |
| :4 | d-LCI | 0.053 | 60.977 | 1.000 | 1.33 | 0.053 | 60.974 | 1.000 | 1.342 |
| :4 | d-VPI | 0.002 | 80.119 | 1.000 | 26.220 | 0.001 | 78.820 | 1.000 | 26.022 |
Table 9. Average performance results of supervised upscaling methods on the GHDTV dataset with input images generated by BIC, L0, and DPID.

| Scale | Method | MSE (BIC input) | PSNR (BIC input) | SSIM (BIC input) | TIME (BIC input) | MSE (L0 input) | PSNR (L0 input) | SSIM (L0 input) | TIME (L0 input) | MSE (DPID input) | PSNR (DPID input) | SSIM (DPID input) | TIME (DPID input) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ×2 | BIC | 49.487 | 32.310 | 0.941 | 0.016 | 53.119 | 31.699 | 0.949 | 0.016 | 46.659 | 32.514 | 0.950 | 0.016 |
| ×2 | SCN | 28.120 | 34.671 | 0.963 | 7.654 | 132.387 | 27.781 | 0.906 | 7.347 | 74.599 | 30.282 | 0.934 | 7.316 |
| ×2 | u-LCI | 38.194 | 33.629 | 0.951 | 0.146 | 79.672 | 30.029 | 0.917 | 0.143 | 53.518 | 31.986 | 0.936 | 0.145 |
| ×2 | u-VPI | 37.690 | 33.682 | 0.953 | 3.211 | 55.568 | 31.496 | 0.947 | 3.134 | 45.190 | 32.684 | 0.952 | 0.952 |
| ×3 | BIC | 198.067 | 25.855 | 0.821 | 0.014 | 321.797 | 23.672 | 0.772 | 0.015 | 117.956 | 28.302 | 0.879 | 0.014 |
| ×3 | SCN | 205.939 | 25.713 | 0.832 | 22.342 | 476.268 | 22.003 | 0.716 | 16.311 | 163.018 | 26.840 | 0.864 | 16.358 |
| ×3 | u-LCI | 194.723 | 25.942 | 0.824 | 0.121 | 375.026 | 23.022 | 0.736 | 0.117 | 133.960 | 27.817 | 0.854 | 0.120 |
| ×3 | u-VPI | 192.858 | 25.981 | 0.828 | 2.342 | 312.075 | 23.805 | 0.766 | 2.634 | 116.398 | 28.403 | 0.880 | 2.638 |
| ×4 | BIC | 186.544 | 26.191 | 0.800 | 0.014 | 175.799 | 26.481 | 0.819 | 0.014 | 175.799 | 26.481 | 0.819 | 0.014 |
| ×4 | SCN | 136.649 | 27.624 | 0.842 | 9.415 | 226.220 | 25.366 | 0.807 | 9.570 | 226.220 | 25.366 | 0.807 | 9.593 |
| ×4 | u-LCI | 171.400 | 26.631 | 0.806 | 0.109 | 197.676 | 26.033 | 0.788 | 0.112 | 197.676 | 26.033 | 0.788 | 0.108 |
| ×4 | u-VPI | 169.586 | 26.676 | 0.810 | 2.541 | 173.450 | 26.587 | 0.820 | 2.342 | 173.450 | 26.587 | 0.820 | 2.315 |
Table 10. Average performance results of supervised upscaling methods on the GHDTV dataset with input images generated by d-LCI and d-VPI (– indicates “not computable”).

| Scale | Method | MSE (d-LCI input) | PSNR (d-LCI input) | SSIM (d-LCI input) | TIME (d-LCI input) | MSE (d-VPI input) | PSNR (d-VPI input) | SSIM (d-VPI input) | TIME (d-VPI input) |
|---|---|---|---|---|---|---|---|---|---|
| ×2 | BIC | 45.604 | 32.782 | 0.948 | 0.016 | 39.019 | 33.438 | 0.954 | 0.016 |
| ×2 | SCN | 80.349 | 30.309 | 0.926 | 7.406 | 62.009 | 31.367 | 0.940 | 7.356 |
| ×2 | u-LCI | 55.250 | 32.206 | 0.929 | 0.138 | 43.203 | 33.275 | 0.941 | 0.141 |
| ×2 | u-VPI | 43.587 | 33.155 | 0.949 | 3.092 | 34.944 | 34.142 | 0.957 | 3.166 |
| ×3 | BIC | 242.474 | 25.006 | 0.811 | 0.014 | 129.583 | 28.045 | 0.874 | 0.015 |
| ×3 | SCN | 239.088 | 25.348 | 0.826 | 16.157 | – | – | – | – |
| ×3 | u-LCI | 290.641 | 24.253 | 0.768 | 0.121 | 167.492 | 27.032 | 0.828 | 0.116 |
| ×3 | u-VPI | 239.576 | 25.051 | 0.808 | 2.713 | 127.752 | 28.156 | 0.874 | 2.629 |
| ×4 | BIC | 226.910 | 25.445 | 0.806 | 0.013 | 219.069 | 25.597 | 0.810 | 0.014 |
| ×4 | SCN | 420.078 | 22.727 | 0.742 | 9.484 | 397.439 | 22.972 | 0.751 | 9.523 |
| ×4 | u-LCI | 292.124 | 24.389 | 0.748 | 0.105 | 280.268 | 24.569 | 0.753 | 0.105 |
| ×4 | u-VPI | 220.180 | 25.575 | 0.807 | 2.363 | 213.686 | 25.708 | 0.811 | 2.345 |
Supervised Input Image Dependency–Experiment 2
Since both GDVD and GHDTV were generated by collecting images of the same scene in two different formats with a scale ratio equal to about 2.253 (see Section 2.2.2), we used the images of the GHDTV dataset as the input for all supervised downscaling processes by considering as ground truth the images of the GDVD dataset. Correspondingly, we used the images of the GDVD dataset as the input for all supervised upscaling processes by considering the images of the GHDTV dataset as the ground truth. In Table 11 (Table 12), the average quality values of supervised downscaling (upscaling) methods with input images from GDVD (GHDTV) and ground-truth images from GHDTV (GDVD) are shown.
From these tables, the following can be observed:
(a)
For supervised downscaling starting from the HR input images from GHDTV with a scale factor of about s = 2.253, BIC produced moderately better quality values than d-VPI, followed by d-LCI, DPID, and L0.
(b)
For supervised upscaling starting from the LR input images from GDVD with a scale factor of about s = 2.253, u-VPI produced better quality values than u-LCI and BIC. At the same time, the SCN method was not able to provide results, since it does not work in supervised mode with a non-integer scale factor.

3.2. Unsupervised Quantitative Evaluation

In unsupervised mode, for the quantitative evaluation, we report the no-reference visual quality measures NIQE, BRISQUE, and PIQE (see Section 2.2.1) for all benchmark methods; the scale factors s = 2, 3, and 4; and both downscaling and upscaling. The framework employed the unsupervised benchmark methods for both upscaling and downscaling, simply using the specified scale factor, since the target image was not necessary to compute these quality measures. In Section 3.2.1 and Section 3.2.2, we interpret the results for unsupervised downscaling and upscaling, respectively. In addition, as presented in Section 3.2.3, to study the impact of the image category, we performed the same quantitative analysis and computed the average NIQE, BRISQUE, and PIQE values for the unsupervised benchmark methods on the image sub-datasets selected by category.

3.2.1. Unsupervised Downscaling

Table 13 and Figure 4 show the average performance of unsupervised downscaling methods for the scale factors s = 2, 3, and 4 on all datasets.
The results indicated that the best performance was attributable to BIC, followed by d-VPI, d-LCI, L0, and DPID, respectively. The same trend could be observed for higher scale factors as well. To provide further insights, the results obtained for scale factors s = 6, 8, and 10 on the GHDTV dataset are reported in Table 14.

3.2.2. Unsupervised Upscaling

Table 15 and Figure 5 show the average performance of unsupervised upscaling methods for scale factors s = 2, 3, and 4 on all datasets.
The results indicated that SCN performed best, followed by u-VPI, u-LCI, and BIC. The same trend could be observed at higher scale factors as well. To provide further insight, the results obtained for the scale factors s = 6, 8, and 10 on the GHDTV dataset are reported in Table 16.
Table 13. Average performance of unsupervised downscaling methods. Column groups refer to the scale factors :2, :3, and :4; each group lists NIQE, BRISQUE, PIQE, and TIME.

Method | NIQE | BRISQUE | PIQE | TIME | NIQE | BRISQUE | PIQE | TIME | NIQE | BRISQUE | PIQE | TIME
AID_cat
BIC | 5.147 | 29.242 | 27.054 | 0.004 | 6.023 | 26.741 | 35.621 | 0.003 | 18.880 | 27.089 | 38.532 | 0.003
DPID | 11.251 | 39.527 | 36.759 | 5.657 | 13.692 | 37.558 | 47.219 | 4.202 | 18.878 | 36.649 | 50.782 | 3.556
L0 | 6.739 | 33.353 | 38.500 | 0.823 | 9.485 | 33.882 | 47.561 | 0.824 | 18.881 | 31.308 | 46.302 | 0.835
d-LCI | 6.339 | 31.009 | 34.049 | 0.024 | 10.483 | 32.484 | 47.235 | 0.014 | 18.882 | 34.162 | 51.061 | 0.013
d-VPI | 6.184 | 30.609 | 33.500 | 0.030 | 10.483 | 32.484 | 47.235 | 0.014 | 18.882 | 33.789 | 50.467 | 0.015
GDVD
BIC | 3.910 | 17.698 | 24.632 | 0.004 | 5.886 | 19.987 | 26.148 | 0.004 | 6.275 | 22.486 | 35.670 | 0.004
DPID | 7.554 | 34.082 | 36.059 | 6.595 | 10.937 | 32.194 | 39.806 | 4.796 | 11.643 | 33.413 | 49.788 | 4.047
L0 | 6.765 | 24.805 | 38.179 | 0.921 | 9.919 | 27.361 | 37.694 | 0.920 | 10.286 | 28.892 | 43.381 | 0.976
d-LCI | 6.714 | 22.966 | 32.659 | 0.025 | 11.409 | 29.972 | 39.986 | 0.018 | 12.950 | 33.464 | 49.578 | 0.015
d-VPI | 6.502 | 22.573 | 31.809 | 0.031 | 11.802 | 30.639 | 39.791 | 0.021 | 12.449 | 32.844 | 48.988 | 0.017
GHDTV
BIC | 3.091 | 18.196 | 22.834 | 0.023 | 3.472 | 16.618 | 25.837 | 0.019 | 4.011 | 17.297 | 24.477 | 0.015
DPID | 4.983 | 34.709 | 30.127 | 34.047 | 6.089 | 31.589 | 36.280 | 24.283 | 7.367 | 31.571 | 36.985 | 20.761
L0 | 3.944 | 21.288 | 30.967 | 4.083 | 4.983 | 21.645 | 34.016 | 4.116 | 6.537 | 23.584 | 31.895 | 4.123
d-LCI | 3.826 | 19.861 | 26.933 | 0.146 | 6.173 | 24.547 | 35.094 | 0.106 | 8.188 | 28.211 | 37.146 | 0.087
d-VPI | 3.710 | 20.221 | 26.938 | 0.178 | 6.230 | 24.827 | 35.520 | 0.125 | 8.036 | 27.805 | 36.884 | 0.100
NWV
BIC | 3.370 | 19.481 | 23.754 | 0.006 | 4.334 | 19.833 | 26.575 | 0.005 | 6.068 | 20.752 | 28.709 | 0.004
DPID | 5.860 | 35.350 | 29.366 | 11.687 | 7.516 | 31.966 | 34.362 | 8.684 | 9.718 | 31.611 | 37.862 | 7.437
L0 | 5.039 | 21.079 | 32.706 | 1.384 | 6.282 | 20.344 | 31.589 | 1.378 | 8.323 | 23.322 | 31.434 | 1.363
d-LCI | 5.002 | 20.541 | 26.981 | 0.045 | 8.072 | 25.941 | 34.744 | 0.028 | 10.248 | 28.346 | 38.033 | 0.023
d-VPI | 4.850 | 21.239 | 27.702 | 0.053 | 8.208 | 26.582 | 35.698 | 0.034 | 10.434 | 28.724 | 38.422 | 0.026
UCA
BIC | 3.940 | 22.410 | 23.350 | 0.010 | 4.506 | 21.335 | 27.062 | 0.008 | 5.350 | 21.571 | 29.022 | 0.007
DPID | 6.550 | 38.872 | 28.987 | 18.459 | 7.366 | 34.679 | 34.369 | 16.083 | 9.694 | 34.131 | 39.354 | 11.182
L0 | 5.172 | 26.068 | 32.346 | 2.121 | 6.016 | 24.507 | 30.767 | 2.086 | 8.033 | 26.928 | 33.403 | 2.085
d-LCI | 4.938 | 23.953 | 26.694 | 0.071 | 7.523 | 27.478 | 36.059 | 0.049 | 9.581 | 29.600 | 38.526 | 0.038
d-VPI | 4.787 | 23.669 | 26.846 | 0.085 | 7.574 | 27.742 | 36.081 | 0.057 | 9.696 | 30.164 | 38.754 | 0.045
UCML
BIC | 18.878 | 32.331 | 28.198 | 0.001 | 18.878 | 33.958 | 45.817 | 0.001 | 18.878 | 35.676 | 33.524 | 0.001
DPID | 18.875 | 40.497 | 35.564 | 1.066 | 18.876 | 39.356 | 52.020 | 1.145 | 18.876 | 40.651 | 44.790 | 0.672
L0 | 18.880 | 33.306 | 36.646 | 0.147 | 18.879 | 34.637 | 49.134 | 0.143 | 18.878 | 36.518 | 39.791 | 0.143
d-LCI | 18.879 | 33.339 | 32.174 | 0.003 | 18.879 | 36.283 | 52.429 | 0.003 | 18.879 | 37.808 | 43.669 | 0.003
d-VPI | 18.878 | 33.163 | 31.917 | 0.004 | 18.879 | 36.136 | 52.350 | 0.003 | 18.879 | 37.579 | 43.343 | 0.003
Table 14. Average performance of unsupervised downscaling methods on the GHDTV dataset for other scale factors (OOM indicates “out of memory”). Column groups refer to the scale factors :6, :8, and :10; each group lists NIQE, BRISQUE, PIQE, and TIME.

Method | NIQE | BRISQUE | PIQE | TIME | NIQE | BRISQUE | PIQE | TIME | NIQE | BRISQUE | PIQE | TIME
BIC | 3.833 | 30.542 | 25.055 | 0.264 | 3.836 | 30.543 | 25.022 | 0.447 | 3.833 | 30.569 | 24.960 | 0.666
DPID | 4.456 | 36.624 | 24.248 | 618.400 | 4.443 | 36.614 | 24.330 | 1099.824 | 4.493 | 36.717 | 24.350 | 1571.694
L0 | 3.490 | 28.952 | 24.340 | 173.938 | 3.570 | 29.502 | 24.938 | 349.645 | OOM | OOM | OOM | OOM
d-LCI | 3.213 | 27.227 | 22.817 | 3.989 | 3.224 | 27.173 | 22.832 | 6.358 | 3.230 | 27.156 | 22.851 | 9.302
d-VPI | 3.230 | 27.206 | 22.874 | 4.265 | 3.231 | 27.184 | 22.883 | 7.031 | 3.232 | 27.175 | 22.866 | 9.769
Figure 4. Boxplots derived from Table 13.
Figure 5. Boxplots derived from Table 15.
Table 15. Average performance of unsupervised upscaling methods (OOM indicates “out of memory”). Column groups refer to the scale factors ×2, ×3, and ×4; each group lists NIQE, BRISQUE, PIQE, and TIME.

Method | NIQE | BRISQUE | PIQE | TIME | NIQE | BRISQUE | PIQE | TIME | NIQE | BRISQUE | PIQE | TIME
AID_cat
BIC | 5.511 | 46.137 | 60.961 | 0.019 | 6.300 | 54.606 | 84.114 | 0.034 | 6.089 | 56.880 | 90.072 | 0.044
SCN | 4.954 | 41.759 | 44.649 | 6.810 | 5.580 | 46.534 | 68.778 | 32.226 | 5.737 | 49.338 | 78.464 | 36.273
u-LCI | 5.640 | 45.602 | 49.749 | 0.112 | 6.033 | 53.292 | 77.299 | 0.170 | 6.590 | 57.121 | 85.928 | 0.264
u-VPI | 5.558 | 47.069 | 58.454 | 0.151 | 6.245 | 54.782 | 82.513 | 0.247 | 6.354 | 57.548 | 89.619 | 0.406
GDVD
BIC | 4.159 | 38.024 | 45.976 | 0.023 | 5.464 | 51.620 | 75.954 | 0.039 | 5.232 | 55.886 | 85.886 | 0.058
SCN | 4.050 | 32.100 | 26.690 | 7.435 | 4.614 | 38.881 | 50.570 | 37.611 | 4.875 | 44.598 | 66.384 | 36.491
u-LCI | 4.840 | 35.244 | 29.687 | 0.138 | 5.123 | 47.808 | 62.489 | 0.233 | 5.560 | 54.041 | 76.456 | 0.353
u-VPI | 4.149 | 39.341 | 41.727 | 0.174 | 5.260 | 51.702 | 72.957 | 0.309 | 5.363 | 56.302 | 84.908 | 0.496
GHDTV
BIC | 4.600 | 45.243 | 61.791 | 0.075 | 5.285 | 55.751 | 82.571 | 0.125 | 5.412 | 57.725 | 87.739 | 0.204
SCN | 4.130 | 38.827 | 46.202 | 41.271 | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM
u-LCI | 4.641 | 43.782 | 51.363 | 0.817 | 5.147 | 54.291 | 77.137 | 1.299 | 5.734 | 57.865 | 83.961 | 2.179
u-VPI | 4.627 | 46.866 | 58.726 | 1.053 | 5.340 | 56.306 | 80.972 | 1.659 | 5.667 | 58.639 | 86.632 | 2.954
NWV
BIC | 4.104 | 42.862 | 57.460 | 0.029 | 5.275 | 54.023 | 80.634 | 0.051 | 5.285 | 56.122 | 86.505 | 0.069
SCN | 3.829 | 34.834 | 46.900 | 10.639 | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM
u-LCI | 4.538 | 39.783 | 51.150 | 0.178 | 5.133 | 52.152 | 74.552 | 0.302 | 5.752 | 57.133 | 82.417 | 0.485
u-VPI | 4.181 | 44.918 | 56.006 | 0.228 | 5.204 | 54.480 | 79.037 | 0.402 | 5.467 | 56.351 | 85.867 | 0.645
UCA
BIC | 4.478 | 45.141 | 61.809 | 0.041 | 5.354 | 54.948 | 80.193 | 0.063 | 5.387 | 56.798 | 85.772 | 0.097
SCN | 4.080 | 38.435 | 52.804 | 17.223 | 4.643 | 46.539 | 69.608 | 132.644 | 4.905 | 49.747 | 74.276 | 133.794
u-LCI | 4.573 | 42.633 | 55.653 | 0.298 | 5.166 | 53.189 | 73.964 | 0.545 | 5.627 | 57.647 | 78.101 | 0.883
u-VPI | 4.590 | 46.975 | 60.633 | 0.486 | 5.311 | 55.090 | 78.395 | 0.788 | 5.592 | 57.010 | 84.680 | 1.165
UCML
BIC | 6.871 | 42.085 | 57.862 | 0.005 | 6.538 | 51.958 | 79.503 | 0.007 | 6.097 | 54.232 | 88.876 | 0.011
SCN | 6.862 | 38.411 | 41.871 | 1.349 | 6.052 | 42.871 | 60.530 | 5.731 | 5.947 | 46.992 | 72.435 | 5.825
u-LCI | 7.580 | 41.832 | 43.705 | 0.018 | 6.307 | 49.469 | 68.655 | 0.030 | 6.665 | 53.715 | 81.340 | 0.046
u-VPI | 6.856 | 42.413 | 54.593 | 0.024 | 6.337 | 51.922 | 77.516 | 0.038 | 6.301 | 55.224 | 88.085 | 0.068
Table 16. Average performance of unsupervised upscaling methods on the GHDTV dataset for other scale factors. Column groups refer to the scale factors ×6, ×8, and ×10; each group lists NIQE, BRISQUE, PIQE, and TIME.

Method | NIQE | BRISQUE | PIQE | TIME | NIQE | BRISQUE | PIQE | TIME | NIQE | BRISQUE | PIQE | TIME
BIC | 6.891 | 59.236 | 89.523 | 0.025 | 6.992 | 59.282 | 97.206 | 0.022 | 6.858 | 61.451 | 100.000 | 0.021
SCN | 5.402 | 48.950 | 78.023 | 21.135 | 5.481 | 51.565 | 85.429 | 12.174 | 5.386 | 53.224 | 87.700 | 30.241
u-LCI | 7.107 | 57.226 | 82.019 | 0.088 | 7.281 | 58.435 | 95.896 | 0.080 | 7.265 | 60.402 | 100.000 | 0.073
u-VPI | 6.836 | 58.686 | 89.908 | 0.135 | 7.163 | 59.282 | 97.390 | 0.127 | 7.177 | 62.215 | 100.000 | 0.118

3.2.3. Unsupervised Category Dependency

For unsupervised downscaling and upscaling, we also tested the benchmark methods on the corresponding sub-datasets of AID_cat and UCA according to the four categories defined in Section 2.2.2: “beach”, “forest”, “parking”, and “sparse residential”. In Table 17 and Table 18, we report the average performance of the unsupervised downscaling and upscaling methods, respectively, for the scale factors s = 2, 3, and 4 for each category. The quality measures reported in these tables confirmed the trends detected in Section 3.2.1 and Section 3.2.2 for the full datasets. On this basis, the methods’ performance did not appear to depend on the image category.

3.3. CPU Time Assessment

Since computation time is a crucial element of any quantitative performance assessment, in all tables, we report the CPU time taken by each benchmark method to produce the resized images for each dataset and scale factor. In Figure 6 and Figure 7, we also show boxplots representing the CPU times derived from Table 3 and Table 5 and from Table 13 and Table 15, respectively. Upon analyzing the results, we found no significant deviations from the trend detected in [20,21]. In particular, the results confirmed that BIC required the least CPU time, with u-LCI and u-VPI producing values similar to those of BIC. Much more CPU time was required by SCN for upscaling and by DPID and L0 for downscaling, principally on the datasets with larger images. Specifically, DPID was the slowest, often resulting in impractical processing times. Naturally, for every benchmark method, the CPU time grew with the image resolution.
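As a sketch of how such per-method timings can be collected in Matlab, the snippet below uses the built-in timeit, which calls the wrapped function several times and returns a robust estimate; the image path is hypothetical, and imresize stands in for any benchmark method.

```matlab
% Timing sketch: wrap a resizing method as a zero-argument handle and
% let timeit average over several runs (more stable than a single
% tic/toc). Inputs are placeholders, not the framework's actual API.
img = imread('GHDTV/im01.jpg');          % hypothetical test image
bic = @() imresize(img, 2, 'bicubic');   % stand-in for BIC, x2 upscaling
fprintf('CPU time: %.4f s\n', timeit(bic));
```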

3.4. Supervised and Unsupervised Qualitative Evaluation for Upscaling and Downscaling

In this subsection, some visual results of the many tests performed herein are presented for the scale factors 2, 3, and 4. For simplicity, we show only the results obtained in supervised mode for downscaling (see Figure 8 and Figure 9) and upscaling (see Figure 10 and Figure 11) using BIC input images, since no appreciable visual difference was present in the corresponding images obtained in unsupervised mode. All images are shown at the same printing size to allow an accurate inspection of the visual details. Figure 12 and Figure 13 exemplify the input-image dependency, displaying the supervised results at a scale factor of 4 with u-LCI and L0 input images for downscaling and upscaling, respectively.
The visual inspection of these results validated the quantitative evaluation in terms of the quality measures presented in Section 3.1 and Section 3.2. Indeed, the results obtained by VPI and LCI appeared, on average, visually more attractive, since (a) the detectable object structure was captured; (b) the local contrast and luminance of the input image were retained; (c) most of the salient borders and small details were preserved; and (d) over-smoothing and ringing artifacts were minimal. Unfortunately, some aliasing effects in downscaling were visible for all benchmark methods, especially when the HR image had high-frequency details. The extent and type of these visual effects depended on many factors and varied with the local context. Thus, aliasing in downscaling remains an open problem.
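The role of prefiltering in this phenomenon can be seen with a few lines of Matlab: imresize exposes an 'Antialiasing' option that applies a lowpass filter before decimation, and disabling it makes the artifacts on high-frequency content plainly visible. The input path below is hypothetical; this illustrates the aliasing mechanism, not any specific benchmark method.

```matlab
% Aliasing illustration for :4 downscaling with and without prefiltering.
hr   = imread('GHDTV/im12.jpg');  % hypothetical HR image with fine detail
raw  = imresize(hr, 1/4, 'bicubic', 'Antialiasing', false);  % aliased
filt = imresize(hr, 1/4, 'bicubic', 'Antialiasing', true);   % prefiltered
montage({raw, filt});  % left: aliasing artifacts; right: smoother result
```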
Figure 9. Examples of supervised downscaling performance results using BIC input images at the scale factors of 2 (left), 3 (middle), and 4 (right).
Figure 10. Examples of supervised upscaling performance results using BIC input images at the scale factors of 2 (left), 3 (middle), and 4 (right).
Figure 11. Examples of supervised upscaling performance results using BIC input images at the scale factors of 2 (left), 3 (middle), and 4 (right).
Figure 12. Target image im30 (top left) with size 1080 × 1920 from GHDTV; tile (top right) with size 200 × 280; qualitative comparison of :4 supervised downscaling performance results with u-LCI image input (bottom).
Figure 13. Target image im40 (top left) with size 1080 × 1920 from GHDTV; tile (top right) with size 272 × 326; qualitative comparison of ×4 supervised upscaling performance results with L0 image input (bottom).

3.5. Final Remarks

Overall, the experimental results under the current working hypotheses confirmed the trend already outlined in [20,21]. Indeed, for RSAs, the quality measures and the CPU time results demonstrated that, on average, VPI and LCI delivered suitable and competitive performances, since their experimental quality values were more stable and generally better than those of the benchmark methods. Furthermore, VPI and LCI had no implementation limitations, were much faster than the methods specialized in only downscaling or only upscaling, and required acceptable CPU times even for large images and scale factors.

4. Conclusions

The primary aim of this paper was twofold: firstly, to ascertain the performance disparities among several IR benchmark methods, and secondly, to assess the visual quality they can achieve in RS image processing. To reach this objective, we implemented and employed an open framework designed to evaluate and compare the performance of the benchmark methods across a suite of six datasets.
The proposed framework is intended to encourage the adoption of best practices in designing, analyzing, and comprehensively assessing IR methods. Implemented in a widely used and user-friendly scientific language, Matlab, it already incorporates a selection of representative IR methods. A further distinctive aspect is its use of both FR and NR quality assessment measures, chosen according to whether the evaluation is supervised or unsupervised. In particular, we highlight the novelty of using NRQA measures, which are not typically employed for evaluating IR methods.
The publicly available framework for tuning and evaluating IR methods is a flexible and extensible tool: other IR methods can be added in the future, leaving open the possibility of contributing new benchmark methods. Furthermore, although the framework was primarily conceived for evaluating IR methods in RSAs, it can be used with any type of color image and in a large number of different applications.
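To indicate how such an extension might look, the sketch below registers resizing methods as function handles keyed by name, so adding a method is a single assignment; the map layout and the helper myNewResizer are hypothetical and do not reflect the package's actual API.

```matlab
% Hypothetical extension pattern: each IR method is a handle taking an
% image and a scale factor, stored in a map so new entries are one line.
irMethods = containers.Map();
irMethods('BIC')  = @(im, s) imresize(im, s, 'bicubic');
irMethods('MYIR') = @(im, s) myNewResizer(im, s);  % user-supplied method

fcn = irMethods('BIC');                   % look up a registered method
out = fcn(imread('peppers.png'), 0.5);    % built-in demo image, :2 downscaling
```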
In this study, the framework yielded a substantial body of results, encompassing CPU times, statistical analyses, and visual quality measures for RSAs. This facilitated the establishment of a standardized evaluation methodology, overcoming the limitations of the conventional approaches commonly used to assess IR methods, which often lead to unverified and untested evaluations. Along this research direction, the statistical and quality evaluations, conducted through multiple comparisons across numerous RS datasets, played a crucial role, since the observed variance stemmed from the dissimilarities among the independent datasets at different image scales.
The performance evaluation was conducted using four datasets selected from the most representative within the RSA field and two new datasets generated to highlight and test several experimental aspects of RS. These additional datasets were made publicly available, serving as a valuable resource for the research community.
Overall, the study successfully achieved its primary objective by providing valuable assistance to researchers in selecting and comparing diverse image scaling methods. We hope that our efforts will prove beneficial for future research, driving advancements in the field of image scaling and its applications. Moving forward, our focus will be on extending the proposed framework to 3D images and validating its applicability in biomedical applications.

Author Contributions

Conceptualization; methodology; software; validation; formal analysis; investigation; resources; data curation; writing (original draft preparation, review, and editing); visualization; supervision: D.O., G.R. and W.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding. The research was accomplished within the Research ITalian network on Approximation (RITA) and Approximation Theory research group of Unione Matematica Italiana (TA–UMI). It was partially supported by GNCS–INdAM and the University of Basilicata (local funds).

Data Availability Statement

The code and the supplementary data are openly available at https://github.com/ImgScaling/IR_framework (accessed on 15 June 2023).

Acknowledgments

The authors would like to thank Luciano De Leo for the IT support.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AID   Aerial Image Dataset
BIC   Bicubic interpolation
BRISQUE   Blind/Referenceless Image Spatial Quality Evaluator
d-LCI   Downscaling Lagrange–Chebyshev interpolation
DIV2K   DIVerse 2K resolution image dataset
DPID   Detail-preserving image downscaling
d-VPI   Downscaling de la Vallée–Poussin interpolation
FRQA   Full-reference quality assessment
GDVD   Google Earth 100 Images—DVD
GHDTV   Google Earth 100 Images—HDTV
HVS   Human visual system
IR   Image resizing
L0   L0-regularized image downscaling
LCI   Lagrange–Chebyshev interpolation
ML   Machine learning
MSE   Mean squared error
NIQE   Natural Image Quality Evaluator
NRQA   No-reference quality assessment
NWV   NWPU VHR-10 dataset
PIQE   Perception-based Image Quality Evaluator
PSNR   Peak signal-to-noise ratio
RS   Remote sensing
RSA   Remote sensing application
SCN   Sparse-coding-based network
SSIM   Structural similarity index measure
UCA   UCAS_AOD dataset
UCML   UCMerced_LandUse dataset
u-LCI   Upscaling Lagrange–Chebyshev interpolation
u-VPI   Upscaling de la Vallée–Poussin interpolation
VHR   Very high resolution
VPI   de la Vallée–Poussin interpolation

References

1. Richards, J.A. Sources and Characteristics of Remote Sensing Image Data. In Remote Sensing Digital Image Analysis; Springer: Berlin/Heidelberg, Germany, 1999.
2. Bai, B.; Tan, Y.; Donchyts, G.; Haag, A.; Xu, B.; Chen, G.; Weerts, A.H. Naive Bayes classification-based surface water gap-filling from partially contaminated optical remote sensing image. J. Hydrol. 2023, 616, 128791.
3. Massi, A.; Ortolani, M.; Vitulano, D.; Bruni, V.; Mazzanti, P. Enhancing the Thermal Images of the Upper Scarp of the Poggio Baldi Landslide (Italy) by Physical Modeling and Image Analysis. Remote Sens. 2023, 15, 907.
4. Zhang, T.; Zeng, T.; Zhang, X. Synthetic Aperture Radar (SAR) Meets Deep Learning. Remote Sens. 2023, 15, 303.
5. Goodchild, M.F.; Quattrochi, D.A. Introduction: Scale, Multiscaling, Remote Sensing, and GIS. In Scale in Remote Sensing and GIS; Quattrochi, D.A., Goodchild, M.F., Eds.; Lewis Publishers: Boca Raton, FL, USA, 1997; pp. 1–12.
6. Marceau, D.J.; Hay, G.J. Remote sensing contributions to the scale issue. Can. J. Remote Sens. 1999, 25, 357–366.
7. Cao, C.Y.; Lam, N. Understanding the scale and resolution effects in remote sensing and GIS. In Scale in Remote Sensing and GIS; Quattrochi, D.A., Goodchild, M.F., Eds.; Lewis Publishers: Boca Raton, FL, USA, 1997; pp. 57–72.
8. Wu, H.; Li, Z.-L. Scale Issues in Remote Sensing: A Review on Analysis, Processing and Modeling. Sensors 2009, 9, 1768–1793.
9. Chen, G.; Zhao, H.; Pang, C.K.; Li, T.; Pang, C. Image Scaling: How Hard Can It Be? IEEE Access 2019, 7, 129452–129465.
10. Arcelli, C.; Brancati, N.; Frucci, M.; Ramella, G.; di Baja, G.S. A fully automatic one-scan adaptive zooming algorithm for color images. Signal Process. 2011, 91, 61–71.
11. Lin, X.; Ma, Y.-l.; Ma, L.-z.; Zhang, R.-l. A survey for image resizing. J. Zhejiang Univ. Sci. C 2014, 15, 697–716.
12. Ghosh, S.; Garai, A. Image downscaling via co-occurrence learning. J. Vis. Commun. Image Represent. 2023, 91, 103766.
13. Hach, T.; Knob, S. A Magnifier on Accurate Depth Jumps. In Proceedings of the IS&T Int'l. Symp. on Electronic Imaging: 3D Image Processing, Measurement (3DIPM), and Applications, Burlingame, CA, USA, 29 January–2 February 2017; pp. 15–26.
14. Lee, C.-C.; So, E.C.; Saidy, L.; Wang, M.-J. Lung Field Segmentation in Chest X-ray Images Using Superpixel Resizing and Encoder–Decoder Segmentation Networks. Bioengineering 2022, 9, 351.
15. Weissleder, R. Scaling down imaging: Molecular mapping of cancer in mice. Nat. Rev. Cancer 2002, 2, 11–18.
16. Liu, H.; Xie, X.; Ma, W.Y.; Zhang, H.J. Automatic Browsing of Large Pictures on Mobile Devices. In Proceedings of the 11th ACM International Conference on Multimedia, Berkeley, CA, USA, 2–8 November 2003.
17. Zhang, M.; Zhang, L.; Sun, Y.; Feng, L.; Ma, W. Auto cropping for digital photographs. In Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, Amsterdam, The Netherlands, 6–8 July 2005; p. 4.
18. Chen, H.; Lu, M.; Ma, Z.; Zhang, X.; Xu, Y.; Shen, Q.; Zhang, W. Learned Resolution Scaling Powered Gaming-as-a-Service at Scale. IEEE Trans. Multimed. 2021, 23, 584–596.
19. Pratt, W.K. Digital Image Processing; John Wiley & Sons: New York, NY, USA, 2001.
20. Occorsio, D.; Ramella, G.; Themistoclakis, W. Lagrange–Chebyshev Interpolation for image resizing. Math. Comput. Simul. 2022, 197, 105–126.
21. Occorsio, D.; Ramella, G.; Themistoclakis, W. Image Scaling by de la Vallée-Poussin Filtered Interpolation. J. Math. Imaging Vis. 2023, 65, 513–541.
22. Han, D. Comparison of Commonly Used Image Interpolation Methods. In Proceedings of the 2nd International Conference on Computer Science and Electronics Engineering, Hong Kong, China, 17–18 June 2013; pp. 1556–1559.
23. Madhukar, B.N.; Narendra, R. Lanczos Resampling for the Digital Processing of Remotely Sensed Images. In Proceedings of International Conference on VLSI, Communication, Advanced Devices, Signals & Systems and Networking (VCASAN-2013); Chakravarthi, V., Shirur, Y., Prasad, R., Eds.; Lecture Notes in Electrical Engineering; Springer: Berlin/Heidelberg, Germany, 2013; Volume 258, pp. 403–411.
24. Zhou, D.-X. Theory of deep convolutional neural networks: Downsampling. Neural Netw. 2020, 124, 319–327.
25. Hayat, K. Multimedia super-resolution via deep learning: A survey. Digit. Signal Process. 2018, 81, 198–217.
26. Ran, Q.; Xu, X.; Zhao, S.; Li, W.; Du, Q. Remote sensing images super-resolution with deep convolution networks. Multimed. Tools Appl. 2020, 79, 8985–9001.
27. Chen, L.; Liu, H.; Yang, M.; Qian, Y.; Xiao, Z.; Zhong, X. Remote Sensing Image Super-Resolution via Residual Aggregation and Split Attentional Fusion Network. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 9546–9556.
28. Dong, X.; Sun, X.; Jia, X.; Xi, Z.; Gao, L.; Zhang, B. Remote Sensing Image Super-Resolution Using Novel Dense-Sampling Networks. IEEE Trans. Geosci. Remote Sens. 2021, 59, 1618–1633.
29. Liu, B.; Zhao, L.; Li, J.; Zhao, H.; Liu, W.; Li, Y.; Wang, Y.; Chen, H.; Cao, W. Saliency-Guided Remote Sensing Image Super-Resolution. Remote Sens. 2021, 13, 5144.
30. Wang, X.; Yi, J.; Guo, J.; Song, Y.; Lyu, J.; Xu, J.; Yan, W.; Zhao, J.; Cai, Q.; Min, H. A Review of Image Super-Resolution Approaches Based on Deep Learning and Applications in Remote Sensing. Remote Sens. 2022, 14, 5423.
31. Cheng, R.; Wang, H.; Luo, P. Remote sensing image super-resolution using multi-scale convolutional sparse coding network. PLoS ONE 2022, 17, e0276648.
32. Atkinson, P.M. Downscaling in remote sensing. Int. J. Appl. Earth Obs. Geoinf. 2013, 22, 106–114.
33. Ha, W.; Gowda, P.H.; Howell, T.A. A review of downscaling methods for remote sensing-based irrigation management: Part I. Irrig. Sci. 2013, 31, 831–850.
34. Zhou, J.; Liu, S.; Li, M.; Zhan, W.; Xu, Z.; Xu, T. Quantification of the Scale Effect in Downscaling Remotely Sensed Land Surface Temperature. Remote Sens. 2016, 8, 975.
35. Peng, J.; Loew, A.; Merlin, O.; Verhoest, N.E.C. A review of spatial downscaling of satellite remotely sensed soil moisture. Rev. Geophys. 2017, 55, 341–366.
36. Keys, R.G. Cubic Convolution Interpolation for Digital Image Processing. IEEE Trans. Acoust. Speech Signal Process. 1981, 29, 1153–1160.
37. Weber, N.; Waechter, M.; Amend, S.C.; Guthe, S.; Goesele, M. Rapid, Detail-Preserving Image Downscaling. ACM Trans. Graph. 2016, 35, 205.
38. Liu, J.; He, S.; Lau, R.W.H. L0 Regularized Image Downscaling. IEEE Trans. Image Process. 2018, 27, 1076–1085.
39. Wang, Z.; Liu, D.; Yang, J.; Han, W.; Huang, T. Deep networks for image super-resolution with sparse prior. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015.
40. AID (Aerial Image Dataset). Available online: https://www.kaggle.com/datasets/jiayuanchengala/aid-scene-classification-datasets and https://captain-whu.github.io/AID/ (accessed on 1 January 2023).
41. NWPU VHR-10 Dataset. Available online: https://gcheng-nwpu.github.io/ (accessed on 1 January 2023).
42. UCAS_AOD Dataset. Available online: https://opendatalab.com/102 (accessed on 1 January 2023).
43. UCMerced_LandUse Dataset. Available online: http://weegee.vision.ucmerced.edu/datasets/landuse.html (accessed on 1 January 2023).
44. Occorsio, D.; Themistoclakis, W. Uniform weighted approximation on the square by polynomial interpolation at Chebyshev nodes. Appl. Math. Comput. 2020, 385, 125457.
45. Ramella, G.; di Baja, G.S. Color quantization by multiresolution analysis. In Computer Analysis of Images and Patterns; Jiang, X., Petkov, N., Eds.; Lecture Notes in Computer Science 5702; Springer: Berlin/Heidelberg, Germany, 2009; pp. 525–532.
46. Ramella, G.; di Baja, G.S. Multiresolution histogram analysis for color reduction. In Progress in Pattern Recognition, Image Analysis, Computer Vision and Applications; Bloch, I., Cesar, R.M., Jr., Eds.; Lecture Notes in Computer Science 6419; Springer: Berlin/Heidelberg, Germany, 2010; pp. 22–29.
47. Ramella, G.; di Baja, G.S. A new technique for color quantization based on histogram analysis and clustering. Int. J. Patt. Recog. Artif. Intell. 2013, 27, 1–17.
48. Ramella, G.; di Baja, G.S. Color quantization via spatial resolution reduction. In VISAPP 2013; Battiato, S., Braz, J., Eds.; Scitepress Science and Technology Publications: Montreal, QC, Canada, 2013; pp. 78–83. ISBN 9789898565471.
49. Bruni, V.; Ramella, G.; Vitulano, D. Automatic Perceptual Color Quantization of Dermoscopic Images. In VISAPP 2015; Braz, J., Battiato, S., Imai, F., Eds.; Scitepress Science and Technology Publications: Montreal, QC, Canada, 2015; Volume 1, pp. 323–330.
50. Ramella, G.; di Baja, G.S. A new method for color quantization. In Proceedings of the 12th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), Naples, Italy, 28 November–1 December 2016; pp. 1–6.
51. Bruni, V.; Ramella, G.; Vitulano, D. Perceptual-based Color Quantization. In Image Analysis and Processing—ICIAP 2017; Battiato, S., Gallo, G., Schettini, R., Stanco, F., Eds.; Lecture Notes in Computer Science 10484; Springer: Berlin/Heidelberg, Germany, 2017; pp. 671–681.
52. Ramella, G.; di Baja, G.S. Color histogram-based image segmentation. In Computer Analysis of Images and Patterns–CAIP 2011; Real, P., Diaz-Pernil, D., Molina-Abril, H., Berciano, A., Kropatsch, W., Eds.; Lecture Notes in Computer Science 6854; Springer: Berlin/Heidelberg, Germany, 2011; Volume I, pp. 76–83.
53. Ramella, G.; di Baja, G.S. Image segmentation based on representative colors and region merging. In Pattern Recognition; Carrasco-Ochoa, J.A., Martínez-Trinidad, J.F., Rodríguez, J.S., di Baja, G.S., Eds.; Lecture Notes in Computer Science 7914; Springer: Berlin/Heidelberg, Germany, 2013; pp. 175–184.
54. Ramella, G.; di Baja, G.S. From color quantization to image segmentation. In Proceedings of the 12th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), Naples, Italy, 28 November–1 December 2016; pp. 798–804.
55. Mittal, H.; Pandey, A.C.; Saraswat, M.; Kumar, S.; Pal, R.; Modvel, G. A comprehensive survey of image segmentation: Clustering methods, performance parameters, and benchmark datasets. Multimed. Tools Appl. 2022, 81, 35001–35026.
56. DIV2K Dataset. Available online: https://data.vision.ee.ethz.ch/cvl/DIV2K/ (accessed on 1 January 2023).
57. Chaki, J.; Dey, N. Introduction to Image Color Feature. In Image Color Feature Extraction Techniques; Springer Briefs in Applied Sciences and Technology; Springer: Singapore, 2021.
58. Wöhler, C. 3D Computer Vision: Efficient Methods and Applications; X.media.publishing Series; Springer: Berlin/Heidelberg, Germany, 2009.
59. PSNR Definition. Available online: https://it.mathworks.com/help/vision/ref/psnr.html (accessed on 1 January 2023).
60. Ramella, G. Evaluation of quality measures for color quantization. Multimed. Tools Appl. 2021, 80, 32975–33009.
61. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
62. Mittal, A.; Soundararajan, R.; Bovik, A.C. Making a Completely Blind Image Quality Analyzer. IEEE Signal Process. Lett. 2013, 20, 209–212.
63. Mittal, A.; Moorthy, A.K.; Bovik, A.C. Referenceless Image Spatial Quality Evaluation Engine. In Proceedings of the 45th Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, USA, 6–9 November 2011.
64. Mittal, A.; Moorthy, A.K.; Bovik, A.C. No-Reference Image Quality Assessment in the Spatial Domain. IEEE Trans. Image Process. 2012, 21, 4695–4708.
65. Sheikh, H.R.; Wang, Z.; Cormack, L.; Bovik, A.C. LIVE Image Quality Assessment Database Release 2. Available online: https://live.ece.utexas.edu/research/quality/ (accessed on 3 February 2023).
66. Venkatanath, N.; Praneeth, D.; Bh, M.C.; Channappayya, S.S.; Medasani, S.S. Blind Image Quality Evaluation Using Perception Based Features. In Proceedings of the 21st National Conference on Communications (NCC), Mumbai, India, 27 February–1 March 2015; IEEE: Piscataway, NJ, USA, 2015.
67. Xia, G.-S.; Hu, J.; Hu, F.; Shi, B.; Bai, X.; Zhong, Y.; Zhang, L. AID: A Benchmark Dataset for Performance Evaluation of Aerial Scene Classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 3965–3981.
68. Cheng, G.; Han, J.; Zhou, P.; Guo, L. Multi-class geospatial object detection and geographic image classification based on collection of part detectors. ISPRS J. Photogramm. Remote Sens. 2014, 98, 119–132.
69. Cheng, G.; Han, J. A survey on object detection in optical remote sensing images. ISPRS J. Photogramm. Remote Sens. 2016, 117, 11–28.
70. Cheng, G.; Zhou, P.; Han, J. Learning rotation-invariant convolutional neural networks for object detection in VHR optical remote sensing images. IEEE Trans. Geosci. Remote Sens. 2016, 54, 7405–7415.
71. Haigang, Z.; Xiaogang, C.; Weiqun, D.; Kun, F.; Qixiang, Y.; Jianbin, J. Orientation robust object detection in aerial images using deep convolutional neural network. In Proceedings of the 2015 IEEE International Conference on Image Processing (ICIP), Quebec, QC, Canada, 27–30 September 2015; pp. 3735–3739.
72. Yang, Y.; Newsam, S. Bag-Of-Visual-Words and Spatial Extensions for Land-Use Classification. In Proceedings of the ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM GIS), San Jose, CA, USA, 2–5 November 2010.
Figure 1. Dialog boxes employed in the framework.
Figure 6. Boxplots derived from Table 3 and Table 5.
Figure 7. Boxplots derived from Table 13 and Table 15.
Figure 8. Examples of supervised downscaling performance results using BIC input images at the scale factors of 2 (left), 3 (middle), and 4 (right).
Table 1. Benchmark methods.

Method | Type | Selection | Features
BIC | Down/up | Scale/size | Bicubic interpolation
DPID | Down | Size | Higher convolutional filter weights assigned to pixels differing from their neighborhoods
L0 | Down | Scale | Optimization framework based on two priors iteratively applied
LCI | Down/up | Scale/size | Lagrange interpolation at Chebyshev zeros
VPI | Down/up | Scale/size | Filtered VP interpolation at Chebyshev zeros
SCN | Up | Scale | Cascade of SCNs trained for scaling factors
Table 2. Datasets (# indicates “number of”).

Dataset | # Total Images | # Categories | Image Format | Image Size
AID_cat | 1340 | 30 | jpg | 600 × 600
GDVD | 100 | 1 | jpg | 852 × 480
GHDTV | 100 | 1 | jpg | 1920 × 1080
NWV | 800 | 2 | jpg | from 381 × 601 to 939 × 1356
UCA | 2410 | 3 | png | 1280 × 659
UCML | 2100 | 21 | tif | 256 × 256
Table 11. Average performance of supervised downscaling methods with input images from GHDTV and ground-truth images from GDVD—scale factor :2.253.

Method | MSE | PSNR | SSIM | TIME
BIC | 121.116 | 27.860 | 0.916 | 0.013
DPID | 152.851 | 26.814 | 0.897 | 34.260
L0 | 309.388 | 23.673 | 0.817 | 5.028
d-LCI | 175.546 | 26.213 | 0.880 | 0.114
d-VPI | 139.928 | 27.208 | 0.900 | 2.143
Table 12. Average performance of supervised upscaling methods with input images from GDVD and ground-truth images from GHDTV—scale factor ×2.253 (n.a. indicates “not available”).

Method | MSE | PSNR | SSIM | TIME
BIC | 188.974 | 25.902 | 0.837 | 0.015
SCN | n.a. | n.a. | n.a. | n.a.
u-LCI | 188.304 | 25.916 | 0.835 | 0.134
u-VPI | 184.094 | 26.001 | 0.840 | 2.953
Table 17. Average performance of unsupervised downscaling methods for the four considered categories common to AID_cat and UCA. Column groups refer to the scale factors :2, :3, and :4; each group lists NIQE, BRISQUE, PIQE, and TIME.

Method | NIQE | BRISQUE | PIQE | TIME | NIQE | BRISQUE | PIQE | TIME | NIQE | BRISQUE | PIQE | TIME
Beach
BIC | 7.165 | 22.791 | 23.223 | 0.003 | 7.825 | 21.478 | 31.644 | 0.003 | 18.876 | 23.488 | 31.295 | 0.003
DPID | 8.754 | 30.702 | 26.399 | 4.866 | 9.859 | 27.447 | 36.761 | 3.496 | 18.877 | 27.910 | 36.480 | 2.971
L0 | 8.236 | 24.361 | 28.389 | 0.628 | 9.654 | 24.563 | 36.433 | 0.612 | 18.876 | 25.256 | 36.427 | 0.630
d-LCI | 8.146 | 23.624 | 24.406 | 0.017 | 10.226 | 25.154 | 36.922 | 0.011 | 18.878 | 27.189 | 36.608 | 0.010
d-VPI | 8.076 | 23.982 | 24.665 | 0.020 | 10.233 | 25.282 | 37.213 | 0.012 | 18.877 | 26.702 | 36.097 | 0.011
Forest
BIC | 10.438 | 39.922 | 27.845 | 0.004 | 11.022 | 39.673 | 44.932 | 3.286 | 18.885 | 39.142 | 45.881 | 0.003
DPID | 22.078 | 43.421 | 44.107 | 4.513 | 25.509 | 43.508 | 60.804 | 3.286 | 18.874 | 43.600 | 64.024 | 2.763
L0 | 11.782 | 41.848 | 47.852 | 0.558 | 14.187 | 41.723 | 59.867 | 0.562 | 18.886 | 41.738 | 59.111 | 0.562
d-LCI | 11.280 | 40.366 | 41.576 | 0.016 | 15.736 | 42.467 | 61.499 | 0.010 | 18.885 | 43.435 | 64.493 | 0.009
d-VPI | 11.195 | 40.200 | 40.009 | 0.018 | 15.932 | 42.417 | 61.442 | 0.012 | 18.885 | 43.318 | 63.981 | 0.011
Parking
BIC | 8.273 | 30.648 | 26.590 | 0.003 | 9.204 | 26.935 | 37.071 | 0.003 | 18.880 | 27.683 | 37.196 | 0.003
DPID | 11.314 | 42.514 | 32.252 | 4.629 | 12.343 | 39.276 | 44.849 | 3.313 | 18.880 | 38.192 | 47.311 | 2.788
L0 | 8.756 | 33.340 | 32.606 | 0.628 | 11.015 | 34.615 | 45.077 | 0.627 | 18.881 | 31.670 | 41.820 | 0.628
d-LCI | 8.371 | 30.631 | 28.288 | 0.018 | 10.912 | 31.041 | 43.501 | 0.011 | 18.881 | 33.255 | 45.201 | 0.010
d-VPI | 8.279 | 29.719 | 28.421 | 0.021 | 10.909 | 30.672 | 43.815 | 0.013 | 18.881 | 33.204 | 45.060 | 0.011
Sparse residential
BIC | 7.898 | 26.817 | 26.910 | 0.003 | 8.349 | 26.001 | 36.619 | 0.003 | 18.882 | 27.130 | 33.055 | 0.003
DPID | 12.446 | 43.424 | 45.196 | 4.398 | 14.982 | 42.627 | 54.725 | 3.241 | 18.880 | 42.069 | 53.418 | 2.792
L0 | 10.125 | 35.465 | 48.627 | 0.596 | 12.709 | 34.975 | 54.446 | 0.595 | 18.883 | 33.179 | 45.921 | 0.596
d-LCI | 9.811 | 32.386 | 43.013 | 0.017 | 13.870 | 36.625 | 56.065 | 0.011 | 18.885 | 38.163 | 55.738 | 0.009
d-VPI | 9.564 | 31.359 | 40.780 | 0.020 | 13.905 | 36.266 | 55.972 | 0.012 | 18.884 | 37.487 | 54.161 | 0.011
Table 18. Average performance of unsupervised upscaling methods for the four considered categories common to AID_cat and UCA. Column groups refer to the scale factors ×2, ×3, and ×4; each group lists NIQE, BRISQUE, PIQE, and TIME.

Method | NIQE | BRISQUE | PIQE | TIME | NIQE | BRISQUE | PIQE | TIME | NIQE | BRISQUE | PIQE | TIME
Beach
BIC | 4.944 | 43.392 | 56.662 | 0.018 | 5.771 | 51.797 | 83.463 | 0.031 | 5.917 | 52.859 | 92.155 | 0.045
SCN | 4.447 | 38.711 | 42.088 | 5.184 | 5.166 | 46.363 | 65.894 | 25.654 | 5.591 | 50.049 | 77.292 | 25.360
u-LCI | 4.905 | 42.972 | 46.546 | 0.086 | 5.654 | 51.583 | 74.759 | 0.148 | 6.380 | 54.211 | 86.545 | 0.205
u-VPI | 4.904 | 45.265 | 54.925 | 0.112 | 5.776 | 52.959 | 81.551 | 0.204 | 6.168 | 54.332 | 91.873 | 0.327
Forest
BIC | 6.696 | 46.770 | 58.747 | 0.016 | 7.353 | 54.820 | 83.616 | 0.029 | 6.804 | 57.638 | 91.129 | 0.041
SCN | 6.552 | 44.614 | 40.786 | 4.903 | 6.859 | 46.748 | 67.210 | 24.335 | 6.600 | 48.840 | 77.886 | 23.469
u-LCI | 7.215 | 46.952 | 48.107 | 0.082 | 7.131 | 53.587 | 76.966 | 0.135 | 7.550 | 57.417 | 87.054 | 0.203
u-VPI | 6.795 | 46.942 | 55.603 | 0.108 | 7.257 | 54.578 | 81.899 | 0.187 | 7.133 | 57.932 | 90.819 | 0.303
Parking
BIC | 5.091 | 47.562 | 69.904 | 0.018 | 5.788 | 56.718 | 85.280 | 0.031 | 5.694 | 58.787 | 89.570 | 0.045
SCN | 4.562 | 42.321 | 55.489 | 5.297 | 5.067 | 47.164 | 73.586 | 25.390 | 5.250 | 50.129 | 79.852 | 25.339
u-LCI | 5.207 | 45.659 | 58.065 | 0.089 | 5.507 | 54.320 | 78.966 | 0.142 | 6.020 | 58.245 | 85.803 | 0.211
u-VPI | 5.165 | 48.185 | 67.438 | 0.116 | 5.742 | 56.578 | 84.145 | 0.208 | 5.946 | 59.338 | 88.924 | 0.327
Sparse residential
BIC | 5.663 | 43.232 | 50.595 | 0.017 | 6.875 | 52.807 | 79.537 | 0.029 | 6.242 | 56.833 | 87.626 | 0.042
SCN | 5.360 | 39.832 | 30.213 | 5.062 | 6.049 | 42.414 | 58.062 | 24.474 | 6.041 | 45.919 | 72.643 | 24.785
u-LCI | 6.218 | 44.091 | 35.170 | 0.087 | 6.422 | 50.430 | 70.130 | 0.142 | 6.895 | 56.181 | 81.822 | 0.212
u-VPI | 5.701 | 43.682 | 46.537 | 0.113 | 6.628 | 52.439 | 77.450 | 0.197 | 6.463 | 57.226 | 87.141 | 0.332