Optimization-Based Downscaling of Satellite-Derived Isotropic Broadband Albedo to High Resolution

Lukač, Niko; Mongus, Domen; Bizjak, Marko

doi:10.3390/rs17081366


by Niko Lukač *, Domen Mongus and Marko Bizjak

Faculty of Electrical Engineering and Computer Science, University of Maribor, 2000 Maribor, Slovenia

* Author to whom correspondence should be addressed.

Remote Sens. 2025, 17(8), 1366; https://doi.org/10.3390/rs17081366

Submission received: 28 February 2025 / Revised: 9 April 2025 / Accepted: 10 April 2025 / Published: 11 April 2025

(This article belongs to the Section AI Remote Sensing)


Abstract:

In this paper, a novel method for estimating high-resolution isotropic broadband albedo is proposed, which downscales satellite-derived albedo using an optimization approach. First, broadband albedo is calculated from the lower-resolution multispectral satellite image using standard narrow-to-broadband (NTB) conversion, where the surfaces are considered Lambertian with isotropic reflectance. The high-resolution true orthophoto for the same location is segmented with the deep learning-based Segment Anything Model (SAM), and the resulting segments are refined with a classified digital surface model (cDSM) to exclude small transient objects. Afterwards, the remaining segments are grouped using K-means clustering, by considering the orthophoto visible (VIS) and near-infrared (NIR) bands. These segments represent surfaces with similar materials and underlying reflectance properties. Next, the Differential Evolution (DE) optimization algorithm is applied to assign albedo values to these segments so that their spatial aggregate matches the coarse-resolution satellite albedo, for which two novel objective functions are proposed. Extensive experiments considering different DE parameters over a 0.75 km² urban area in Maribor, Slovenia, have been carried out, where Sentinel-2 Level-2A NTB-derived albedo was downscaled to 1 m spatial resolution. In the spatiospectral analysis, the proposed method achieved absolute differences of 0.09 per VIS band and below 0.18 per NIR band, in comparison to lower-resolution NTB-derived albedo. Moreover, the proposed method achieved a root mean square error (RMSE) of 0.0179 and a mean absolute percentage error (MAPE) of 4.0299% against ground truth broadband albedo annotations of characteristic materials in the given urban area. The proposed method outperformed the Enhanced Super-Resolution Generative Adversarial Network (ESRGAN), which achieved an RMSE of 0.0285 and an MAPE of 9.2778%, and the Blind Super-Resolution Generative Adversarial Network (BSRGAN), which achieved an RMSE of 0.0341 and an MAPE of 12.3104%.

Keywords:

isotropic broadband albedo; high-resolution albedo; Sentinel-2 albedo; true orthophoto; Segment Anything Model; Differential Evolution


1. Introduction

In recent years, Earth observation (EO) data volumes have soared, as EO satellites now capture the Earth's surface on a daily basis in various spectral bands [1]. Publicly available sources, such as Sentinel-2 and Landsat-9, provide a steady stream of free multispectral imagery. Aerial imaging delivers even finer spatial scales, mostly in visible (VIS) and near-infrared (NIR) bands. These heterogeneous data sources, each with unique resolutions, spectral bands, and environmental parameters at acquisition time, now overlap in many regions. By carefully fusing them, it is possible to model specific surface properties with higher accuracy. One such property is solar albedo, the measure of a surface’s ability to reflect incoming solar radiation. It greatly depends on the composition of the material, as well as the moisture content and the roughness of the surface [2].

The albedo is represented as a dimensionless number in [0, 1] [3,4], where snow, for example, reflects most incident light, resulting in an albedo near 1, while asphalt absorbs much more, with an albedo near 0. Albedo can be precisely measured using specialized instruments [5] for field-specific tasks, or it can be estimated over large geographical areas from the spectral data available in EO imagery. Narrowband albedo estimates focus on a few specific wavelengths such as the VIS bands [6], while broadband approaches integrate across a wide range of the spectrum. The estimation of albedo from satellite imagery has seen significant advancements through the development of various methods, where narrow-to-broadband (NTB) conversion remains a fundamental approach. Here, narrowband reflectances are converted into broadband albedo using empirically derived coefficients. This has already been extensively applied to various satellite products, with Liang’s framework being a cornerstone in its development [7]. Alongside NTB, models based on the bidirectional reflectance distribution function (BRDF) can be used to account for surface anisotropy and multi-angular reflectance effects [8].

1.1. The State of the Art

In recent years, various methods have been developed to estimate albedo at higher spatial resolutions. Li et al. [9] have coupled Sentinel-2 imagery with surface anisotropy information from 500 m resolution Moderate-Resolution Imaging Spectroradiometer (MODIS) BRDF data to further enhance the resolution of the broadband albedo to 20 m. Shuai et al. integrated Landsat reflectance with MODIS BRDF parameters to produce albedo estimates at 30 m [10]. Bonafoni and Sekertekin [11] developed NTB coefficients for Sentinel-2 multispectral data, achieving isotropic broadband albedo estimates of 10 m for Lambertian surfaces. Although the method by Bonafoni and Sekertekin [11] was originally developed to estimate albedo in rural areas, it was successfully applied in various urban areas, namely for the analysis of thermal diurnal hot spots [12], the impact of albedo on urban heat islands (UHIs) [13], and the mapping of the effects of heat waves [14]. Lin et al. [15] introduced a direct albedo estimation approach using Google Earth Engine to fuse MODIS BRDF data with Sentinel-2 surface reflectance, assisted by the 10 m ESA WorldCover classification [16]. Their method, validated against both 3D DART simulations and tower-based flux measurements, enables 10 m anisotropic albedo retrieval across diverse land cover types. Similarly, Chen et al. [17] combined Sentinel-2 data, ESA WorldCover, and MODIS BRDF data to train and validate four data-driven models, thus achieving 10 m resolution anisotropic albedo retrievals. Various 10 m albedo estimation methods rely on land cover classification as a key input to constrain the reflectance-to-albedo relationships, yet this can introduce errors when the classes are overly coarse or misclassified. 
For example, the built-up category in ESA WorldCover and similar land cover products typically represents a broad mix of surface materials, from dark asphalt streets and lots to bright reflective roofs, which can have significantly different reflectances. Alternatively, one could use multispectral imagery from Unmanned Aerial Vehicles (UAVs) to estimate sub-10 m albedo based on NTB conversion [18,19].

Another way to estimate higher-resolution albedo is via geospatial downscaling, which is a process of increasing the spatial resolution of coarse-scale data by estimating finer-scale features, often using statistical, optimization, or machine learning methods. Buster et al. [20] proposed a physics-based method to downscale solar resource data that includes albedo, from the National Solar Radiation Database (NSRDB), originally at a spatial resolution of 4 km, to 2 km grids. This approach incorporates spatial interpolation (elevation correction and inverse distance weighting) and temporal interpolation to refine atmospheric variables. Karalasingham et al. [21] introduced the Deep Downscaling Spectral Model with Attention in order to improve albedo resolution using PROBA-V/SPOT satellite data. This model can downscale 1 km albedo imagery to 500, 333, and 250 m resolutions and was validated with ground observations. Chen et al. [22] developed a linear downscaling approach to refine the data for the MODIS ecosystem function (500 m) to match Landsat land cover maps (30 m).

1.2. Motivation

Although state-of-the-art satellite-derived broadband albedo provides a good estimate for large-scale climatological studies [23] or global-scale applications [24], it may not be a sufficient input for high-resolution sub-5 m physical simulations and modeling of physical processes based on remote sensing data, such as solar irradiance, heat transfer, and land surface temperature. Hence, state-of-the-art simulation methods either disregard albedo completely or assume isotropic albedo for all surfaces. Various solar irradiance modeling methods use a simplified Liu–Jordan reflectance model [25], e.g., [26,27,28,29], where only horizontal ground reflective irradiance is considered, with a general constant isotropic albedo of 0.2 [30]. Jakubiec et al. [31] performed ray tracing for reflective irradiance; however, they considered a constant albedo of 0.35 for buildings and 0.2 for ground. Reflective irradiance was estimated by Lukač et al. [32] using a ray-casting approach, considering isotropic albedo values for ground, rooftops, and facades based on known location-specific materials. Similarly, Bizjak et al. [33] estimated the contribution of long-wave radiation in the urban environment to heat transfer simulation, where an isotropic albedo of 0.2 was considered for ground surfaces. Hofierka et al. [34,35] made high-resolution estimates of land surface temperature using physical simulation, where isotropic albedo was derived at a resolution of 10 m via Sentinel-2 NTB and interpolated to a higher resolution. In practice, these simplified or constant albedo assumptions can lead to inaccuracies in modeling reflected solar irradiance or surface heat exchange, especially in dense urban environments where individual surface materials can differ substantially in their reflective properties and can exhibit albedos ranging from approximately 0.05 (for fresh asphalt) [36] up to 0.7 or higher (for reflective clay roof tiles or metal surfaces) [37]. Although these simplifications are sufficient for large-scale analyses, they are less suitable for detailed urban microclimate modeling or high-fidelity energy simulations.

In order to improve the accuracy of the aforementioned physical simulations, an open research challenge remains to obtain sub-10 m anisotropic albedo for a specific geographic area automatically. On the one hand, simply downscaling 500 m MODIS BRDF data to 1 m resolution would yield too many deviations; on the other hand, downscaling 10 m satellite-derived NTB isotropic albedo to 1 m would not capture anisotropic reflectance effects. Moreover, publicly available 1 m multispectral satellite imagery and matching high-resolution land cover classification are scarce, so existing methods [15,17] that would exploit such data to estimate 1 m albedo are not directly applicable. Fortunately, most of the built materials (e.g., asphalt, concrete, as well as roofing materials such as asphalt shingles and unglazed clays) in urban environments reflect light approximately diffusely and can hence be classified as near-Lambertian surfaces [6]. In this case, the angular dependence of reflectance may be minimal, and an isotropic surface albedo estimation is sufficient, which would vastly improve the aforementioned physical simulations.

1.3. Proposed Novelties

In this paper, this challenge is addressed by introducing a novel method to estimate high-resolution (at 1 m) isotropic albedo by downscaling isotropic broadband albedo estimated from lower-resolution multispectral satellite imagery (e.g., Sentinel-2) based on NTB conversion. This work proposes the following novelties.

A novel workflow is introduced that integrates the deep learning–based Segment Anything Model (SAM) with K-means clustering to produce spectrally consistent segments in the high-resolution true orthophoto, reducing complexity for optimization-driven downscaling. Furthermore, SAM results are refined with a classified Digital Surface Model (cDSM) in order to filter out transient objects (e.g., cars). The albedo is then estimated on a per segment instead of per pixel granularity.
The single-objective DE algorithm is used to downscale segment-level albedo values, approximating 1 m isotropic broadband albedo estimations from coarse satellite data, even when few bands are available in the high-resolution true orthophoto. For this purpose, two new objective functions are introduced.
Using sensitivity analysis, multiple DE optimization strategies and parameter settings (mutation, crossover, number of segments, and the considered objective function) are analyzed to quantify their effects on convergence and final accuracy.
The presented method was validated on a 0.75 km² area by comparison against the 10 m isotropic NTB-based albedo [11] derived from Sentinel-2 Level-2A [38], and a further per-band spatiospectral analysis was performed. Lastly, the results were compared with manually annotated 1 m ground truth reference data for the given 0.75 km² area and with two super-resolution deep learning models.

1.4. Paper Structure

The remainder of the paper is structured as follows. The next section provides a detailed description of the proposed methodology, while the third section provides extensive results based on sensitivity analysis and ground truth-based validation. The fourth section focuses on the discussion, while the last section concludes this paper.

2. Methodology

The general workflow of the proposed method is illustrated in Figure 1. Initially, isotropic broadband albedo is estimated using an NTB approach applied to a satellite image, which is acquired at approximately the same date and time as a high-resolution true orthophoto of the same area. A true orthophoto is an aerial/satellite image that has been geometrically corrected to eliminate perspective distortions, ensuring that all objects are seen orthographically without tilt or displacement [39]. Next, the true orthophoto is segmented using SAM, the deep learning-based image segmentation model [40]. The segmentation is further refined using the cDSM to remove transient and irrelevant segments and is quantized into a finite number of segments with K-means clustering, based on the optical properties of pixels and other available bands (such as NIR). Each final segment obtained represents pixels of surfaces with similar underlying materials.

Within the core of the proposed method, an optimization is performed using the DE algorithm [41]. For each segment, an albedo value is estimated. The objective is to determine albedo values such that when the high-resolution segmented image (with corresponding albedos) is downsampled to low resolution, it closely matches the calculated broadband albedo from satellite imagery. This process effectively performs downscaling, improving with each iteration (generation) of the optimization. Over successive generations, high-performing solutions are retained and further improved, leading to stable convergence toward an optimal set of segment-specific albedo values. This iterative approach not only considers spatial heterogeneity across the scene but also robustly handles variations in reflectance due to diverse surface materials and illumination conditions in the true orthophoto. The detailed description, from the proposed data preprocessing approach to the optimization-based albedo estimation, is provided in the following subsections.

2.1. Data Preprocessing

2.1.1. A: Satellite-Derived Isotropic Broadband Albedo

First, a cloud-free and snow-free satellite image is selected (e.g., from Sentinel-2) for which the isotropic broadband albedo is estimated. It is important that the satellite image and orthophoto are captured under near-concurrent temporal conditions, with a minimal temporal offset (i.e., within a few days’ shift and at a similar time of day). This alignment of illuminated surface topography minimizes discrepancies caused by temporal variations such as shadow shifts, recent road or building construction, vegetation changes, local atmospheric effects, etc. This temporal matching can be effectively achieved due to the frequent re-visits of modern EO satellites like Landsat-7/8/9 and Sentinel-2.

For the selected image, a standard NTB conversion algorithm is applied. For example, the Sentinel-2 Level-2A clear-sky isotropic broadband albedo $α_{i}$ for the i-th pixel can be estimated using the method by Bonafoni and Sekertekin [11]:

$$α_{i} = y_{i}^{2} \times 0.2266 + y_{i}^{3} \times 0.1236 + y_{i}^{4} \times 0.1573 + y_{i}^{8} \times 0.3417 + y_{i}^{11} \times 0.1170 + y_{i}^{12} \times 0.0338,$$

(1)

where $y_{i}^{b}$ denotes the value of the b-th band in the i-th pixel (the superscript is a band index, not an exponent). In the above approach, bands $b_{2}$, $b_{3}$, $b_{4}$, and $b_{8}$ cover the three VIS ranges and the NIR range, while bands $b_{11}$ and $b_{12}$ cover the shortwave infrared (SWIR) range. The weighting coefficients represent the portion of the ground solar irradiance that falls within the spectral range of a given band and were determined and validated by Bonafoni and Sekertekin [11]. This approach takes the Sentinel-2 Level-2A BoA (Bottom of Atmosphere) surface reflectance image as input, where the effect of the solar angle of incidence is minimized [42].
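As a minimal sketch of Equation (1), the weighted band sum can be written as follows. The function name, the dictionary layout, and the sample reflectance values are illustrative assumptions; the sketch also assumes that reflectances are already scaled to [0, 1].

```python
# Coefficients of Equation (1), keyed by Sentinel-2 band number.
NTB_WEIGHTS = {2: 0.2266, 3: 0.1236, 4: 0.1573, 8: 0.3417, 11: 0.1170, 12: 0.0338}


def ntb_albedo(reflectance):
    """Broadband albedo of one pixel from its per-band surface reflectance.

    `reflectance` maps a band number to a BoA reflectance in [0, 1]
    (the scaling is an assumption of this sketch).
    """
    return sum(weight * reflectance[band] for band, weight in NTB_WEIGHTS.items())


# Made-up reflectances for a single pixel, for illustration only.
pixel = {2: 0.08, 3: 0.10, 4: 0.12, 8: 0.30, 11: 0.25, 12: 0.18}
alpha = ntb_albedo(pixel)
print(round(alpha, 6))
```

Since the six coefficients sum to 1, a spectrally flat surface keeps its reflectance as its albedo, which is a quick sanity check for any implementation.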

2.1.2. B: True Orthophoto Segmentation

Before the downscaling is performed with an optimization algorithm, the high-resolution true orthophoto image is segmented into discrete regions that are spatially homogeneous in terms of their material properties. This segmentation is achieved by first applying SAM [40] to pixel values based on their B optical bands (red, green, and blue, and potentially NIR). The SAM output provides flat and independent segmentation masks over a given input image, even when smaller segments are detected on top of a larger segment (e.g., chimneys on rooftops). SAM is also defined as a foundation model, as its deep neural network was trained on a diverse and extensive dataset to provide generalizable and zero-shot segmentation capabilities [40]. Moreover, in comparison to traditional segmentation methods (e.g., watershed), it is capable of detecting segments even in the presence of noise and non-homogeneous feature variations (e.g., color variations in forestry areas). Using SAM, the true orthophoto is partitioned into L segments, i.e., $\{S_{1}, S_{2}, \dots, S_{L}\}$. The j-th orthophoto pixel is assigned to the segment $S_{k}$ if

$$x_{j} \in S_{k} \Leftrightarrow \| x_{j} - \bar{x}_{k} \| < \| x_{j} - \bar{x}_{f} \| \quad \forall f \neq k,$$

(2)

where $x_{j} \in \mathbb{R}^{B}$ is the feature vector of the j-th pixel, while $\bar{x}_{k}$ and $\bar{x}_{f}$ represent the mean of the feature vectors of all pixels assigned to $S_{k}$ or $S_{f}$, respectively.

Next, contextual filtering is applied to further refine the segmentation, targeting small nonstatic objects (e.g., cars) that can negatively impact broadband albedo estimation for a given location. For this purpose, a cDSM raster image D is used. Each pixel in the cDSM contains information on the aboveground height and a basic classification into one of the three categories considered: building, ground, or vegetation. Other more precise object classes are not considered in this paper. A cDSM can be straightforwardly constructed from a classified LiDAR point cloud, where 3D points are classified into the three basic classes using a state-of-the-art semantic segmentation approach, such as PointNet++ [43] or the point transformer model [44]. However, some points may remain unclassified and are treated as noise, representing small objects such as cars. Each segment $S_{k}$ is checked for whether more than $T_{1}$ [%] of its pixels are classified as ground within D and whether its area is smaller than $T_{2}$ m², i.e., the following criterion is applied:

$$\frac{\sum_{j \in S_{k}} [D(j) = \text{ground}]}{|S_{k}|} > T_{1} \quad \text{and} \quad A(S_{k}) < T_{2},$$

(3)

where $[D(j) = \text{ground}]$ equals 1 if the j-th pixel belonging to segment $S_{k}$ is positioned within D and $D(j)$ is classified as ground, and 0 otherwise. Furthermore, the area of the k-th segment is defined by $A(S_{k}) = |S_{k}| \cdot Δ^{2}$ [m²], where $Δ$ [m] is the spatial resolution of the orthophoto pixel. If the given criteria are satisfied, then $S_{k}$ is removed from further processing.
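The removal criterion of Equation (3) can be sketched as a small predicate. The function name, the string class labels, and the threshold values in the example are hypothetical; the threshold $T_{1}$ is passed here as a fraction rather than a percentage.

```python
def is_transient(segment_classes, pixel_size_m, t1, t2_m2):
    """Return True when a segment meets the removal criterion of Equation (3).

    segment_classes: cDSM class label of every pixel in the segment
                     (labels other than "ground" may be "building",
                     "vegetation", or "unclassified").
    pixel_size_m:    orthophoto pixel size (delta) in metres.
    t1:              ground-fraction threshold as a fraction in [0, 1].
    t2_m2:           area threshold T2 in square metres.
    """
    ground_fraction = sum(c == "ground" for c in segment_classes) / len(segment_classes)
    area_m2 = len(segment_classes) * pixel_size_m ** 2
    return ground_fraction > t1 and area_m2 < t2_m2


# Hypothetical segments at 1 m resolution with illustrative thresholds.
car = ["ground"] * 10 + ["unclassified"] * 2   # small, mostly over ground
road = ["ground"] * 500                        # ground, but far too large
roof = ["building"] * 15                       # small, but not ground
print(is_transient(car, 1.0, 0.8, 20.0))
print(is_transient(road, 1.0, 0.8, 20.0))
print(is_transient(roof, 1.0, 0.8, 20.0))
```

Both conditions have to hold at once, so large ground surfaces such as roads and small rooftop details such as chimneys are kept.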

Afterwards, the well-known K-means clustering algorithm [45] is applied to the mean feature vectors $\bar{x}$ of the L segments, reducing them to the final K segments. K-means iteratively partitions the data by assigning each feature vector to the nearest cluster centroid and then recalculating the centroids as the mean of the assigned feature vectors. This process is repeated until convergence, so that segments with similar feature vectors merge into the same segment. Centroids are randomly initialized using the K-Means++ seeding algorithm [46].
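The clustering step can be illustrated with a compact, standard-library-only sketch of K-means with K-Means++ seeding. It is not the implementation used in the paper; the function names, the fixed random seed, and the four sample mean vectors (red, green, blue, NIR) are assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_seeds(points, k, rng):
    """K-Means++ seeding: each further seed is drawn with probability
    proportional to its squared distance to the nearest seed chosen so far."""
    seeds = [rng.choice(points)]
    while len(seeds) < k:
        d2 = [min(sq_dist(p, s) for s in seeds) for p in points]
        seeds.append(rng.choices(points, weights=d2, k=1)[0])
    return seeds


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster segment mean vectors; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = kmeans_pp_seeds(points, k, rng)
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid for every feature vector.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # assignments no longer change: converged
        labels = new_labels
        # Update step: centroid becomes the mean of its assigned vectors.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Mean feature vectors of four hypothetical SAM segments.
segment_means = [
    (0.10, 0.10, 0.10, 0.30),
    (0.12, 0.11, 0.10, 0.32),
    (0.60, 0.55, 0.50, 0.70),
    (0.62, 0.56, 0.52, 0.71),
]
labels, centroids = kmeans(segment_means, k=2)
print(labels)
```

Segments that receive the same label are merged, even when they are spatially separated in the orthophoto.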

SAM often generates many fine-grained segments, which may over-represent the same material (e.g., multiple segments for spatially separated buildings with the same rooftop material). Reducing the number of segments condenses these similar regions into a single representative region, lowering the computational complexity of the proposed method while retaining essential information. It should also be noted that the initial L segments are contiguous, whereas the reduced set of K segments is noncontiguous. A single segment in the reduced set can represent multiple spatially disjoint areas (e.g., separate rooftops in an urban location with similar optical properties due to identical building materials). A suitable K has to be chosen in order to achieve a trade-off between computational efficiency and a high-quality representation of different surfaces with K segments, which will be explored and discussed further in the results section.

The satellite-derived albedo described earlier is generally available at a coarser resolution, represented by pixels $\{α_{i}\}$, where each may include multiple high-resolution segments $S_{k}$ from the true orthophoto of the same location, i.e.,

$$Γ(α_{i}) = \{ S_{k} \mid S_{k} \cap α_{i} \neq \emptyset \},$$

(4)

where $Γ(α_{i})$ is the set of all segments that overlap with the pixel $α_{i}$. The feature vectors of $Γ(α_{i})$ are then defined as

$$\{ (\bar{x}_{k}, n_{k,i}) \mid S_{k} \in Γ(α_{i}),\ n_{k,i} = | S_{k} \cap α_{i} | \},$$

(5)

where $n_{k,i}$ represents the number of pixels in segment $S_{k}$ that lie within the coarse-resolution pixel $α_{i}$. This linkage is crucial for the optimization approach described in the following, as segment-level estimates need to match the observed coarse-resolution albedo. An example of the proposed segmentation and linking of segments is shown in Figure 2. Generally, the contribution of each high-resolution true orthophoto segment to the optical properties of a lower-resolution satellite-based coarse pixel depends on the segment’s material type and is directly proportional to its area of overlap with the coarse pixel.
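The linkage of Equations (4) and (5) amounts to counting, per coarse pixel, how many fine pixels of each segment fall inside it. The sketch below assumes that the fine label raster is perfectly aligned with the coarse grid and that the coarse pixel size is an integer multiple of the fine one; the function name and the `None` marker for filtered pixels are hypothetical.

```python
from collections import Counter, defaultdict


def link_segments(label_map, scale):
    """Overlap counts n_{k,i} between segments and coarse pixels.

    label_map: 2D list of segment ids at high resolution
               (None marks pixels removed by contextual filtering).
    scale:     number of fine pixels along one side of a coarse pixel,
               e.g. 10 for 1 m labels under 10 m satellite pixels.
    Returns {(coarse_row, coarse_col): Counter({segment_id: n_ki})}.
    """
    overlap = defaultdict(Counter)
    for r, row in enumerate(label_map):
        for c, seg in enumerate(row):
            if seg is not None:
                overlap[(r // scale, c // scale)][seg] += 1
    return overlap


# A 4 x 4 label raster covered by 2 x 2 coarse pixels (scale = 2).
labels = [
    [0, 0, 1, 1],
    [0, 2, 1, 1],
    [2, 2, None, 1],
    [2, 2, 1, 1],
]
overlap = link_segments(labels, scale=2)
print(dict(overlap[(0, 0)]))
```

The keys of each `Counter` correspond to the set $Γ(α_{i})$, and the counts to $n_{k,i}$.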

2.2. Optimization-Based Downscaling

Differential Evolution (DE) is a population-based stochastic optimization method introduced by Storn and Price [41]. It is well suited for real-valued, non-linear, and multimodal optimization problems [47]. In this paper, DE is employed with self-adaptivity [48] to estimate optimal albedo values for the K high-resolution segments. The goal is to minimize the discrepancy between downscaled albedo estimates and satellite-derived albedo values. First, the population of candidate solutions is defined as [41]

$$P = \{ q_{1}, q_{2}, \dots, q_{l}, q_{l+1}, \dots, q_{M} \},$$

(6)

where M is the population size. Each candidate solution vector $q_{l} \in \mathbb{R}^{K}$ encodes a complete set of albedo estimates for all high-resolution segments, where $q_{l,k} \in [0, 1]$ is the albedo for the k-th segment. The initial population is generated randomly within [0, 1], ensuring diversity and broad initial coverage of the search space. The optimization then mutates P, while also combining the best solutions iteratively for a maximum of G generations. In the following, the mutation and crossover strategies considered are briefly described, as well as the two newly proposed objective functions.

2.2.1. Mutation and Crossover Strategies

In the mutation phase, DE creates a mutant vector $v_{l}$ for each candidate vector $q_{l}$. The goal is to introduce variability and explore new regions of the solution space. DE supports multiple mutation strategies [47]:

DE/rand/1 [41]: $v_{l} = q_{r_{1}} + F \cdot | q_{r_{2}} - q_{r_{3}} |$ , where $F \in [0, 1]$ is the mutation factor, and $q_{r_{*}}$ are distinct individuals randomly selected from the population. This strategy emphasizes exploration by relying on random solutions.
DE/best/1 [41]: $v_{l} = q_{best} + F \cdot | q_{r_{1}} - q_{r_{2}} |$ , where $q_{best}$ is the best candidate solution observed so far. By directing the search toward the best current solution, this strategy accelerates possible convergence.
DE/rand/2 [41]: $v_{l} = q_{r_{1}} + F \cdot | q_{r_{2}} - q_{r_{3}} | + F \cdot | q_{r_{4}} - q_{r_{5}} |$ , where two differences between four solutions improve diversity.
DE/best/2 [41]: $v_{l} = q_{best} + F \cdot | q_{r_{1}} - q_{r_{2}} | + F \cdot | q_{r_{3}} - q_{r_{4}} |$ ; this approach leverages the current best and multiple difference vectors to refine promising regions of the search space while maintaining moderate diversity.
DE/current-to-best/1: $v_{l} = q_{l} + F \cdot | q_{best} - q_{l} | + F \cdot | q_{r_{1}} - q_{r_{2}} |$ ; this strategy combines $q_{l}$ and $q_{best}$ , offering a balance between exploring new solutions and refining good ones.
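The five strategies can be sketched in one function. Note the assumption: the sketch uses signed difference vectors, as in the standard formulation of Storn and Price, and does not take an element-wise absolute value. The function name, the strategy strings, and the sample population are illustrative, and the population must hold at least six individuals so that five distinct random indices can be drawn.

```python
import random


def mutate(pop, l, best, F, strategy, rng):
    """Build a mutant vector v_l for individual l of population `pop`.

    pop:  list of candidate vectors (lists of floats).
    best: index of the best candidate q_best.
    Signed differences are used (an assumption of this sketch).
    """
    others = [i for i in range(len(pop)) if i != l]
    r = rng.sample(others, 5)  # distinct random indices, all different from l

    def diff(a, b):
        return [x - y for x, y in zip(pop[a], pop[b])]

    if strategy == "rand/1":
        base, diffs = pop[r[0]], [diff(r[1], r[2])]
    elif strategy == "best/1":
        base, diffs = pop[best], [diff(r[0], r[1])]
    elif strategy == "rand/2":
        base, diffs = pop[r[0]], [diff(r[1], r[2]), diff(r[3], r[4])]
    elif strategy == "best/2":
        base, diffs = pop[best], [diff(r[0], r[1]), diff(r[2], r[3])]
    elif strategy == "current-to-best/1":
        base, diffs = pop[l], [diff(best, l), diff(r[0], r[1])]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    # Base vector plus the F-scaled sum of all difference vectors.
    return [b + F * sum(d[k] for d in diffs) for k, b in enumerate(base)]


rng = random.Random(42)
population = [[rng.random() for _ in range(3)] for _ in range(6)]
mutant = mutate(population, l=0, best=2, F=0.5, strategy="rand/1", rng=rng)
print(mutant)
```

A mutant may leave the [0, 1] range; it is brought back by the clamping step applied after crossover.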

The appropriate choice of mutation strategy and mutation factor F depends on the characteristics of the problem. More exploratory strategies (e.g., DE/rand/2) may find global optima more reliably, while more exploitative strategies (e.g., DE/best/1) can converge faster but risk getting stuck in a local minimum. After mutation, DE generates a trial vector $u_{l}$ by crossing the components of $v_{l}$ (the mutant) and $q_{l}$. Crossover strategies influence the balance between the preservation of existing information and the introduction of new variations [47]:

Binomial (uniform) crossover [41]: $u_{l, k} = \begin{cases} v_{l, k} & \text{if } \mathrm{rand}(0, 1) \leq CR \\ q_{l, k} & \text{otherwise} \end{cases}$ Here, $CR \in [0, 1]$ is the crossover rate, and $\mathrm{rand}(0, 1)$ is a uniform random number. This replaces individual components with a probability governed by $CR$ , promoting diversity while retaining some of the structure of the parent solution.
Exponential crossover [41]: $u_{l, k} = \begin{cases} v_{l, k} & \text{if } k = k_{r} \text{ or } \mathrm{rand}(0, 1) \leq CR \\ q_{l, k} & \text{otherwise} \end{cases}$ Starting from a random index $k_{r}$ , exponential crossover replaces consecutive components from the mutant vector, potentially incorporating contiguous values of $v_{l}$ .
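Both crossover variants can be sketched as follows. The binomial version follows the formula above literally; the exponential version uses the common circular formulation, in which a contiguous run of mutant components is copied starting at a random index until a random draw exceeds $CR$. The function names and sample vectors are assumptions.

```python
import random


def binomial_crossover(q, v, CR, rng):
    """Take each component from the mutant v with probability CR, else from q."""
    return [vk if rng.random() <= CR else qk for qk, vk in zip(q, v)]


def exponential_crossover(q, v, CR, rng):
    """Copy a contiguous (circular) run of mutant components into q,
    starting at a random index k_r; at least one component is copied."""
    n = len(q)
    u = list(q)
    k = rng.randrange(n)
    copied = 0
    while True:
        u[k] = v[k]
        copied += 1
        k = (k + 1) % n
        if copied >= n or rng.random() > CR:
            break
    return u


rng = random.Random(1)
target = [0.1, 0.2, 0.3, 0.4]
mutant = [0.9, 0.8, 0.7, 0.6]
print(binomial_crossover(target, mutant, 0.5, rng))
print(exponential_crossover(target, mutant, 0.5, rng))
```

With $CR = 1$ both variants return the mutant unchanged, while with $CR = 0$ the exponential variant still copies exactly one component, which guarantees that the trial differs from its target.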

Finally, the new solution $u_{l}$ is clamped to [0, 1], the valid range for albedo, that is, $u_{l, k} = \max(0, \min(1, u_{l, k}))$. The choice of strategy and DE parameters plays a crucial role. These parameters determine how solutions are combined and influence how much the algorithm searches new areas (exploration) or improves existing solutions (exploitation). Moreover, during each generation, the parameters F and $CR$ evolve using a self-adaptivity approach [48]. Each individual $q_{l}^{g}$ at generation g carries its own $F_{l}^{g}$ and $CR_{l}^{g}$, which are updated for the next generation $g + 1$ as follows [48]:

$$F_{l}^{g + 1} = \begin{cases} 0.1 + \mathrm{rand}(0, 1) \times 0.9 & \text{if } \mathrm{rand}(0, 1) < τ_{1}, \\ F_{l}^{g} & \text{otherwise,} \end{cases}$$

(7)

$$CR_{l}^{g + 1} = \begin{cases} \mathrm{rand}(0, 1) & \text{if } \mathrm{rand}(0, 1) < τ_{2}, \\ CR_{l}^{g} & \text{otherwise,} \end{cases}$$

(8)

where $τ_{1} = τ_{2} = 0.1$ are small probabilities to evolve both parameters.
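A short sketch of Equations (7) and (8) together with the clamping step is given below. The function names are hypothetical, and each `rand(0, 1)` in the equations is treated as an independent uniform draw.

```python
import random

TAU1 = TAU2 = 0.1  # probabilities of resampling F and CR, as given in the text


def adapt_parameters(F, CR, rng):
    """Self-adaptive update of one individual's F and CR (Equations (7), (8))."""
    if rng.random() < TAU1:
        F = 0.1 + rng.random() * 0.9  # new F drawn from [0.1, 1.0)
    if rng.random() < TAU2:
        CR = rng.random()             # new CR drawn from [0, 1)
    return F, CR


def clamp01(u):
    """Clamp every component of a trial vector to the albedo range [0, 1]."""
    return [max(0.0, min(1.0, x)) for x in u]


rng = random.Random(3)
F, CR = 0.5, 0.9
for generation in range(50):
    F, CR = adapt_parameters(F, CR, rng)
print(F, CR)
print(clamp01([-0.2, 0.4, 1.3]))
```

On average, each parameter is resampled once every ten generations, so settings that produced a successful trial tend to persist.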

2.2.2. Proposed Objective Functions

At the end of each generation, DE evaluates the trial solution vector $u_{l}$ (that is, the estimated albedos of the segments) using an objective function that measures how closely the estimated high-resolution albedos, once downsampled, match the lower-resolution satellite-derived albedos over the entire satellite image. For each coarse-resolution satellite-derived albedo pixel $α_{i}$, the following single-criterion objective error function $E_{1}$ is defined, to be minimized over all coarse albedo pixels:

$$\dot{α}_{i} = \frac{\sum_{S_{k} \in Γ(α_{i})} u_{l, k} \cdot n_{k, i}}{\sum_{S_{k} \in Γ(α_{i})} n_{k, i}},$$

(9)

$$E_{1} = \sum_{i} | α_{i} - \dot{α}_{i} |,$$

(10)

where $\dot{α}_{i}$ is the estimated coarse albedo from the segments located within the given pixel. Here, the linkage described earlier between high-resolution segments and each coarse pixel is used.
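Equations (9) and (10) translate directly into an area-weighted average followed by a sum of absolute differences. In this sketch, the dictionary-based data layout, the function name, and the sample values are assumptions.

```python
def objective_e1(u, coarse_albedo, overlap):
    """E1: sum of absolute differences between satellite albedo and the
    area-weighted aggregate of segment albedos (Equations (9) and (10)).

    u:             list of segment albedos, indexed by segment id k.
    coarse_albedo: {coarse_pixel: alpha_i}.
    overlap:       {coarse_pixel: {k: n_ki}} fine-pixel counts per segment.
    """
    error = 0.0
    for i, alpha in coarse_albedo.items():
        counts = overlap[i]
        total = sum(counts.values())
        estimate = sum(u[k] * n for k, n in counts.items()) / total  # Equation (9)
        error += abs(alpha - estimate)                               # Equation (10)
    return error


# Two coarse pixels: "a" is 75% segment 0 and 25% segment 1, "b" is all segment 1.
overlap = {"a": {0: 75, 1: 25}, "b": {1: 100}}
coarse_albedo = {"a": 0.15, "b": 0.35}
trial = [0.1, 0.3]
print(objective_e1(trial, coarse_albedo, overlap))
```

In this example, pixel "a" is matched exactly (0.75 × 0.1 + 0.25 × 0.3 = 0.15), so the whole error of about 0.05 comes from pixel "b".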

The aforementioned approach uses a single coefficient $u_{l, k}$ to directly estimate the albedo for the k-th segment. An alternative objective function $E_{2}$ is also proposed that considers multiple coefficients per segment, estimating the albedo of the segment as a weighted sum of the values of the B bands in the true orthophoto. Introducing band-specific coefficients could potentially improve accuracy and accelerate the convergence of the DE algorithm, by capturing the unique spectral reflectance properties of different materials captured within the true orthophoto, which vary across visible and, if available, near-infrared wavelengths. This approach aligns with satellite-derived NTB albedo estimation, which relies on weighted contributions from the bands (e.g., as defined in Equation (1) for Sentinel-2). Moreover, the averaged values from different bands are already available in the feature vector $\bar{x}_{k}$ for the k-th segment. The DE vectors $q_{l}$, $v_{l}$, $u_{l}$ are then extended from the $\mathbb{R}^{K}$ to the $\mathbb{R}^{K \times (B + 1)}$ space. The following single-criterion objective error function can then be defined:

$$u_{l, k} = \{ u_{l, k, b_{1}}, u_{l, k, b_{2}}, \dots, u_{l, k, b_{B + 1}} \}$$

(11)

$$\dot{α}_{i} = \frac{\sum_{S_{k} \in Γ(α_{i})} \min\left( \sum_{t = 1}^{B} ( u_{l, k, b_{t}} \cdot \bar{x}_{k, b_{t}} \cdot n_{k, i} ) + u_{l, k, b_{B + 1}},\ 1 \right)}{\sum_{S_{k} \in Γ(α_{i})} n_{k, i}}$$

(12)

$$E_{2} = \sum_{i} | α_{i} - \dot{α}_{i} |,$$

(13)

where $u_{l, k, b_{B + 1}} \in [0, 1]$ is an additional bias parameter, since the number of bands in the true orthophoto is generally smaller than the number of bands in a coarse multispectral satellite image.
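The per-segment part of the band-weighted formulation can be sketched as a clamped weighted sum plus bias. The sketch deliberately covers only this per-segment quantity: it assumes that the overlap count $n_{k, i}$ acts as the aggregation weight in the same way as in Equation (9), which is an interpretation and not something the text states. The function name, the coefficient values, and the band means are hypothetical.

```python
def segment_albedo_from_bands(coeffs, mean_bands):
    """Clamped weighted sum of a segment's mean band values plus a bias.

    coeffs:     B band coefficients followed by one bias term (length B + 1).
    mean_bands: the segment's mean band values (length B), scaled to [0, 1].
    The overlap count n_{k,i} is intentionally not applied here (assumption).
    """
    *weights, bias = coeffs
    value = sum(w * x for w, x in zip(weights, mean_bands)) + bias
    return min(value, 1.0)


# Four orthophoto bands (red, green, blue, NIR) and made-up coefficients.
coeffs = [0.2, 0.3, 0.1, 0.3, 0.05]
mean_bands = [0.4, 0.5, 0.3, 0.6]
print(segment_albedo_from_bands(coeffs, mean_bands))
```

With these values, the weighted sum is 0.08 + 0.15 + 0.03 + 0.18 = 0.44, and the bias raises the estimate to 0.49; values above 1 would be clamped to 1.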

If the trial solution vector

u_{l}

yields a lower E than the target vector

q_{l}

, the trial replaces the target within the population in the next generation. This selection mechanism ensures that high-quality solutions propagate, while poor solutions phase out over time. The DE mutation, crossover, selection, and evaluation iterate until a stopping criterion is met, such as reaching a maximum number of generations G or triggering an early stopping if acceptable convergence is reached. If considering

E_{1}

, the final high-resolution albedo for the j-th pixel in ortophoto can be defined as

\begin{matrix} {\hat{α}}_{j} & = & u_{b e s t, k} \end{matrix}

(14)

\begin{matrix} k & = & arg min_{f \neq j} ∥ x_{j} - {\bar{x}}_{f} ∥, \end{matrix}

(15)

with

u_{b e s t}

being the best candidate solution from the DE optimization when minimizing

E_{1}

, and

u_{b e s t, k}

the estimated albedo for k-th segment. On the other hand, when considering

E_{2}

, the equation changes to

{\hat{α}}_{j} = min (\sum_{t = 1}^{B} (u_{b e s t, k, b_{t}} \cdot {\bar{x}}_{k, b_{t}} \cdot n_{k, i}) + u_{b e s t, k, b_{B + 1}}, 1),

(16)

which considers additional per-band coefficients for

u_{b e s t, k}

. By adjusting all the method’s parameters (K, M, F,

CR

and G) and choosing suitable DE strategies, the total error of the estimated high-resolution albedo can be reduced substantially, as will be presented within the next section.
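The per-pixel assignment in Equations (14) and (15) can be illustrated with a short Python sketch. It assumes that the arg min in Equation (15) ranges over the K segment feature vectors and that the Euclidean norm is used; the function names and the sample values are hypothetical.

```python
import math
from typing import List, Sequence


def nearest_segment(pixel: Sequence[float],
                    segment_means: Sequence[Sequence[float]]) -> int:
    """Index of the segment whose mean feature vector is closest to the pixel."""
    return min(range(len(segment_means)),
               key=lambda f: math.dist(pixel, segment_means[f]))


def assign_albedo(pixels: Sequence[Sequence[float]],
                  segment_means: Sequence[Sequence[float]],
                  u_best: Sequence[float]) -> List[float]:
    """Give every high-resolution pixel the albedo of its nearest segment (E_1 case)."""
    return [u_best[nearest_segment(p, segment_means)] for p in pixels]


# Hypothetical example: two segments described by four band means each.
segment_means = [[0.1, 0.1, 0.1, 0.2], [0.8, 0.8, 0.7, 0.9]]
u_best = [0.08, 0.45]  # per-segment albedo from the best DE candidate
pixels = [[0.15, 0.1, 0.1, 0.25], [0.7, 0.75, 0.7, 0.8]]
albedo = assign_albedo(pixels, segment_means, u_best)
```

For the objective function E_2, the same nearest-segment lookup would apply, but the value returned for segment k would be the clamped weighted band sum of Equation (16) rather than a single coefficient.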

3. Results

The results of the experiments are presented in four subsections, which cover the input acquisition and segmentation of the true orthophoto of the considered study area, the sensitivity analysis of DE strategies for downscaling the isotropic broadband albedo together with the spatiospectral analysis of the best optimization configuration, the validation with ground truth data, and the extension to other study areas.

3.1. Data Acquisition and Segmentation

The proposed method was tested in the urban part of Maribor city, Slovenia (46.5596938°N, 15.6408193°E), where the input was a high-resolution aerial orthophoto that spans a 0.75 km² area, with a spatial resolution of 0.1 m per pixel, as shown in Figure 3a,b for the VIS and NIR bands, respectively. The aerial photograph was obtained using the UltraCam Eagle M3 photogrammetric camera [49] and was truly orthorectified by the Geodetic Institute of Slovenia. For the same location, the Sentinel-2 Level-2A BoA [38] input comprised the processed VIS, NIR, and two SWIR bands, which are shown in Figure 3c–f, respectively. The spectral bands considered for downscaling, both from UltraCam and Sentinel-2, are listed in Table 1. The cDSM obtained from the Light Detection And Ranging (LiDAR) point cloud data, which is used for contextual filtering, is shown in Figure 3g. The isotropic broadband albedo estimated from the considered Sentinel-2 bands is shown in Figure 3h, where the albedo ranges from 0.02 to 0.54, which is expected for the materials used in the given urban environment. In order to minimize the effects of shadowing on broadband albedo downscaling using the proposed method, the orthophoto and Sentinel-2 images were acquired at nearly the same date and time: the orthophoto was acquired on 6 April 2022 at 12:15 and the Sentinel-2 cloud-free image on 7 April 2022 at 11:57. The number of pixels for the Sentinel-2 image was

75 \times 100

, while for the true orthophoto, it was 7500 × 10,000.

For the experiments that follow, an AMD Ryzen 7950X CPU with 64 GB memory and an NVIDIA RTX 4090 GPU with 24 GB memory were used. Due to the high resolution of the input orthophoto image, zero-shot SAM inference had to be performed multiple times with overlapping tiles, where a single tile of size

3000 \times 3000

pixels took all of the 24 GB available GPU memory. The result of SAM segmentation on true orthophoto VIS and NIR bands is shown in Figure 4. The input parameters considered for SAM are presented in Table 2. After SAM segmentation was complete, the output was downsampled to a spatial resolution of 1 m per pixel, and contextual filtering using cDSM (i.e., Figure 3g) was applied with

T_{1} = 0.75

and

T_{2} = 2.5

. The K-means was run on filtered segmentation for three different scenarios, namely

K = 10

,

K = 50

, and

K = 100

. The results are shown in Figure 5, where the first row represents the detected K segments distinguished by unique colors, while the second and third rows represent the pixels being equal to the (averaged) feature vectors belonging to the K segments for the VIS and NIR bands, respectively.
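The tiled SAM inference mentioned above can be illustrated by a sketch that enumerates overlapping tile windows over the orthophoto. The overlap of 500 pixels, the function names, and the choice to place the last tile flush with the image border are assumptions made for this example; the paper does not state how the tiles were laid out.

```python
from typing import List, Tuple


def tile_starts(length: int, tile: int, overlap: int) -> List[int]:
    """Start offsets of tiles covering [0, length) with at least `overlap` shared pixels."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than the tile size")
    if tile >= length:
        return [0]
    step = tile - overlap
    starts = list(range(0, length - tile, step))
    starts.append(length - tile)  # last tile ends exactly at the image border
    return starts


def tiles(height: int, width: int, tile: int = 3000,
          overlap: int = 500) -> List[Tuple[int, int, int, int]]:
    """Windows as (row_start, col_start, row_stop, col_stop) in pixel coordinates."""
    return [(r, c, min(r + tile, height), min(c + tile, width))
            for r in tile_starts(height, tile, overlap)
            for c in tile_starts(width, tile, overlap)]


# Orthophoto size reported in the text: 7500 x 10,000 pixels.
windows = tiles(7500, 10000)
```

With these assumed settings the image is covered by 3 × 4 = 12 windows. Segments that fall into the overlap between neighbouring tiles would still have to be merged, which is not shown here.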

3.2. Sensitivity Analysis of DE Strategies and Downscaling Results

Using the proposed method, all the aforementioned DE strategies were considered in a grid search approach, involving different mutation and crossover operators, to downscale lower-resolution albedo from the Sentinel-2 image (i.e., Figure 3h) with a spatial resolution of 10 m per pixel to a high resolution of 1 m. The population size considered was

M = 100

. The results are presented in Figure 6, for the three scenarios of K and both objective functions

E_{1}

and

E_{2}

, where their value is minimized over generations. The number of generations was set to reach convergence, where the change in

E_{1}

or

E_{2}

was less than

δ = 10^{- 5}

between 10 consecutive generations. As expected,

E_{1}

and the smaller K demanded fewer generations because they are less computationally intensive. The best strategies from Figure 6 are summarized in Table 3, where the root mean square error (RMSE) and the mean absolute error (MAE) are also estimated. MAE can be directly estimated by dividing

E_{1}

or

E_{2}

by the number of pixels of the given Sentinel-2-derived isotropic broadband albedo image, i.e.,

75 \times 100

. Furthermore, the mean absolute percentage error (MAPE) is estimated as an extension of MAE, calculated as

\frac{1}{75 \times 100} \sum_{i} \frac{| α_{i} - {\dot{α}}_{i} |}{α_{i}}

. Although

E_{2}

introduces more parameters (band-specific weights and bias) than

E_{1}

, which gives it greater flexibility, in practice the VIS and NIR bands can be strongly correlated for many types of urban materials. Consequently, the DE algorithm converges to high-resolution albedo values similar to those obtained with

E_{1}

, resulting in nearly identical final estimates. However, when considering

E_{2}

, DE must explore a larger parameter space to adjust multiple coefficients per segment; hence, it generally requires more generations to converge.
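The stopping rule and the error measures described in this paragraph can be sketched as follows. The sketch assumes one reading of the convergence criterion, namely that the best objective value changes by less than δ across a window of 10 generations, and it multiplies the MAPE by 100 so that the result is a percentage, as reported in the text. All function names and sample values are hypothetical.

```python
import math
from typing import Sequence


def has_converged(history: Sequence[float], delta: float = 1e-5,
                  window: int = 10) -> bool:
    """True if the best objective value changed by less than `delta` over `window` generations."""
    if len(history) <= window:
        return False
    return abs(history[-window - 1] - history[-1]) < delta


def mae_from_objective(e_value: float, n_pixels: int) -> float:
    """MAE obtained by dividing E_1 or E_2 by the number of coarse pixels."""
    return e_value / n_pixels


def rmse(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """Root mean square error between coarse albedo and aggregated estimate."""
    squared = sum((a - e) ** 2 for a, e in zip(reference, estimate))
    return math.sqrt(squared / len(reference))


def mape(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """Mean absolute percentage error in percent; reference values must be non-zero."""
    ratios = sum(abs(a - e) / a for a, e in zip(reference, estimate))
    return 100.0 * ratios / len(reference)


# Hypothetical two-pixel example.
alpha = [0.2, 0.4]
alpha_dot = [0.25, 0.3]
e_value = sum(abs(a - e) for a, e in zip(alpha, alpha_dot))
mae = mae_from_objective(e_value, len(alpha))
```

For the study area, `n_pixels` would be 75 × 100 = 7500, the size of the Sentinel-2-derived albedo image.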

In more heterogeneous settings, particularly where additional spectral bands are available,

E_{2}

could capture subtle reflectance variations that are less pronounced in the VIS bands. Such scenarios might include regions featuring composite roofing materials. In these contexts, the finer spectral granularity provided by

E_{2}

could outweigh the additional computational cost. On the other hand, for urban areas that are largely composed of only a few principal surface types, the simpler objective function

E_{1}

remains a more time-efficient solution that still provides robust albedo estimates.

For further analysis of the results, the visual change in the estimated high-resolution albedo and the corresponding error per pixel

E_{2}

, through the increased number of DE generations, is shown in Figure 7, when considering the best DE configuration (see Table 4), which converges to an RMSE of 0.0253 and an MAPE of 13.40% after 5000 generations. As can be observed, the lowest error is present over larger segments with highly uniform optical properties such as the river, where the albedo is close to 0, while the highest error is caused by the glazing of a few very reflective rooftop materials. A direct comparison between the best downscaled result and the original Sentinel-2 NTB-based albedo is presented in Figure 8. The interpolated per-pixel absolute differences between the input lower-resolution albedo (Figure 8a) and the downscaled high-resolution albedo (Figure 8b) are highlighted in Figure 8c. The cubic method was used for the interpolation of absolute differences from 10 to 1 m. Most differences occur at the Sentinel-2 subpixel resolution (i.e., below 10 m), where transitions between surfaces with different materials occur. The largest such transitions occur between buildings and roads, or between ground and water. Additionally, as can be observed in the top left corner, a few variations arise in locations with more reflective surfaces, since their anisotropic reflectance was different when the true orthophoto was acquired in comparison to the Sentinel-2 image; hence, the underlying segments can be misrepresented.

Figure 9 illustrates a band-by-band evaluation of the downscaled albedo against the reference Sentinel-2 NTB-derived albedo. Figure 9a presents the contribution of the red VIS band and absolute differences, highlighting the noticeable residuals at the boundaries where vegetation meets built-up urban surfaces. Figure 9b shows the green VIS band, similarly illustrating that the transitions between vegetated and urban materials drive visible discrepancies. Figure 9c focuses on the blue VIS band, where the disparities are the smallest, while Figure 9d covers the NIR band, showing slightly more pronounced deviations, attributable to its higher sensitivity to vegetation canopies and certain roofing materials, as well as subtle changes in plant moisture content that can affect reflectance [50]. The influence of the broad spectral coverage of the Sentinel-2 SWIR channels is captured through the bias term in

E_{2}

(see Equation (12)), which cannot be explicitly mapped to any orthophoto band for spatiospectral analysis. Overall, the absolute differences observed per VIS band remain below 0.09 and in the NIR band below 0.18, confirming the robustness of the proposed downscaling method.

3.3. Validation with Ground Truth Data

Finally, the estimated high-resolution isotropic albedo shown in Figure 7c was compared to the manually annotated ground truth albedo throughout the location considered in Maribor, as shown in Figure 10. The broadband isotropic albedo was manually annotated within the minimum and maximum limits for the characteristic material types of the given area, based on the known values in the literature, listed in Table 5. The albedo values can change over time due to erosion and other environmental effects on materials; therefore, a range is considered instead of a single value. For the given study location, impervious and pervious asphalt and concrete were considered present, with the additional possibility of aging. Aged asphalt generally lightens with an increase in albedo, while aged concrete generally darkens with a decrease in albedo [36]. For trees and shrubs, the albedo of deciduous broadleaf vegetation was considered [51], as well as a wide range of possible metal and clay roofing materials that can induce higher reflectivity [37]. Hence, the given estimation of the j-th pixel error using the objective function

E_{2}

in Figure 10 was carried out as follows:

\begin{matrix} | {\hat{α}}_{j} - a_{j, \min}^{'} |, & if {\hat{α}}_{j} < a_{j, \min}^{'}, \end{matrix}

(17)

\begin{matrix} | {\hat{α}}_{j} - a_{j, \max}^{'} |, & if {\hat{α}}_{j} \geq a_{j, \max}^{'}, \end{matrix}

(18)

where [

a_{j, \min}^{'}

,

a_{j, \max}^{'}

] is the manually defined albedo range for the j-th pixel. It should be noted that this validation was performed using estimated albedo values at the original resolution of 1 m, that is,

{\hat{α}}_{j}

(see Equation (16)).
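The range-based error of Equations (17) and (18) can be illustrated with the following sketch. It assumes that the error is zero whenever the estimate lies inside the annotated range, which the two equations imply but do not state explicitly. The function names and the sample range are hypothetical and are not values from Table 5.

```python
import math
from typing import List, Sequence, Tuple


def range_error(estimate: float, lower: float, upper: float) -> float:
    """Distance of the estimated albedo to the annotated range [lower, upper]."""
    if estimate < lower:
        return lower - estimate
    if estimate > upper:
        return estimate - upper
    return 0.0  # assumption: no error inside the annotated range


def validation_errors(estimates: Sequence[float],
                      ranges: Sequence[Tuple[float, float]]) -> List[float]:
    """Per-pixel errors for all annotated ground truth pixels."""
    return [range_error(a, lo, hi) for a, (lo, hi) in zip(estimates, ranges)]


def mae_and_rmse(errors: Sequence[float]) -> Tuple[float, float]:
    """Mean absolute error and root mean square error of the per-pixel errors."""
    n = len(errors)
    return sum(errors) / n, math.sqrt(sum(e * e for e in errors) / n)


# Hypothetical annotated range of 0.12 to 0.20 for three pixels.
errors = validation_errors([0.10, 0.15, 0.25], [(0.12, 0.20)] * 3)
mae, rmse_value = mae_and_rmse(errors)
```

Using a range instead of a single reference value means that an estimate is only penalized when it falls outside the albedo interval that is plausible for the annotated material.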

The proposed method was applied to the ground truth data, resulting in an overall MAE of

0.0062

, an RMSE of

0.0179

, and an MAPE of 4.0299% when considering all ground truth pixels. Furthermore, a comparison with two super-resolution deep learning models is provided, namely with Enhanced Super-Resolution Generative Adversarial Networks (ESRGANs) [56] and the Blind Super-Resolution Generative Adversarial Network (BSRGAN) [57]. For both super-resolution models, the pretrained 4× upscaling model was considered with default inputs and no additional training. To achieve the 1 m target resolution, the Sentinel-2 NTB-derived albedo image was converted to grayscale, upscaled twice with the 4× model, and then downsampled to 1 m. ESRGAN achieved an overall MAE of

0.0125

, an RMSE of

0.0285

, and an MAPE of 9.2778%, while BSRGAN achieved an MAE of

0.0164

, an RMSE of

0.0341

, and an MAPE of 12.3104%. A visual comparison between the proposed method and the two super-resolution models is shown in Figure 11, where the estimated broadband albedo is shown at 1 m, alongside the absolute differences per pixel based on ground truth data. Numerical estimates of RMSE and MAPE for each material in the ground truth data are provided in Table 6.
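The resolution bookkeeping behind the super-resolution baselines can be checked with a few lines of Python. The sketch only reproduces the arithmetic of applying a 4× model twice to the 75 × 100 albedo image at 10 m and resampling to 1 m; the function name is hypothetical and no super-resolution model is invoked.

```python
from typing import Tuple


def super_resolution_grid(shape: Tuple[int, int], pixel_size_m: float,
                          factor: int = 4, passes: int = 2,
                          target_m: float = 1.0):
    """Image shape and pixel size after repeated upscaling and final resampling."""
    rows, cols = shape
    scale = factor ** passes                      # 4x applied twice gives 16x
    upscaled_shape = (rows * scale, cols * scale)
    upscaled_size_m = pixel_size_m / scale        # 10 m / 16 = 0.625 m
    final_shape = (round(upscaled_shape[0] * upscaled_size_m / target_m),
                   round(upscaled_shape[1] * upscaled_size_m / target_m))
    return upscaled_shape, upscaled_size_m, final_shape


upscaled_shape, upscaled_size_m, final_shape = super_resolution_grid((75, 100), 10.0)
```

Two passes therefore overshoot the target resolution (0.625 m per pixel), which is why a final downsampling step to 1 m is needed.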

The results confirm that the proposed method outperforms both ESRGAN and BSRGAN, particularly in regions dominated by sub-10 m objects with heterogeneous material properties. In contrast, for more homogeneous areas that generally exceed 10 m in spatial extent (e.g., homogeneous asphalt and clay surfaces), all three methods perform comparably. This performance gap in fine-scale heterogeneous regions can be attributed primarily to the reliance of super-resolution models on extensive training datasets in order to generalize well. In practice, such super-resolution models learn to infer high-resolution details by recognizing repeating patterns seen in training data. Consequently, when subpixel mixing of distinct material types occurs, especially those that occur in urban environments, such models may struggle to accurately reconstruct the true spatial or spectral variations, which also contributes to a higher error in the parameter (e.g., albedo) being downscaled. In contrast, the optimization-based method in this work does not require training data or any prior assumptions about the local material mixture. Instead, it is applied directly to the high-resolution true orthophoto, while it enforces estimated albedo consistency with the coarse satellite albedo. All in all, super-resolution models are still useful when no additional high-resolution reference data are available.

3.4. Application to Other Study Areas

To assess generalizability, the proposed method was applied to two other nearby study areas, as shown in Figure 12. The first location (46.492583°N, 15.644297°E) represents a suburban region with forest areas (Figure 12a), while the second location (46.474313°N, 15.683701°E) is a rural region with agricultural fields and an airport runway (Figure 12b). Both locations were acquired during the new LiDAR scan of Slovenia on 15 March 2024 close to noon, where an aerial true orthophoto was also obtained simultaneously. The closest cloud-free Sentinel-2 image available, acquired on 20 March 2024 at 11:07, was used for NTB-based albedo estimation. For SAM segmentation and DE optimization, the same configuration as defined in Table 2 and Table 4 was used. The value of the objective function

E_{2}

converged at

140.43

for the first location and

100.84

for the second. It is not surprising that the downscaled albedo matches the coarser NTB-derived albedo better for the second location, since it contains more large and homogeneous agricultural areas. The absolute differences per pixel, shown in Figure 12, clearly present a lower error for agricultural and forestry areas, while some highly reflective materials on the roofs show a higher error, similar to that in the urban study area.

4. Discussion

The use of SAM significantly improved segmentation quality by effectively capturing even fine-grained urban features with various optical properties. The K-means clustering step further refined these segments, consolidating similarly reflective surfaces into representative segments. Together, these steps helped ensure that the proposed DE-based downscaling objective functions accurately considered the heterogeneity of materials within an urban scene. By segmenting the image directly based on its pixel-level spectral characteristics, the SAM with K-means workflow creates detailed regions without relying on pre-existing land cover maps. SAM delineates boundaries for even the smallest and most visually subtle features, capturing heterogeneous urban surfaces (e.g., tree crowns, asphalt roads, and reflective roofs) at very fine resolution. K-means then clusters these initial segments according to spectral similarity, producing groupings that effectively act as high-accuracy land cover “class labels” but are fully data-driven. Hence, these labels emerge from the image itself rather than from a generalized, potentially coarser external classification. SAM with K-means can reveal variability below the Sentinel-2 pixel size and fine-scale differences in materials, which is particularly critical at 1 m resolution. In this way, the proposed workflow removes the need to rely on, or to refine, a traditional land cover product such as ESA WorldCover [16].

Furthermore, the results indicated that only a small number of final segments are required when the location surfaces are composed of relatively few distinct materials. This is not surprising, as eight dominant surface types (with corresponding near-Lambertian albedo properties) are present in the given urban location (i.e., asphalt, concrete, clay, steel, grass, sand, brick, and water), based on the conducted manual survey. Moreover, the use of a more complex objective function, such as

E_{2}

, provided only marginal improvements in the estimated high-resolution albedo due to the strong correlation between VIS and NIR bands for most common materials in the urban environment. Consequently, a simpler objective function

E_{1}

can be employed in most cases, reducing the required number of generations for convergence.

Unlike machine learning and deep learning methods (e.g., super-resolution networks), which depend on pre-existing high-resolution image datasets for training, the proposed method directly estimates finer-scale albedo at the segment level using DE. This enables application to diverse geographic locations without the need for region-specific trained models.

4.1. Limitations

Despite the high-resolution gains, the proposed downscaling method has limitations. From a computational perspective, scaling an optimization-based approach to large areas at 1 m resolution can be challenging. Although the method eliminates the need for high-resolution training data, optimization itself can be computationally expensive and challenging to parallelize. In addition, precise co-registration of the 1 m true orthophoto pixels with coarse albedo pixels is required; any misalignment can degrade the solution as fine-scale estimates no longer correctly sum to the coarse constraints.

Methodologically, the accuracy of the downscaling is also tied to the reliability of the image segmentation based on SAM and K-means. The approach assumes that each segment represents a homogeneous surface. In practice, over-segmentation or under-segmentation can introduce downscaling errors. If a segment contains mixed land cover types or distinct illumination conditions, the single albedo value assigned cannot represent all sub-pixels correctly. Such segmentation errors propagate to optimization, yielding biased albedo estimates for those regions. Although not explored in this work, the homogeneity assumption can also break down at sub-1 m scales due to natural variability in materials, for example, soil moisture variations, weathering effects on roofs, or heterogeneous vegetation within what appears as one canopy. The results obtained suggest that optimization-based downscaling is most effective when the number of segments aligns with the actual variety of distinct surface materials in the geographic area considered. Despite this, an oversensitivity to reflective surfaces was observed, since these were downscaled from the Sentinel-2 NTB-derived isotropic albedo, which does not take such anisotropy into account. More pronounced errors were also obtained for shadowed segments when validated with ground truth data.

The proposed method also relies on the availability of high-resolution ancillary data (e.g., 1 m true orthophotos and cDSMs) for segmentation and downscaling. Such data may not always be up to date or globally available, restricting the application domain. Inherently, the proposed method’s isotropy assumption ignores the anisotropy of the reflectance of the segments and assumes that the reflectance of each material is invariant with the viewing/illumination geometry, which hinders the method’s applicability to more complex applications than the physical models mentioned in the introduction. Resolving the challenges of shadowing and anisotropy effects at 1 m is beyond the scope of this paper; they are discussed in the following subsection.

4.2. Shadowing and Anisotropy Effects at 1 m

As noted in the motivation of this work, a major challenge is the retrieval of anisotropic (angle-dependent) albedo at 1 m resolution. The proposed downscaling method assumes isotropic reflectance, effectively treating surfaces as Lambertian. This assumption is made because current data do not support resolving the full bidirectional reflectance distribution at such scales. Most medium- and high-resolution multispectral sensors provide only a single near-nadir view of any given location, so there is a lack of multi-angular sampling needed to characterize reflectance anisotropy. In contrast, coarse 500 m albedo products like MODIS MCD43 are derived by combining multiple snapshots over a 16-day period to sample different viewing and illumination angles. At 1 m, no comparable multi-angle imaging time series exists. Furthermore, temporal downscaling is not feasible at this scale because surfaces (and shadows) change rapidly and because high-resolution images are typically not taken from systematically varying angles. Another barrier is the lack of worldwide sub-10 m land cover classification and BRDF information to bridge between 500 m coarse anisotropy models and 1 m fine surface details, similar to the recent methodology proposed by Chen et al. [17]. Moreover, existing worldwide Land Use Land Cover (LULC) maps [58] contain very broad classes that may also be infeasible to consider at 1 m resolution, as, for example, urban areas are denoted using the “built area” class irrespective of surface material properties.

Shadowing effects can significantly complicate anisotropy at 1 m in comparison to the isotropic case. In coarse pixels, the influence of sub-10 m shadows can be averaged out, while at 1 m resolution, a given pixel might be fully illuminated at a given sun angle and fully shadowed at a slightly different angle. Moreover, based on the literature analysis by the authors, no validated reference database exists for the BRDF of surfaces in shadowed or non-shadowed conditions at such a fine scale for a larger geographical area.

Looking ahead, several novel developments could enable anisotropic reflectance mapping at finer spatial resolutions. One solution is UAV-based multi-angular sampling; instrumented drones can collect multispectral images of the same area from many angles and even different sun positions, enabling the construction of a fine-scale local BRDF database. On the satellite side, future missions may explicitly trade swath width or revisit for angular diversity at a higher resolution. Moreover, constellations of small satellites could coordinate observations from multiple angles to start capturing high-resolution anisotropy. Finally, a promising direction lies in leveraging deep learning models that are trained on physics-based simulations of anisotropic reflectance under both shadowy and non-shadowy conditions, provided sufficient 3D surface information is available (e.g., high-resolution 3D models derived from airborne LiDAR or other multispectral data sources with ground truth material albedo and emissivity annotation). By integrating detailed geometry and spectral parameters into the training process, these models could more accurately capture the complex interactions of various illumination conditions on different material compositions.

5. Conclusions

This paper presents a novel method for estimating high-resolution isotropic solar albedo using optimization-based downscaling from satellite-derived broadband albedo to true orthophoto images. The primary novelties of the proposed method include the use of the deep learning-based Segment Anything Model (SAM) and K-means segmentation, combined with contextual filtering, to reduce computational complexity when searching for the global minimum in the solution space of the downscaling-based objective function. The optimization was performed using the single-objective Differential Evolution (DE) optimization algorithm for floating point values, employing various mutation and crossover strategies. In terms of quantitative performance, the proposed method achieved a root mean square error (RMSE) of 0.0179 and a mean absolute percentage error (MAPE) of 4.0299% against ground truth annotations of the isotropic broadband albedo for characteristic materials in the urban study area considered. These results outperform the Enhanced Super-Resolution Generative Adversarial Networks (ESRGANs), which achieved an RMSE of 0.0285 and an MAPE of 9.2778%, as well as the Blind Super-Resolution Generative Adversarial Network (BSRGAN), which achieved an RMSE of 0.0341 and an MAPE of 12.3104%. These findings highlight the effectiveness of the proposed method in accurately estimating the high-resolution isotropic broadband albedo.

Furthermore, the ability of the proposed method to generate high-resolution isotropic solar albedo maps will support a wide range of physical simulations and models, refining the estimations of reflective irradiance, land surface temperature (LST), and urban heat islands (UHIs). Moreover, the proposed method could be modified to downscale other satellite-derived parameters (e.g., vegetation indices, soil moisture, LST, etc.) to a higher resolution by leveraging the demonstrated approaches of segmentation, contextual filtering, and optimization.

The remaining challenge for future work is the downscaling of BRDF-based satellite-derived albedo products to 1 m, which would accurately capture surfaces’ reflectance anisotropy, as well as correction of shadowed areas in high-resolution albedo estimation. In further future research, integrating multi-temporal data could allow for spatiotemporal insights, capturing seasonal variations or rapid changes in urban surfaces. Additionally, expanding the method to incorporate other data sources, such as higher-resolution hyperspectral or thermal infrared imagery, increasingly available from unmanned aerial vehicles (UAVs), may reveal subtle reflectance or emissivity variations pertinent to specific building materials or vegetation canopies. Finally, investigation into more efficient optimization algorithms or parallelization schemes would help manage the increased complexity when extending the method to larger and more diverse geographical areas.

Author Contributions

Conceptualization, N.L. and M.B.; Methodology, N.L. and M.B.; Software, N.L.; Validation, N.L.; Visualization, N.L.; Writing—original draft, N.L.; Writing—review and editing, D.M. and M.B.; Resources, D.M.; Supervision, D.M.; Funding acquisition, N.L. and D.M. All authors have read and agreed to the published version of the manuscript.

Funding

We acknowledge financial support provided by the Slovenian Research and Innovation Agency (Research Project No. J7-50095 as the main funding source, with additional support from Research Programme Core Funding No. P2-0041, and Target Research Programme No. V2-2390).

Data Availability Statement

The input LiDAR and orthophoto data are available from Slovenian Environment Agency at http://gis.arso.gov.si/evode/profile.aspx (accessed on 10 December 2024), and from the Survey and Mapping Authority of the Republic of Slovenia at https://clss.si/ (accessed on 25 March 2024). Sentinel-2 Level-2A data are available via Google Earth Engine at https://developers.google.com/earth-engine/datasets/catalog/sentinel-2 (accessed on 10 December 2024). The annotated albedo ground truth data and downscaled albedo values are available on request.

Acknowledgments

We are thankful to the Slovenian Environment Agency and Surveying and Mapping Authority of the Republic of Slovenia for providing cyclical aerial surveying photographs and LiDAR data. We are thankful to the European Commission for providing open Sentinel-2 Level-2A satellite imagery. We are grateful to the Geodetic Institute of Slovenia for preparing high-resolution true orthophotos from cyclical aerial surveying data.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Zhao, Q.; Yu, L.; Du, Z.; Peng, D.; Hao, P.; Zhang, Y.; Gong, P. An Overview of the Applications of Earth Observation Satellite Data: Impacts and Future Trends. Remote Sens. 2022, 14, 1863. [Google Scholar] [CrossRef]
Roesch, A.; Wild, M.; Pinker, R.; Ohmura, A. Comparison of spectral surface albedos and their impact on the general circulation model simulated surface climate. J. Geophys. Res. Atmos. 2002, 107, ACL 13-1–ACL 13-18. [Google Scholar] [CrossRef]
Small, C. Comparative analysis of urban reflectance and surface temperature. Remote Sens. Environ. 2006, 104, 168–189. [Google Scholar] [CrossRef]
Yuan, S.; Liu, Y.; Liu, Y.; Zhang, K.; Li, Y.; Enwer, R.; Li, Y.; Hu, Q. Spatiotemporal variations of surface albedo in Central Asia and its influencing factors and confirmatory path analysis during the 21st century. Int. J. Appl. Earth Obs. Geoinf. 2024, 134, 104233. [Google Scholar] [CrossRef]
Sailor, D.J.; Resh, K.; Segura, D. Field measurement of albedo for limited extent test surfaces. Sol. Energy 2006, 80, 589–599. [Google Scholar] [CrossRef]
Hofierka, J.; Onačillová, K. Estimating Visible Band Albedo from Aerial Orthophotographs in Urban Areas. Remote Sens. 2021, 14, 164. [Google Scholar] [CrossRef]
Liang, S.; Shuey, C.J.; Russ, A.L.; Fang, H.; Chen, M.; Walthall, C.L.; Daughtry, C.S.; Hunt, R. Narrowband to broadband conversions of land surface albedo: II. Validation. Remote Sens. Environ. 2003, 84, 25–41. [Google Scholar] [CrossRef]
Qu, Y.; Liang, S.; Liu, Q.; He, T.; Liu, S.; Li, X. Mapping Surface Broadband Albedo from Satellite Observations: A Review of Literatures on Algorithms and Products. Remote Sens. 2015, 7, 990–1020. [Google Scholar] [CrossRef]
Li, Z.; Erb, A.; Sun, Q.; Liu, Y.; Shuai, Y.; Wang, Z.; Boucher, P.; Schaaf, C. Preliminary assessment of 20-m surface albedo retrievals from sentinel-2A surface reflectance and MODIS/VIIRS surface anisotropy measures. Remote Sens. Environ. 2018, 217, 352–365. [Google Scholar] [CrossRef]
Shuai, Y.; Masek, J.G.; Gao, F.; Schaaf, C.B.; He, T. An approach for the long-term 30-m land surface snow-free albedo retrieval from historic Landsat surface reflectance and MODIS-based a priori anisotropy knowledge. Remote Sens. Environ. 2014, 152, 467–479. [Google Scholar] [CrossRef]
Bonafoni, S.; Sekertekin, A. Albedo Retrieval From Sentinel-2 by New Narrow-to-Broadband Conversion Coefficients. IEEE Geosci. Remote Sens. Lett. 2020, 17, 1618–1622. [Google Scholar] [CrossRef]
Guerri, G.; Crisci, A.; Messeri, A.; Congedo, L.; Munafò, M.; Morabito, M. Thermal Summer Diurnal Hot-Spot Analysis: The Role of Local Urban Features Layers. Remote Sens. 2021, 13, 538. [Google Scholar] [CrossRef]
Calhoun, Z.D.; Willard, F.; Ge, C.; Rodriguez, C.; Bergin, M.; Carlson, D. Estimating the effects of vegetation and increased albedo on the urban heat island effect with spatial causal inference. Sci. Rep. 2024, 14, 540. [Google Scholar] [CrossRef]
Karimi, A.; Moreno-Rangel, D.; García-Martínez, A. Granular mapping of UHI and heatwave effects: Implications for building performance and urban resilience. Build. Environ. 2025, 273, 112705. [Google Scholar] [CrossRef]
Lin, X.; Wu, S.; Chen, B.; Lin, Z.; Yan, Z.; Chen, X.; Yin, G.; You, D.; Wen, J.; Liu, Q.; et al. Estimating 10-m land surface albedo from Sentinel-2 satellite observations using a direct estimation approach with Google Earth Engine. ISPRS J. Photogramm. Remote Sens. 2022, 194, 1–20. [Google Scholar] [CrossRef]
Zanaga, D.; Van De Kerchove, R.; Daems, D.; De Keersmaecker, W.; Brockmann, C.; Kirches, G.; Wevers, J.; Cartus, O.; Santoro, M.; Fritz, S.; et al. ESA WorldCover 10 m 2021 v200. 2022. Available online: https://zenodo.org/records/7254221 (accessed on 20 December 2024).
Chen, H.; Lin, X.; Sun, Y.; Wen, J.; Wu, X.; You, D.; Cheng, J.; Zhang, Z.; Zhang, Z.; Wu, C.; et al. Performance Assessment of Four Data-Driven Machine Learning Models: A Case to Generate Sentinel-2 Albedo at 10 Meters. Remote Sens. 2023, 15, 2684. [Google Scholar] [CrossRef]
Canisius, F.; Wang, S.; Croft, H.; Leblanc, S.G.; Russell, H.A.J.; Chen, J.; Wang, R. A UAV-Based Sensor System for Measuring Land Surface Albedo: Tested over a Boreal Peatland Ecosystem. Drones 2019, 3, 27. [Google Scholar] [CrossRef]
Xu, X.; Asawa, T.; Kobayashi, H. Narrow-to-Broadband Conversion for Albedo Estimation on Urban Surfaces by UAV-Based Multispectral Camera. Remote Sens. 2020, 12, 2214. [Google Scholar] [CrossRef]
Buster, G.; Rossol, M.; Maclaurin, G.; Xie, Y.; Sengupta, M. A physical downscaling algorithm for the generation of high-resolution spatiotemporal solar irradiance data. Sol. Energy 2021, 216, 508–517. [Google Scholar] [CrossRef]
Karalasingham, S.; Deo, R.C.; Casillas-Perez, D.; Raj, N.; Salcedo-Sanz, S. Downscaling Surface Albedo to Higher Spatial Resolutions With an Image Super-Resolution Approach and PROBA-V Satellite Images. IEEE Access 2023, 11, 5558–5577. [Google Scholar] [CrossRef]
Chen, J.; Sciusco, P.; Ouyang, Z.; Zhang, R.; Henebry, G.M.; John, R.; Roy, D.P. Linear downscaling from MODIS to Landsat: Connecting landscape composition with ecosystem functions. Landsc. Ecol. 2019, 34, 2917–2934. [Google Scholar] [CrossRef]
Zhang, X.; Jiao, Z.; Zhao, C.; Qu, Y.; Liu, Q.; Zhang, H.; Tong, Y.; Wang, C.; Li, S.; Guo, J.; et al. Review of Land Surface Albedo: Variance Characteristics, Climate Effect and Management Strategy. Remote Sens. 2022, 14, 1382. [Google Scholar] [CrossRef]
Wang, N.; Yue, Z.; Liu, Y.; Liu, Y. Machine learning potentials for global multi-timescale diffuse irradiance estimation: Synthesizing ground observations, time-series, and environmental features. Energy 2024, 306, 132535. [Google Scholar] [CrossRef]
Liu, B.Y.H.; Jordan, R.C. The interrelationship and characteristic distribution of direct, diffuse and total solar radiation. Sol. Energy 1960, 4, 1–19. [Google Scholar] [CrossRef]
Szabó, S.; Enyedi, P.; Horváth, M.; Kovács, Z.; Burai, P.; Csoknyai, T.; Szabó, G. Automated registration of potential locations for solar energy production with Light Detection And Ranging (LiDAR) and small format photogrammetry. J. Clean. Prod. 2016, 112, 3820–3829. [Google Scholar] [CrossRef]
Mainzer, K.; Killinger, S.; McKenna, R.; Fichtner, W. Assessment of rooftop photovoltaic potentials at the urban level using publicly available geodata and image recognition techniques. Sol. Energy 2017, 155, 561–573. [Google Scholar] [CrossRef]
Assouline, D.; Mohajeri, N.; Scartezzini, J.L. Large-scale rooftop solar photovoltaic technical potential estimation using Random Forests. Appl. Energy 2018, 217, 189–211. [Google Scholar] [CrossRef]
Boccalatte, A.; Jha, A.; Chanussot, J. Leveraging large-scale aerial data for accurate urban rooftop solar potential estimation via multitask learning. Sol. Energy 2025, 290, 113336. [Google Scholar] [CrossRef]
Stephens, G.L.; O’Brien, D.; Webster, P.J.; Pilewski, P.; Kato, S.; Li, J. The albedo of Earth. Rev. Geophys. 2015, 53, 141–163. [Google Scholar] [CrossRef]
Jakubiec, J.A.; Reinhart, C.F. A method for predicting city-wide electricity gains from photovoltaic panels based on LiDAR and GIS data combined with hourly Daysim simulations. Sol. Energy 2013, 93, 127–143. [Google Scholar] [CrossRef]
Lukač, N.; Špelič, D.; Štumberger, G.; Žalik, B. Optimisation for large-scale photovoltaic arrays’ placement based on Light Detection And Ranging data. Appl. Energy 2020, 263, 114592. [Google Scholar] [CrossRef]
Bizjak, M.; Žalik, B.; Štumberger, G.; Lukač, N. Large-scale estimation of buildings’ thermal load using LiDAR data. Energy Build. 2021, 231, 110626. [Google Scholar] [CrossRef]
Hofierka, J.; Gallay, M.; Onačillová, K.; Hofierka, J. Physically-based land surface temperature modeling in urban areas using a 3-D city model and multispectral satellite data. Urban Clim. 2020, 31, 100566. [Google Scholar] [CrossRef]
Hofierka, J.; Bogľarský, J.; Kolečanský, Š.; Enderova, A. Modeling Diurnal Changes in Land Surface Temperature in Urban Areas under Cloudy Conditions. ISPRS Int. J. Geo-Inf. 2020, 9, 534. [Google Scholar] [CrossRef]
García Mainieri, J.J.; Sen, S.; Roesler, J.; Al-Qadi, I.L. Albedo Change Mechanism of Asphalt Concrete Surfaces. Transp. Res. Rec. 2022, 2676, 763–772. [Google Scholar] [CrossRef]
Di Giuseppe, E.; Sabbatini, S.; Cozzolino, N.; Stipa, P.; D’Orazio, M. Optical properties of traditional clay tiles for ventilated roofs and implication on roof thermal performance. J. Build. Phys. 2019, 42, 484–505. [Google Scholar] [CrossRef]
ESA. Sentinel-2 User Handbook; ESA: Paris, France, 2015. [Google Scholar]
Habib, A.F.; Kim, E.M.; Kim, C.J. New Methodologies for True Orthophoto Generation. Photogramm. Eng. Remote Sens. 2007, 73, 25–36. [Google Scholar] [CrossRef]
Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.Y.; et al. Segment Anything. In Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 2–3 October 2023; pp. 3992–4003. [Google Scholar] [CrossRef]
Storn, R.; Price, K. Differential Evolution—A simple and efficient adaptive scheme for global optimization over continuous spaces. Int. Comput. Sci. Inst. 1995, 11, 341–359. [Google Scholar]
Gascon, F.; Bouzinac, C.; Thépaut, O.; Jung, M.; Francesconi, B.; Louis, J.; Lonjou, V.; Lafrance, B.; Massera, S.; Gaudel-Vacaresse, A.; et al. Copernicus Sentinel-2A Calibration and Products Validation Status. Remote Sens. 2017, 9, 584. [Google Scholar] [CrossRef]
Qi, C.R.; Yi, L.; Su, H.; Guibas, L.J. PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 30. [Google Scholar]
Wu, X.; Jiang, L.; Wang, P.S.; Liu, Z.; Liu, X.; Qiao, Y.; Ouyang, W.; He, T.; Zhao, H. Point Transformer V3: Simpler, Faster, Stronger. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 17–21 June 2024; pp. 4840–4851. [Google Scholar] [CrossRef]
Ahmed, M.; Seraj, R.; Islam, S.M.S. The k-means Algorithm: A Comprehensive Survey and Performance Evaluation. Electronics 2020, 9, 1295. [Google Scholar] [CrossRef]
Arthur, D.; Vassilvitskii, S. k-Means++: The Advantages of Careful Seeding; Stanford University: Stanford, CA, USA, 2006. [Google Scholar]
Bilal; Pant, M.; Zaheer, H.; Garcia-Hernandez, L.; Abraham, A. Differential Evolution: A review of more than two decades of research. Eng. Appl. Artif. Intell. 2020, 90, 103479. [Google Scholar] [CrossRef]
Brest, J.; Greiner, S.; Boskovic, B.; Mernik, M.; Zumer, V. Self-Adapting Control Parameters in Differential Evolution: A Comparative Study on Numerical Benchmark Problems. IEEE Trans. Evol. Comput. 2006, 10, 646–657. [Google Scholar] [CrossRef]
Gruber, M.; Muick, M. UltraCam Eagle Prime Aerial Sensor Calibration and Validation. In Proceedings of the Imaging and Geospatial Technology Forum (IGTF 2016), Fort Worth, TX, USA, 11–15 April 2016. [Google Scholar]
Hunt, E.R., Jr.; Rock, B.N. Detection of changes in leaf water content using Near- and Middle-Infrared reflectances. Remote Sens. Environ. 1989, 30, 43–54. [Google Scholar] [CrossRef]
Hollinger, D.Y.; Ollinger, S.V.; Richardson, A.D.; Meyers, T.P.; Dail, D.B.; Martin, M.E.; Scott, N.A.; Arkebauer, T.J.; Baldocchi, D.D.; Clark, K.L.; et al. Albedo estimates for land surface models and support for a new paradigm based on foliage nitrogen concentration. Glob. Change Biol. 2010, 16, 696–710. [Google Scholar] [CrossRef]
Li, H.; Harvey, J.; Kendall, A. Field measurement of albedo for different land cover materials and effects on thermal performance. Build. Environ. 2013, 59, 536–546. [Google Scholar] [CrossRef]
Katsaros, K.B.; McMurdie, L.A.; Lind, R.J.; DeVault, J.E. Albedo of a water surface, spectral variation, effects of atmospheric transmittance, sun angle and wind speed. J. Geophys. Res. Ocean. 1985, 90, 7313–7321. [Google Scholar] [CrossRef]
Chiodetti, M.; Lindsay, A.; Dupeyrat, P.; Binesti, D.; Lutun, E.; Radouane, K.; Mousel, S. PV bifacial yield simulation with a variable albedo model. In Proceedings of the EU PVSEC 2016, Munich, Germany, 20–24 June 2016. [Google Scholar]
Qin, S.; Li, S.; Yang, K.; Zhang, L.; Cheng, L.; Liu, P.; She, D. A Method for Estimating Surface Albedo and Its Components for Partial Plastic Mulched Croplands. J. Hydrometeorol. 2023, 24, 1069–1086. [Google Scholar] [CrossRef]
Wang, X.; Yu, K.; Wu, S.; Gu, J.; Liu, Y.; Dong, C.; Qiao, Y.; Loy, C.C. ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks. In Proceedings of the Computer Vision—ECCV 2018 Workshops, Munich, Germany, 8–14 September 2018; Leal-Taixé, L., Roth, S., Eds.; Springer International Publishing: Cham, Switzerland, 2019; Volume 11133, pp. 63–79. [Google Scholar] [CrossRef]
Zhang, K.; Liang, J.; Van Gool, L.; Timofte, R. Designing a Practical Degradation Model for Deep Blind Image Super-Resolution. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 11–17 October 2021; pp. 4771–4780. [Google Scholar] [CrossRef]
Zhang, C.; Li, X. Land Use and Land Cover Mapping in the Era of Big Data. Land 2022, 11, 1692. [Google Scholar] [CrossRef]

Figure 1. Illustration of the overall workflow of the proposed method, from data preprocessing to DE-based downscaling, yielding the final high-resolution isotropic broadband albedo. Input data are shown in pale yellow.

Figure 2. Illustration of (a) the true orthophoto, (b) the result of SAM segmentation into $L = 15$ segments, and (c) their merging into $K = 5$ segments after K-means clustering, with overlaid coarse pixels (in red) based on a lower-resolution satellite image, where the contribution of each segment is spatially denoted.

Figure 3. Considered input data for the urban part of Maribor town, Slovenia: (a) true orthophoto in the VIS band and (b) in the NIR band. Sentinel-2 inputs include (c) VIS, (d) NIR, and (e,f) two SWIR bands. Additional inputs consist of (g) a cDSM (where the brightness of pixels is correlated with the aboveground height), and (h) estimated isotropic broadband albedo over Sentinel-2 bands.

Figure 4. Zero-shot segmentation of the true orthophoto using the SAM method by considering the VIS and NIR bands, (a) before applying cDSM-based filtering and (b) after application.

Figure 5. Application of K-means to the SAM segmentation of the high-resolution orthophoto for (a) $K = 10$, (b) $K = 50$, and (c) $K = 100$. Pixels colored by the segments' feature vectors for the VIS and NIR bands are shown in the second and third rows, respectively.

Figure 6. Results of the proposed method when considering different DE strategies for $K = 10$ (a,d), $K = 50$ (b,e), and $K = 100$ (c,f), when minimizing $E_{1}$ (first column) or $E_{2}$ (second column) objective functions.

Figure 7. Visualization of high-resolution estimated isotropic broadband albedo based on DE/current-to-best/1/bin strategy for $K = 100$ (1st row) and per-pixel absolute difference based on $E_{2}$ (2nd row) after $G = 500$ (a), $G = 1000$ (b), and $G = 5000$ (c) generations, respectively.

Figure 8. Estimation of isotropic broadband albedo using (a) Sentinel-2 imagery and (b) the proposed downscaling method on the true orthophoto, with (c) the interpolated per-pixel absolute difference between the results in (a) and (b).

Figure 9. Spatial–spectral analysis of the downscaled albedo for each per-band contribution from the true orthophoto, together with the estimated per-pixel absolute differences, for the (a) R, (b) G, (c) B, and (d) NIR bands.

Figure 10. Validation using manually annotated broadband isotropic albedo values for the characteristic materials within the urban area considered.

Figure 11. Comparison of estimated broadband albedo at 1 m for (a) the proposed method, (b) ESRGAN [56], and (c) BSRGAN [57]. The first row presents the estimated albedo values, while the second row shows the per-pixel absolute differences from ground truth data. For better visual comparison, the absolute differences use the same colorbar scale for all three results.

Figure 12. Application of the proposed method to two characteristically different study areas near Maribor: (a) a suburban region containing forested areas, and (b) a rural region featuring an airport runway.

Table 1. Spectral band information for the satellite images obtained by the Sentinel-2 MultiSpectral Instrument (MSI) sensor [38] and the true orthophoto image obtained by the Ultracam Eagle M3 sensor [49].

Sensor	Band	Spectral Range [μm] [38,49]	Per-Pixel Res. [m]
Sentinel-2 MSI	2 (Blue)	0.451–0.539	10
Sentinel-2 MSI	3 (Green)	0.538–0.585	10
Sentinel-2 MSI	4 (Red)	0.641–0.689	10
Sentinel-2 MSI	8 (NIR)	0.784–0.900	10
Sentinel-2 MSI	11 (SWIR)	1.565–1.655	20
Sentinel-2 MSI	12 (SWIR)	2.100–2.280	20
Ultracam Eagle M3	1 (Blue)	0.400–0.600	0.1
Ultracam Eagle M3	2 (Green)	0.480–0.700	0.1
Ultracam Eagle M3	3 (Red)	0.580–0.720	0.1
Ultracam Eagle M3	4 (NIR)	0.680–1.000	0.1

Table 2. Considered input SAM parameters for segmentation mask generation.

Parameter	Value	Description
pred_iou_thresh	0.85	Threshold on the predicted IoU (mask quality) score.
stability_score_thresh	0.90	Stability score threshold to filter unstable masks.
points_per_batch	64	Number of sampled points per batch for improved precision.
crop_n_layers	1	Number of crop layers for processing smaller segments.
crop_n_points_downscale_factor	2	Factor by which the number of sampled points is scaled down in each crop layer.
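
As a rough illustration of how the values in Table 2 could be passed to SAM's automatic mask generator, the sketch below collects them in a keyword dictionary. It assumes the publicly available `segment_anything` Python package and its `SamAutomaticMaskGenerator` class; where a parameter label in the table differs from the constructor keyword, the mapping is an assumption. The helper name `build_mask_generator` is hypothetical, and the import is deferred so that the configuration can be inspected without the package installed.

```python
# Table 2 values expressed as keyword arguments. The key names are the ones
# accepted by segment_anything.SamAutomaticMaskGenerator; mapping the table's
# labels onto them is an assumption, not something the article states.
SAM_MASK_GENERATOR_KWARGS = {
    "pred_iou_thresh": 0.85,
    "stability_score_thresh": 0.90,
    "points_per_batch": 64,
    "crop_n_layers": 1,
    "crop_n_points_downscale_factor": 2,
}


def build_mask_generator(sam_model):
    """Hypothetical helper: wrap an already loaded SAM model."""
    # Deferred import: only needed when a model is actually available.
    from segment_anything import SamAutomaticMaskGenerator

    return SamAutomaticMaskGenerator(sam_model, **SAM_MASK_GENERATOR_KWARGS)
```

All other generator arguments would keep the package defaults in this sketch.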

Table 3. Summary of the best DE strategies for the different K, $E_{*} = \{E_{1}, E_{2}\}$-considered choices, with RMSE, MAE, and MAPE metrics for a given objective function.

DE Strategy	K	$E_{*}$	G	$E_{1}$ or $E_{2}$	RMSE	MAE	MAPE
DE/best/1/bin	10	$E_{1}$	75	169.83	0.0322	0.0224	21.92%
DE/best/2/bin	50	$E_{1}$	150	139.01	0.0272	0.0185	20.40%
DE/rand/2/bin	100	$E_{1}$	450	131.10	0.0257	0.0174	13.44%
DE/rand/2/bin	10	$E_{2}$	150	168.62	0.0317	0.0226	21.49%
DE/best/1/exp	50	$E_{2}$	1350	138.32	0.0267	0.0184	20.34%
DE/current-to-best/1/bin	100	$E_{2}$	4900	130.86	0.0253	0.0174	13.40%
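
The three error metrics in Table 3 are presumably the standard ones. As a minimal sketch under that assumption, they can be computed as follows; the sample albedo values are hypothetical and are not taken from the experiments.

```python
import math


def rmse(truth, pred):
    """Root mean square error."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(truth, pred)) / len(truth))


def mae(truth, pred):
    """Mean absolute error."""
    return sum(abs(p - t) for t, p in zip(truth, pred)) / len(truth)


def mape(truth, pred):
    """Mean absolute percentage error, in percent (truth must be nonzero)."""
    return 100.0 * sum(abs((p - t) / t) for t, p in zip(truth, pred)) / len(truth)


# Hypothetical reference and estimated albedo values for three pixels.
reference = [0.10, 0.20, 0.25]
estimate = [0.12, 0.18, 0.25]

print(rmse(reference, estimate), mae(reference, estimate), mape(reference, estimate))
```

Because MAPE divides by the reference value, dark surfaces with low albedo contribute large percentage errors even when the absolute error is small, which is worth keeping in mind when comparing MAPE with RMSE.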

Table 4. Considered input DE parameters based on the best result from sensitivity analysis.

Parameter	Value	Description
M	100	Population size.
G	5000	Number of generations.
E	$E_{2}$	Objective function used.
K	100	Number of segments.
$B + 1$	5	Number of coefficients per segment (i.e., one per band plus a bias).
Mutation	Current-to-best/1	Mutation strategy used.
Crossover	Binomial	Crossover strategy used.
$F$ & $CR$	Self-adaptive	Estimated using self-adaptivity (Equations (7) and (8)).
$δ$	$10^{-5}$	Convergence threshold on the change in $E_{2}$ over 10 consecutive generations.
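
To make the configuration in Table 4 more concrete, the following is a minimal pure-Python sketch of DE with current-to-best/1 mutation, binomial crossover, and self-adaptive $F$ and $CR$. Several things are assumptions: the self-adaptation is written in the jDE form of Brest et al. (cited in the reference list), since Equations (7) and (8) are not reproduced here; the sphere function is only a stand-in for $E_{2}$; and the function names, the small population, and the generation count are illustrative. With $K = 100$ and $B + 1 = 5$, the real search vector would presumably have 500 entries.

```python
import random

F_LOWER, F_UPPER = 0.1, 0.9  # jDE bounds used when resampling F (assumed)
TAU_F, TAU_CR = 0.1, 0.1     # probabilities of resampling F and CR (assumed)


def adapt(f, cr, rng):
    """jDE rule: occasionally resample F and CR, otherwise inherit them."""
    if rng.random() < TAU_F:
        f = F_LOWER + rng.random() * F_UPPER
    if rng.random() < TAU_CR:
        cr = rng.random()
    return f, cr


def de_current_to_best(objective, dim, bounds, m=20, g=200, seed=0):
    """Minimize `objective` with DE/current-to-best/1/bin."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(m)]
    ctrl = [(0.5, 0.9)] * m  # per-individual (F, CR)
    cost = [objective(x) for x in pop]
    for _ in range(g):
        best = pop[min(range(m), key=cost.__getitem__)]
        for i in range(m):
            f, cr = adapt(*ctrl[i], rng)
            r1, r2 = rng.sample([j for j in range(m) if j != i], 2)
            j_rand = rng.randrange(dim)  # guarantees one mutated component
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    v = (pop[i][j] + f * (best[j] - pop[i][j])
                         + f * (pop[r1][j] - pop[r2][j]))
                    trial.append(min(max(v, lo), hi))  # clip to bounds
                else:
                    trial.append(pop[i][j])
            trial_cost = objective(trial)
            if trial_cost <= cost[i]:
                # Keep the trial and the control parameters that produced it.
                pop[i], cost[i], ctrl[i] = trial, trial_cost, (f, cr)
    k = min(range(m), key=cost.__getitem__)
    return pop[k], cost[k]


def sphere(x):
    """Toy objective standing in for E_2."""
    return sum(v * v for v in x)


best_x, best_cost = de_current_to_best(sphere, dim=5, bounds=(-1.0, 1.0))
print(best_cost)
```

The sketch runs for a fixed number of generations; a stopping rule based on $δ$, as listed in Table 4, could be added by tracking the best cost over the last 10 generations.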

Table 5. Literature-reported and validated albedo ranges for common urban materials. The values indicate the minimum and maximum albedo observed under varying conditions.

Material	Albedo [min, max]	Source
Asphalt	[0.05, 0.25]	García Mainieri et al. [36]
Clay	[0.10, 0.77]	Di Giuseppe et al. [37]
Metal	[0.07, 0.80]	Di Giuseppe et al. [37]
Trees & shrubs	[0.12, 0.18]	Hollinger et al. [51]
Concrete	[0.17, 0.31]	Li et al. [52]
Water	[0.03, 0.10]	Katsaros et al. [53]
Grass	[0.18, 0.23]	Chiodetti et al. [54]
Soil	[0.15, 0.25]	Qin et al. [55]
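
One simple way to use the ranges in Table 5 is as a plausibility check on estimated albedo values. The sketch below copies the ranges from the table into a dictionary; the function name `within_reported_range` and the sample values are hypothetical.

```python
# Literature-reported [min, max] albedo ranges, copied from Table 5.
ALBEDO_RANGES = {
    "Asphalt": (0.05, 0.25),
    "Clay": (0.10, 0.77),
    "Metal": (0.07, 0.80),
    "Trees & shrubs": (0.12, 0.18),
    "Concrete": (0.17, 0.31),
    "Water": (0.03, 0.10),
    "Grass": (0.18, 0.23),
    "Soil": (0.15, 0.25),
}


def within_reported_range(material, albedo):
    """Return True if `albedo` lies inside the reported range for `material`."""
    low, high = ALBEDO_RANGES[material]
    return low <= albedo <= high


# Hypothetical estimates: one plausible, one outside the reported range.
print(within_reported_range("Water", 0.05))
print(within_reported_range("Asphalt", 0.40))
```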

Table 6. Per-class comparison of RMSE and MAPE metrics between the proposed method, ESRGAN, and BSRGAN.

Material	Proposed RMSE	Proposed MAPE [%]	ESRGAN [56] RMSE	ESRGAN [56] MAPE [%]	BSRGAN [57] RMSE	BSRGAN [57] MAPE [%]
Asphalt	0.0102	1.3072	0.0176	1.9162	0.0223	2.7396
Clay	0.0070	0.7399	0.0093	1.6922	0.0142	4.0377
Metal	0.0095	0.9034	0.0091	1.1306	0.0116	2.3536
Trees & shrubs	0.0210	9.8227	0.0437	20.3588	0.0487	23.9452
Concrete	0.0276	8.3735	0.0350	13.0763	0.0453	18.3799
Water	0.0077	0.6031	0.0365	23.4378	0.0363	28.4478
Grass	0.0286	7.4963	0.0423	15.3704	0.0531	20.7614
Wood	0.0261	9.4317	0.1048	62.3503	0.1090	65.6096
Soil	0.0277	9.5370	0.0271	9.2620	0.0355	13.5878
All	0.0179	4.0299	0.0285	9.2778	0.0341	12.3104

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
