1. Introduction
Technologies like remote sensing have revolutionized the way we gather information about the Earth’s surface, enabling us to monitor and classify land cover and land use in many ways. The availability of hyperspectral, multispectral, Synthetic Aperture Radar (SAR), and thermal infrared data has opened new possibilities for urban land cover classification, ecological land mapping, and glacier classification. The use of remote sensing data for land cover classification is important for addressing various environmental and urban planning challenges. This introduction presents an overview of research studies that have investigated the fusion of different data sources and decision-level techniques to enhance the accuracy and robustness of land cover classification.
Urban areas are dynamic environments, and monitoring land cover changes in these regions is essential for urban planning and development. Qiu et al. [1] proposed a decision-level fusion approach that leverages multi-seasonal Sentinel-2 imagery with state-of-the-art residual convolutional neural networks (ResNets) for urban land cover classification; by exploiting multiple seasonal datasets, their approach demonstrates superior performance in urban land cover classification. Xu et al. [2] proposed a novel classification approach based on a multi-structure joint decision-making strategy and pretrained convolutional neural networks (CNNs), combining three different CNNs to classify land use. The study illustrates how this approach maximizes the potential of pretrained CNN structures and improves classification accuracy, especially for hard-to-distinguish land use types. Chen et al. [3] combined Landsat 8 and Sentinel-1 data using decision-level fusion (DLF); their study demonstrates that DLF enhances crop classification accuracy, showcasing the utility of data fusion in agricultural applications. Chen et al. [4] explored the complementarity of fully polarimetric SAR and optical imaging. Their approach leverages polarimetric decomposition methods and object-based decision tree classification, achieving improved accuracy by merging data from both sources. Land use classification can also benefit from supervised cross-fusion methods. Rangzan et al. [5] presented a supervised cross-fusion method combining pixel-based and supervised feature-based fusion of thermal, radar, and optical data. Their approach significantly improved classification accuracy compared to state-of-the-art fusion methods, demonstrating the effectiveness of combining multiple data sources. Machine learning classifiers have also proven effective in the hierarchical mapping of glacier surfaces. Alifu et al. [6] utilized machine learning classifiers, including k-nearest neighbors, support vector machines, gradient boosting, decision trees, random forests, and multi-layer perceptrons, to classify debris-covered glaciers. Their approach demonstrated high classification accuracy, particularly when combining multiple data sources, making it suitable for the precise delineation of debris-covered glaciers in various regions.
The fusion of hyperspectral and LiDAR data provides valuable insights for surface material classification. Jia et al. [7] introduced a multiple-feature-based superpixel-level decision fusion (MFSuDF) method, combining kernel principal component analysis (KPCA) for dimension reduction with Gabor features for discriminative information. The study reports successful land classification by combining information from different modules, resulting in impressive classification accuracy. Fusing data from the Sentinel-1 and Sentinel-2 satellites is pivotal for urban land cover mapping. Bui and Mucsi [8] compared two fusion methods, layer-stacking and Dempster–Shafer (D-S) theory-based approaches, at the decision level. Their results showed that D-S theory-based fusion provided the best mapping accuracy, highlighting the significance of decision-level fusion in enhancing urban land cover mapping. SAR imagery, with its polarization bands, presents unique challenges in crop discrimination. Maggiolo et al. [9] introduced a decision fusion approach for optical–SAR image classification integrated with Markov Random Fields (MRFs). Their method optimizes classification by integrating spatial-contextual information, making it suitable for large-scale applications such as global climate change monitoring. Zhu et al. [10] proposed a SAR image fusion classification method based on the decision-level combination of multi-band information. Their approach incorporates D-S evidence theory and convolutional neural networks, achieving improved classification accuracy for multi-band SAR images. Tabib Mahmoudi et al. [11] introduced a feature-level fusion approach that leverages both SAR texture images and Landsat 8 data. Their method improves the overall classification accuracy and Kappa coefficient, showcasing the potential of feature-level fusion in urban land cover classification.
The primary objective of this study is to combine fully polarimetric SAR data with thermal infrared images in order to examine whether a quality bit transmitted along with the decisions to the fusion center would significantly improve the classification accuracy. To achieve this, the first step involves registering the acquired images, allowing for proper alignment and calibration. By precisely aligning the images, we ensure a stable spatial reference for subsequent analysis and classification. After registration, our efforts focus on optimal pixel-level feature extraction to capture the distinctive characteristics of the land cover types. This feature extraction stage involves an in-depth analysis of the data, considering the unique information provided by each sensor. We then develop a correlated-decision method with a quality bit to exploit the complementary nature of these modalities, enhancing the discriminative power of the classification process.
In the subsequent sections, we delve deeper into our study, beginning with Section 2, where we outline the study area and materials utilized. In Section 3, we elaborate on the preprocessing techniques employed for PolSAR data. Section 4 is dedicated to the feature extraction methods applied to the PolSAR data, while Section 5 covers the feature extraction techniques used for the Landsat-8 thermal infrared imagery. Our classification methodology is explained in Section 6, followed by a presentation of the decision fusion technique in Section 7. Finally, in Section 8, we present the conclusions drawn from the findings of this study.
2. Study Area and Materials
Lviv, a city in western Ukraine located at 49°51′ N latitude and 24°01′ E longitude, was chosen as the study area. The study area consists of four main types of land cover: urban, vegetation, water, and bare land. The location of the study area is depicted in Figure 1.
In our study, we used the two thermal infrared bands, with 100 m spatial resolution, of Landsat 8, which belong to the Landsat-8 OLI/TIRS-L1TP precision terrain product. This product includes radiometric and geometric corrections and was acquired on 30 March 2014, at 9:14 a.m. Landsat-8 has a 185 km swath width and, owing to its near-polar orbit, revisits each location once every 16 days. LST values were calculated using the RED and NIR bands and bands 10 and 11 of the Landsat 8 data.
We also used data from the ALOS satellite, with absolute orbit 20,165 and its respective near- and far-range incidence angles. The ALOS PALSAR P1.1 Single Look Complex (SLC) product was acquired on 5 November 2009, with an L-band center frequency, PLR beam mode, and 30 m spatial resolution. The VV, VH, HV, and HH polarizations were used in our study in order to exploit as much information as possible. The Landsat 8 and ALOS PALSAR images were freely downloaded from the European Space Agency (ESA) (http://earth.esa.int, accessed on 1 September 2022) and the Alaska Satellite Facility data search (https://search.asf.alaska.edu, accessed on 1 September 2022), respectively.
3. Preprocessing—PolSAR
SLC PolSAR data represent raw observations (Figure 2a) that require careful preprocessing to unlock their valuable information. Employing the Sentinel Application Platform (SNAP), we engage in a methodical progression encompassing radiometric calibration [12], Pauli decomposition, and geometric Doppler terrain correction [13].
Radiometric calibration serves the pivotal purpose of translating raw digital numbers into physically meaningful units. This procedure corrects a SAR image so that the pixel values truly represent the radar backscatter of the reflecting surface, although the geometric distortions visible in Figure 2b remain at this stage. Pauli decomposition is then used to transform the complex polarimetric matrices into three distinct Pauli components (one of these components is shown in Figure 2c). This step enables a visually intuitive representation of the polarimetric information, facilitating the interpretation of the scattering mechanisms within the radar data. Finally, geometric Doppler terrain correction rectifies the geometric distortions attributable to variable topography. Leveraging a Digital Elevation Model (DEM) [14], this correction compensates for undulating terrain, aligning the radar reflections with their true geographic coordinates. The output is a georeferenced dataset (Figure 2d), pivotal for spatially accurate analysis and scientific interpretation.
4. Feature Extraction—PolSAR Data
The basic idea of the Pauli decomposition is to express the scattering matrix $S$ as the sum of elementary scattering matrices representing certain types of deterministic scattering mechanisms [15,16,17]. If we consider the conventional orthogonal linear $(\hat{h}, \hat{v})$ basis, the Pauli basis $\{S_a, S_b, S_c\}$ is given by the following three 2 × 2 matrices:

$$S_a = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad S_b = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \quad S_c = \frac{1}{\sqrt{2}}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$

Consequently, given a measured scattering matrix $S$, it can be expressed as follows:

$$S = \begin{bmatrix} S_{hh} & S_{hv} \\ S_{vh} & S_{vv} \end{bmatrix} = \alpha S_a + \beta S_b + \gamma S_c$$

where, assuming reciprocity ($S_{hv} = S_{vh}$),

$$\alpha = \frac{S_{hh} + S_{vv}}{\sqrt{2}}, \quad \beta = \frac{S_{hh} - S_{vv}}{\sqrt{2}}, \quad \gamma = \sqrt{2}\, S_{hv}$$

The matrix $S_a$ is the scattering matrix of a sphere, a plate, or a trihedral. In this way, the intensity of the coefficient $\alpha$, i.e., $|\alpha|^2$, determines the power scattered by targets characterized by single- or odd-bounce scattering. The second matrix, $S_b$, is the scattering mechanism of a dihedral oriented at 0 degrees; consequently, $|\beta|^2$ represents the power scattered by this type of target. Finally, the third matrix, $S_c$, is the scattering mechanism of a diplane oriented at 45 degrees, i.e., the coefficient $\gamma$ refers to those scatterers which are able to return the orthogonal polarization, one of the best examples of which is volume scattering. This correspondence is summarized in Table 1.
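As an illustration, the Pauli coefficients above can be computed per pixel directly from the complex scattering-matrix elements. The following is a minimal NumPy sketch, assuming complex-valued arrays (or scalars) for the HH, HV, and VV channels:

```python
import numpy as np

def pauli_components(s_hh, s_hv, s_vv):
    """Compute the Pauli scattering coefficients (alpha, beta, gamma)
    from the complex scattering-matrix elements, assuming reciprocity
    (S_hv == S_vh)."""
    alpha = (s_hh + s_vv) / np.sqrt(2)   # odd-bounce (sphere/trihedral)
    beta = (s_hh - s_vv) / np.sqrt(2)    # double-bounce (dihedral at 0 deg)
    gamma = np.sqrt(2) * s_hv            # 45-deg diplane / volume scattering
    return alpha, beta, gamma

# Example on a single pixel: a pure trihedral response (S_hh == S_vv, S_hv == 0),
# for which all backscattered power ends up in the odd-bounce coefficient alpha
a, b, g = pauli_components(np.complex64(1.0), np.complex64(0.0), np.complex64(1.0))
```

The same function applies unchanged to full 2-D channel arrays, since the arithmetic broadcasts element-wise.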
Pauli color-coded representation
The polarimetric information of the scattering matrix can be represented by the combination of the intensities ($|S_{hh}|^2$, $|S_{hv}|^2$, $|S_{vv}|^2$) in a single RGB image. However, the main drawback is the physical interpretation of the resulting image in terms of these raw channel intensities. Consequently, an RGB image can instead be formed with the intensities $|\alpha|^2$, $|\beta|^2$, and $|\gamma|^2$, which correspond to clear physical scattering mechanisms, as shown in Table 1. The most commonly employed codification is the following: red $= |\beta|^2$ (double-bounce), green $= |\gamma|^2$ (volume), and blue $= |\alpha|^2$ (odd-bounce).
Incorporating the theoretical foundation elucidated above, we employed the Pauli scattering components retrieved from SNAP software (SNAP v9.0.0), denoted as α, β, and γ, representing the intensities of the scattering coefficients. These values were expressed in decibels. Given that negative decibel values are incompatible with color representation, a normalization procedure was implemented for each component: their histograms were rescaled so that the values fall within the range of 0 to 255. The resulting image is shown in Figure 3.
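The normalization step can be sketched as follows; this is a minimal min-max rescaling example, assuming each Pauli channel is an array of intensities in dB, rather than a definitive reproduction of SNAP's histogram manipulation:

```python
import numpy as np

def rescale_to_byte(channel_db):
    """Linearly rescale a dB-valued channel to the 0-255 range so that
    it can be used as an 8-bit color band in an RGB composite."""
    lo, hi = channel_db.min(), channel_db.max()
    scaled = (channel_db - lo) / (hi - lo) * 255.0
    return scaled.astype(np.uint8)

band = np.array([[-25.0, -10.0], [-5.0, 0.0]])  # toy dB values
rgb_band = rescale_to_byte(band)  # -25 dB maps to 0, 0 dB maps to 255
```

In practice, clipping the histogram at low and high percentiles before rescaling gives a more contrast-robust image, but the linear form above matches the 0-to-255 scaling described in the text.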
5. Feature Extraction—Landsat-8 Thermal Infrared
Streamlining the identification of specific regions [19] is facilitated by leveraging previously recorded and registered surface temperature data across diverse land cover types. This characterization enables the determination of the most likely land cover class and an evaluation of its practical significance. For the estimation of the land surface temperature (LST), we utilized raw data from single-date Landsat-8 remote sensing imagery [20,21,22], specifically incorporating bands 10 and 11, which are significantly influenced by stray light, for specific procedures. Various methods are available for estimating and calculating the LST [23,24,25], including the Split-Window (SW), Dual-Angle (DA), and Single-Channel (SC) algorithms. In this research, we opted for the Split-Window approach, integrating thermal band 10 [26] and Normalized Difference Vegetation Index (NDVI) data obtained for the study area.
The heat map of Figure 4 resulted from the equation for calculating the LST, which is articulated as follows:

$$LST = \frac{BT}{1 + \left( \lambda \, BT / \rho \right) \ln \varepsilon}$$

where $BT$ denotes the brightness temperature, $\lambda$ represents the wavelength of the band, $\varepsilon$ signifies the land surface emissivity, and $\rho$ is equivalent to 1438. Brightness temperature refers to the temperature of a blackbody corresponding to the radiance detected by a sensor. According to NASA (2012) [27], it is the temperature measured by the satellite at the moment the image was captured and does not directly mirror the actual temperature of the bare land. Instead, it mirrors the temperature at the satellite location [28,29]. To transform the thermal infrared sensor (TIRS) band data from spectral radiance to brightness temperature, the thermal constants provided in the metadata file are utilized. The equation employed for this conversion is the brightness temperature equation (10):

$$BT = \frac{K_2}{\ln\left( K_1 / L_\lambda + 1 \right)} - 273.15 \qquad (10)$$
where $K_1$ and $K_2$ denote the band-specific thermal conversion constants, used here for band 10 only. Landsat-8 offers the basic constants, including the thermal constants and rescaling factors, for LST estimation, all of which are available in the metadata file of the Landsat satellite images. $L_\lambda$ represents the Top of Atmosphere (TOA) spectral radiance. To determine the brightness temperature (BT), the TOA spectral radiance is first obtained from Equation (11):

$$L_\lambda = M_L \, Q_{cal} + A_L \qquad (11)$$

The $M_L$ factor, denoted by the metadata variable (radiance_mult_band_10), signifies the band-specific multiplicative rescaling factor, while the $A_L$ factor, represented by (radiance_add_band_10), signifies the band-specific additive rescaling factor for the band 10 image; $Q_{cal}$ is the quantized and calibrated digital number of the band 10 pixel.
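Equations (10) and (11) can be combined in a few lines. The sketch below uses band-10 constants typical of a Landsat-8 metadata (MTL) file; in an actual workflow these values must be read from the scene's own metadata rather than hard-coded:

```python
import math

# Band-10 constants as typically found in a Landsat-8 MTL metadata file;
# the real values must be read from the scene's own metadata
K1, K2 = 774.8853, 1321.0789   # thermal conversion constants
M_L, A_L = 3.342e-4, 0.1       # multiplicative / additive rescaling factors

def toa_radiance(q_cal):
    """Eq. (11): quantized digital number -> TOA spectral radiance."""
    return M_L * q_cal + A_L

def brightness_temperature(radiance):
    """Eq. (10): TOA radiance -> at-sensor brightness temperature (Celsius)."""
    return K2 / math.log(K1 / radiance + 1.0) - 273.15

# Mid-range digital number -> a plausible spring-time brightness temperature
bt = brightness_temperature(toa_radiance(q_cal=25000.0))
```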
The Normalized Difference Vegetation Index (NDVI), derived from satellite data, is intricately connected to drought conditions. The assessment of green density on a patch of land involves observing the distinct colors (wavelengths) of visible and near-infrared sunlight reflected by the plants, with band 4 and band 5 (the red and near-infrared bands, respectively) utilized for calculating the NDVI. Estimating the NDVI is crucial due to its correlation with vegetation abundance, providing insight into the general vegetation conditions. Subsequent to the NDVI calculation, the proportion of vegetation (PV), which is closely linked with the NDVI, needs to be assessed, and the emissivity (ε) calculation becomes pivotal, as the emissivity is related to the PV.
To calculate the blackbody radiation resulting from the bare land surface temperature, the Earth’s surface emissivity is used. Several approaches exist for estimating the emissivity at the Earth’s surface, Equation (13) being one of them: it incorporates the NDVI, taking into account the proportion of vegetation (PV), to determine the Earth’s surface temperature in Celsius. The equation for calculating the emissivity (ε) of the Earth’s surface is presented in Equation (13):

$$\varepsilon = \varepsilon_v \, PV + \varepsilon_s \left( 1 - PV \right) + C \qquad (13)$$

where $\varepsilon$ is the land surface emissivity, $\varepsilon_s$ is the soil emissivity, $\varepsilon_v$ is the vegetation emissivity, $PV$ is the proportion of vegetation, and $C$ is a surface roughness term taken as a constant value of 0.009.
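To make the processing chain concrete, the sketch below strings the steps NDVI → PV → ε → LST together for a single pixel. It is a minimal illustration working in Kelvin for physical consistency; the band-10 wavelength and the soil and vegetation emissivities (0.966 and 0.973) are assumed nominal values standing in for whatever the study actually used:

```python
import math

# Nominal constants -- illustrative values, not necessarily those of the study
WAVELENGTH = 10.895e-6   # Landsat-8 band-10 effective wavelength (m)
RHO = 1.438e-2           # the "1438" constant of the LST equation, in m*K
EPS_SOIL, EPS_VEG = 0.966, 0.973   # assumed soil / vegetation emissivities
C_ROUGHNESS = 0.009      # surface roughness constant from Eq. (13)

def ndvi(red, nir):
    """Standard NDVI from red (band 4) and near-infrared (band 5) reflectance."""
    return (nir - red) / (nir + red)

def proportion_of_vegetation(ndvi_val, ndvi_min, ndvi_max):
    """PV as the squared, min-max-normalized NDVI of the scene."""
    return ((ndvi_val - ndvi_min) / (ndvi_max - ndvi_min)) ** 2

def emissivity(pv):
    """Eq. (13): vegetation/soil emissivity mix plus a roughness term."""
    return EPS_VEG * pv + EPS_SOIL * (1.0 - pv) + C_ROUGHNESS

def lst_celsius(bt_kelvin, eps):
    """LST = BT / (1 + (lambda*BT/rho) * ln(eps)), converted to Celsius."""
    lst_k = bt_kelvin / (1.0 + (WAVELENGTH * bt_kelvin / RHO) * math.log(eps))
    return lst_k - 273.15

n = ndvi(red=0.12, nir=0.45)                  # a fairly vegetated pixel
pv = proportion_of_vegetation(n, -0.2, 0.8)   # assumed scene NDVI extremes
temp_c = lst_celsius(bt_kelvin=300.0, eps=emissivity(pv))
```

Because the emissivity is slightly below one, the recovered LST comes out marginally above the at-sensor brightness temperature, as expected.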
7. Decision Fusion
As previously mentioned, we utilized the aforementioned windows as training data to establish value ranges for categorizing random pixels into the water, urban, forest, and bare land categories. Our approach involved employing majority voting to determine the predominant class within a region (Table 4).
To evaluate our method, we randomly tested 576 pixels, with 144 pixels sampled from each area. The results revealed 396 correct classifications, translating to an overall accuracy of 68.8%. Specifically, our method achieved 94% accuracy in classifying urban areas, 75% in water, and 56% in forest, and encountered its highest error rate, 50%, in accurately classifying bare land areas.
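The per-pixel majority vote over the feature-wise decisions can be sketched as follows; this minimal example assumes each feature (e.g., the Pauli components and the LST) casts one class label per pixel:

```python
from collections import Counter

CLASSES = ("water", "urban", "forest", "bare_land")

def majority_vote(decisions):
    """Fuse the per-feature class decisions for one pixel by majority
    voting; ties are broken in favor of the first label encountered."""
    counts = Counter(decisions)
    return counts.most_common(1)[0][0]

# Three features vote "urban" and one votes "bare_land",
# so the fused decision for this pixel is "urban"
fused = majority_vote(["urban", "urban", "bare_land", "urban"])
```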
Although the results achieved through majority voting were satisfactory, we hypothesized that incorporating a quality bit could enhance the decision-making process. This addition not only helped clarify cases where the decision was uncertain, such as instances where an equal number of votes were received for multiple land cover types, but also served to reinforce clear decisions. After implementing the quality bit, our assumptions were validated. We observed a substantial improvement in overall accuracy, with an increase of 10.2%. Notably, accuracy improved in all 4 land cover types: a 1% increase for urban, a 2% improvement for water, a 19% enhancement for forest, and a 19% boost for bare land, addressing significant weaknesses in our land cover classification. Taking these results into consideration, we created a colormap of the study area with the classified pixels (Figure 7).
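One simple way to realize the quality bit, sketched below under the assumption that each local decision carries a one-bit confidence flag, is to let high-confidence votes count double, so that they both break ties and reinforce clear majorities:

```python
from collections import Counter

def quality_vote(decisions):
    """Fuse (class, quality_bit) decisions for one pixel.  A decision
    flagged as high quality (bit == 1) contributes two votes, while a
    low-quality decision (bit == 0) contributes a single vote."""
    counts = Counter()
    for label, quality_bit in decisions:
        counts[label] += 2 if quality_bit else 1
    return counts.most_common(1)[0][0]

# A one-vote-each tie between forest and bare land is resolved in favor
# of the high-confidence forest decision
fused = quality_vote([("forest", 1), ("bare_land", 0)])
```

This weighting scheme is only one possible interpretation of the quality bit; the key property it illustrates is that confident local decisions dominate uncertain ones at the fusion center.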
Discussion on Fusion Results
Going through this process of classification and decision fusion, in addition to the knowledge we gained, we also identified aspects that we could investigate in the future to reach a more satisfactory accuracy. The decomposition used in a future publication should achieve better separation of the scattering coefficients. Additionally, discovering and implementing more discriminative per-pixel features will help us better distinguish areas of mixed land cover, which will contribute substantially to our goal.
Looking ahead, the field presents challenges and opportunities. Addressing the open challenges in this research domain, such as refining the integration of quality information for decision fusion, remains a key focus. The dynamic nature of land cover, environmental changes, and the ever-increasing volume of data pose ongoing challenges that require the continuous adaptation of our methodologies.
On the other hand, these challenges also bring opportunities for innovation. Advances in machine learning algorithms, sensor technologies, and computational capabilities open doors to more sophisticated and accurate classification methods. Exploring synergies with emerging technologies such as artificial intelligence could unlock new possibilities for enhanced land cover analysis.
In summary, the future prospect for this research domain involves navigating challenges while seizing opportunities for advancements. Continuous exploration, adaptation, and the integration of cutting-edge technologies will play pivotal roles in shaping the trajectory of our research and contributing to a more comprehensive understanding of land cover dynamics.
8. Conclusions
In this research, we proposed a novel approach for land cover classification by integrating fully polarimetric Synthetic Aperture Radar (SAR) and thermal infrared data. Our methodology involves a pixel-level correlated decision fusion, which enhances the accuracy and robustness of land cover classification. In reference [31], a multitude of decomposition methods that have been used to extract the biophysical scattering behavior of SAR data are analyzed and presented. In this work, however, we utilized the Pauli decomposition components and the land surface temperature (LST) as features to extract local decisions for each pixel, considering the unique information provided by each sensor.
Our study area, located in Lviv, Ukraine, consists of four main land cover types: urban, vegetation, water, and bare land. We employed data from Landsat 8 and ALOS satellites, combining thermal infrared and fully polarimetric SAR data to achieve a better understanding of land cover characteristics. The preprocessing steps involved radiometric calibration, Pauli’s decomposition, and geometric Doppler terrain correction for SAR data. Feature extraction included the calculation of scattering coefficients from fully polarimetric SAR data and the estimation of land surface temperature (LST) from Landsat thermal infrared data.
To assess the correlation, we analyzed truth tables such as Table 3 and their covariance matrices. We selected four random pixels from each land cover type and formed two pairs: one comprising two successful classifications and another with one successful and one failed classification. Truth tables, such as those illustrated in Table 5, were constructed to represent the pixel decisions for each pair; the tables contain values of 0 or 1 corresponding to the specific features. We then used these truth tables to calculate the covariance matrices for each pair, revealing the relationships between the pixel decisions. The results, presented in the final covariance matrix table (Table 6), give the covariance values between pairs of pixel decisions for water, urban, forest, and bare land. The positive and negative values in the matrices indicate the strength and direction [32] of the correlation between the selected pixel classifications. We observed that most decisions are strongly correlated, regardless of whether we compare two successful classifications or one successful and one failed classification.
This extensive approach provides a comprehensive insight into the interdependence of pixel decisions across various land cover types, offering valuable information for understanding the performance and reliability of the classification model.
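The covariance computation on a pair of binary decision vectors can be sketched as follows; the 0/1 vectors here are hypothetical stand-ins for the truth-table rows of two pixels:

```python
import numpy as np

# Hypothetical truth-table rows: each entry is one feature's 0/1 decision
pixel_a = np.array([1, 1, 0, 1])  # e.g., a successfully classified pixel
pixel_b = np.array([1, 0, 0, 1])  # e.g., a second pixel of the same class

# 2x2 covariance matrix between the two decision vectors; the sign of the
# off-diagonal entry gives the direction of the correlation between the
# two pixels' decisions, and its magnitude gives the strength
cov = np.cov(pixel_a, pixel_b)
off_diag = cov[0, 1]  # positive -> the decisions tend to agree
```

Note that `np.cov` uses the unbiased (ddof = 1) normalization by default, which is the conventional choice for sample covariance.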
Furthermore, the classification and decision fusion process utilized a window-based training dataset for each land cover type. The highest accuracy was achieved in urban areas (94%), followed by water (75%), forest (56%), and bare land (50%). Decision fusion was achieved through majority voting, and the method demonstrated an overall accuracy of 68.8%. In conclusion, our proposed methodology of correlated decision fusion accompanied by a quality bit proved effective, enhancing the accuracy of the land cover classification by 10.2%. The integration of fully polarimetric SAR and thermal infrared data provides complementary information, and the pixel-level fusion approach ensures a comprehensive understanding of the diverse land cover types in the study area.