F-Segfomer: A Feature-Selection Approach for Land Resource Management on Unseen Domains
Abstract
1. Introduction
- The selected features should be well suited to the segmentation task and to any auxiliary tasks typically used in established segmentation methods. To achieve this, we use a cross-entropy loss to guide feature extraction toward this objective.
- To avoid extracting redundant features, the feature map should remain as sparse as possible. Furthermore, the learned features should not contain redundant information that could be used to reconstruct the original input image. This property is enforced through the Deep Variational Information Bottleneck (DVIB) [11], which applies a Kullback–Leibler (KL) divergence loss to restrict feature redundancy, ensuring that only essential information is retained (a minimal sketch of this module follows this list).
- F-Segformer Network with Enhanced Generalization: We introduce a novel F-Segformer network that incorporates a Variational Information Bottleneck (VIB) module to improve the generalization capability of SegFormer. The model is trained and validated on data from one domain and tested on a different domain. Our approach not only enhances performance in the same domain but also yields significantly better results when applied to a new domain.
- Online Hard Example Mining (OHEM) for Faster Learning: Although the VIB module can learn a more generalized model, its random sampling increases the training cost. We address this challenge by integrating OHEM into the training process. Conventionally, OHEM prioritizes hard samples, allowing the model to learn more effectively. In our application, we find that the technique enables faster convergence despite the added uncertainty introduced by the VIB module.
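As a concrete illustration of the VIB component, below is a minimal sketch of a bottleneck head attached to a segmentation encoder. It assumes a diagonal Gaussian posterior and the reparameterization trick; the class and variable names (`VIBHead`, `latent_channels`) are illustrative, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class VIBHead(nn.Module):
    """Variational bottleneck over encoder features (a hedged sketch)."""

    def __init__(self, in_channels: int, latent_channels: int):
        super().__init__()
        self.mu = nn.Conv2d(in_channels, latent_channels, kernel_size=1)
        self.logvar = nn.Conv2d(in_channels, latent_channels, kernel_size=1)

    def forward(self, feats: torch.Tensor):
        mu, logvar = self.mu(feats), self.logvar(feats)
        # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        # KL(q(z|x) || N(0, I)) penalizes redundant information in the features
        kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1.0).mean()
        return z, kl
```

During training, the KL term is added to the segmentation cross-entropy with a balancing weight, selected as described in Section 4.2.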
2. Related Works
2.1. AI Methods and Remote Sensing Applications
2.2. Image Segmentation
2.3. Solutions for Avoiding Overfitting
- Dropout is a regularization technique that randomly “drops out”, or deactivates, neurons during each forward pass in the training phase. In each iteration, a subset of neurons is randomly set to zero, preventing them from participating in the forward and backward passes. This strategy forces the network to learn distributed, redundant representations and discourages individual neurons from becoming overly specialized. By promoting more generalized feature learning, dropout helps prevent overfitting and improves performance on unseen data.
- Batch normalization [33] is another essential technique for stabilizing and accelerating deep neural network training by normalizing layer inputs. This method adjusts and scales activations within each layer, reducing sensitivity to variations in data scale and distribution. Batch normalization helps mitigate overfitting by reducing dependency on initial weight values, and it has become a standard component in modern deep neural network architectures.
- Data augmentation expands the diversity and size of the training dataset by creating modified versions of existing samples. Augmentation improves the generalization of the model by exposing it to a wider range of variations during training. Advanced methods such as Mix-up [34] and CutMix [35] are now standard in many state-of-the-art (SoTA) segmentation models, significantly improving robustness by reducing overfitting (a minimal Mix-up sketch follows this list).
- Early stopping is another regularization technique: it monitors the model’s performance on a validation set during training and stops training once that performance ceases to improve. This prevents the model from overfitting by avoiding unnecessary training epochs, ensuring that the model does not learn noise from the training data.
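As an example of how such augmentation plugs into segmentation training, here is a minimal Mix-up sketch. The Beta-distribution parameter `alpha` and the loss combination follow the common Mix-up recipe and are not necessarily the exact setup of [34]:

```python
import torch
import torch.nn.functional as F
from torch.distributions import Beta

def mixup_segmentation_loss(model, images, masks, alpha: float = 0.2):
    """Blend two images pixel-wise and mix their cross-entropy losses."""
    lam = Beta(alpha, alpha).sample().item()        # mixing coefficient in (0, 1)
    perm = torch.randperm(images.size(0))           # pair each image with another
    mixed = lam * images + (1.0 - lam) * images[perm]
    logits = model(mixed)                           # (N, C, H, W) class scores
    loss_a = F.cross_entropy(logits, masks)         # loss against original labels
    loss_b = F.cross_entropy(logits, masks[perm])   # loss against paired labels
    return lam * loss_a + (1.0 - lam) * loss_b
```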
3. Proposed Method
3.1. Network Architecture
3.2. Objective Functions
- K is the dimension of the latent features;
- i is the index of a position;
- k is the index of a channel.
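From these definitions, the feature-selection term takes the standard DVIB form [11]. The following is a plausible reconstruction, assuming a diagonal Gaussian posterior with mean μ and variance σ², and denoting the latent dimension by K (the symbol is our choice, since the original notation did not survive extraction):

```latex
\mathcal{L}_{\mathrm{KL}}
  = \frac{1}{N} \sum_{i} \frac{1}{2} \sum_{k=1}^{K}
    \left( \mu_{i,k}^{2} + \sigma_{i,k}^{2} - \log \sigma_{i,k}^{2} - 1 \right)
```

Here, i runs over the N spatial positions and k over the K latent channels; the total training objective combines this term with the segmentation cross-entropy through a balancing hyperparameter (Section 4.2).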
3.3. Online Hard Example Mining
- is the per-pixel loss of an image; in our work, this loss is the one defined in Equation (7).
- is a pixel in a binary mask M that marks the hardest pixels in an image. This mask is dynamically selected from the pixels with the highest loss values, as sketched below.
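A minimal sketch of this selection step follows; the function name and the `keep_ratio` hyperparameter are our own illustration, not necessarily the paper's exact implementation:

```python
import torch

def ohem_mask(pixel_loss: torch.Tensor, keep_ratio: float = 0.25) -> torch.Tensor:
    """Build the binary mask M that marks the hardest pixels of an image.

    `pixel_loss` is the un-reduced per-pixel loss map of shape (H, W);
    `keep_ratio` is the fraction of pixels kept for the backward pass.
    """
    flat = pixel_loss.flatten()
    k = max(1, int(keep_ratio * flat.numel()))
    threshold = flat.topk(k).values.min()     # loss of the k-th hardest pixel
    return (pixel_loss >= threshold).float()  # M = 1 on the hardest pixels

# Only the masked pixels contribute to the optimized loss:
# loss = (pixel_loss * mask).sum() / mask.sum()
```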
4. Experimental Results
- Section 4.1: This section introduces the datasets used in the experiments. Among them, the LoveDA dataset is a large-scale dataset with a rigorous evaluation system: performance is measured by a public evaluation server, which ensures the reliability of the reported results. In contrast, the Dalat dataset is a custom dataset used to evaluate how well models trained on LoveDA perform in a real-world application.
- Section 4.2: The proposed method has two important objective functions: one for classification and one for feature selection. A hyperparameter is therefore needed to balance the two losses. The main purpose of this experiment is to select an appropriate value for this hyperparameter. The convergence curves of the objective function and the evaluation results during training are also provided to assess the sensitivity of the feature-selection loss.
- Section 4.3: This section compares state-of-the-art segmentation methods in a real-world application context: models are trained on a large-scale dataset and tested on a new region that does not appear during training. In this experiment, LoveDA serves as the large-scale dataset, and the Dalat dataset serves as the data collected in a new region.
- Section 4.4: This section compares our method with unsupervised domain-adaptation (UDA) methods. UDA methods use both target-domain data and specialized techniques to transfer knowledge from the source domain during training; UDA therefore serves as an upper bound for the proposed method. Through this experiment, we aim to demonstrate the generalization of our method under a demanding comparison.
- Section 4.5: This section investigates the two main components of the proposed method: the VIB module and the OHEM sampling process. Ablation studies assess the role of each component during training.
- Section 4.6: This section examines common errors and their causes. Section 4.6.1 analyzes the common errors, and Section 4.6.2 analyzes the features to explain the causes of those errors.
4.1. Dataset
4.1.1. LoveDA Dataset
4.1.2. Dalat Dataset
4.2. Hyperparameter Selection
4.3. Comparison with SoTA
4.4. Comparison with the Upper Bound
4.5. Ablation Study
4.6. Qualitative Results
4.6.1. Result-Based Analysis
4.6.2. Feature-Based Analysis
- Confusion Between Similar Land Types: Agricultural land, forest land, and degraded land are often misclassified as one another.
- Unrecognizable Land Types: Due to domain gaps, certain land types are not recognized correctly and are instead classified as background.
- The original image and the ground-truth annotation are shown in subfigures (a) and (e).
- Feature maps with high activation strengths are shown for both the proposed method and the baseline SegFormer: subfigures (b–d) present maps extracted by our method, and subfigures (f–h) present maps from the original SegFormer. One plausible way to select such maps is sketched below.
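As an illustration of the feature-based analysis, the following sketch selects the channels with the strongest activations for plotting. The function name and the mean-absolute-activation criterion are our own assumptions; the paper does not specify how “activation strength” is computed.

```python
import torch

def top_activation_maps(feature_maps: torch.Tensor, num: int = 3) -> torch.Tensor:
    """Pick the `num` channels with the strongest mean absolute activation.

    `feature_maps` is a (C, H, W) tensor from one encoder/decoder stage;
    `num = 3` mirrors the three maps shown per model in the figure.
    """
    strength = feature_maps.abs().mean(dim=(1, 2))  # per-channel activation strength
    top_idx = strength.topk(num).indices
    return feature_maps[top_idx]                    # (num, H, W) maps to visualize
```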
- Panel labels: (a) Case 1; (b) Case 2; (c) Case 3.
5. Potential Limitations and Future Works
5.1. Potential Limitations
5.2. Potential Future Works
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Full Form |
---|---|
VIB | Variational Information Bottleneck |
PPM | Pyramid Pooling Module |
MiT | Mix Transformer |
UDA | Unsupervised Domain Adaptation |
IoU | Intersection over Union |
SoTA | State-of-the-Art |
OHEM | Online Hard Example Mining |
Technical Terms
Term | Explanation |
---|---|
Source domain | The domain where the model is originally trained; it typically has a large, often labeled, dataset. |
Target domain | The domain where the model is applied; it may have few labeled samples or only unlabeled data. |
Domain gap | The difference in data distribution between the source domain (where a model is trained) and the target domain (where the model is applied). This gap can cause a drop in model performance when moving from the source to the target domain. |
Unseen domain | A testing dataset affected by a domain gap. |
Sparse feature | A feature map in which most values are zero. |
Data augmentation | A technique that artificially increases the size and diversity of a dataset by applying transformations or modifications to existing data. |
FLOPs | Floating-point operations; a measure of a model’s computational complexity that counts the floating-point operations (multiplications, additions, etc.) required for one forward pass. |
Plug-and-play (PnP) | A self-contained module that can be integrated into an existing system without major modifications. |
Best model selection | The best-performing checkpoint is selected according to the validation metric over the course of training. |
Iteration | One update of the model’s parameters using a single batch (a subset) of the data. |
References
1. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Munich, Germany, 5–9 October 2015; pp. 234–241.
2. Pan, H.; Hong, Y.; Sun, W.; Jia, Y. Deep Dual-Resolution Networks for Real-Time and Accurate Semantic Segmentation of Traffic Scenes. IEEE Trans. Intell. Transp. Syst. 2023, 24, 3448–3460.
3. Xu, J.; Xiong, Z.; Bhattacharyya, S. PIDNet: A Real-Time Semantic Segmentation Network Inspired by PID Controllers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023.
4. Zhao, H.; Shi, J.; Qi, X.; Wang, X.; Jia, J. Pyramid Scene Parsing Network. arXiv 2016, arXiv:1612.01105.
5. Xie, E.; Wang, W.; Yu, Z.; Anandkumar, A.; Alvarez, J.; Luo, P. SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers. In Proceedings of the Neural Information Processing Systems (NeurIPS), Virtual Event, 6–14 December 2021.
6. Cheng, B.; Misra, I.; Schwing, A.; Kirillov, A.; Girdhar, R. Masked-Attention Mask Transformer for Universal Image Segmentation. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 1280–1289.
7. Xiao, T.; Liu, Y.; Zhou, B.; Jiang, Y.; Sun, J. Unified Perceptual Parsing for Scene Understanding. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 418–434.
8. Wang, J.; Zheng, Z.; Ma, A.; Lu, X.; Zhong, Y. LoveDA: A Remote Sensing Land-Cover Dataset for Domain Adaptive Semantic Segmentation. arXiv 2022, arXiv:2110.08733.
9. Bui, L.; Vo, V.; Pham, T.; Tong, V.; Nguyen, M. Land Resources Statistics on Satellite Images. In Proceedings of the 2024 7th International Conference on Green Technology and Sustainable Development (GTSD), Ho Chi Minh City, Vietnam, 25–26 July 2024; pp. 247–252.
10. Hung-Nguyen, M. Patch-Level Feature Selection for Thoracic Disease Classification by Chest X-ray Images Using Information Bottleneck. Bioengineering 2024, 11, 316.
11. Alemi, A.; Fischer, I.; Dillon, J.; Murphy, K. Deep Variational Information Bottleneck. In Proceedings of the International Conference on Learning Representations (ICLR), Toulon, France, 24–26 April 2017. Available online: https://openreview.net/forum?id=HyxQzBceg (accessed on 1 July 2024).
12. Chen, M.; Zheng, Z.; Yang, Y.; Chua, T. PiPa: Pixel- and Patch-Wise Self-Supervised Learning for Domain Adaptive Semantic Segmentation. In Proceedings of the ACM Multimedia, Lisbon, Portugal, 23–27 October 2023.
13. Hoyer, L.; Dai, D.; Van Gool, L. Domain Adaptive and Generalizable Network Architectures and Training Strategies for Semantic Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 220–235.
14. Hoyer, L.; Dai, D.; Van Gool, L. HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Tel Aviv, Israel, 23–27 October 2022; pp. 372–391.
15. Hoyer, L.; Dai, D.; Wang, H.; Van Gool, L. MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023.
16. Shrivastava, A.; Gupta, A.; Girshick, R. Training Region-Based Object Detectors with Online Hard Example Mining. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 761–769.
17. Demir, I.; Koperski, K.; Lindenbaum, D.; Pang, G.; Huang, J.; Basu, S.; Hughes, F.; Tuia, D.; Raskar, R. DeepGlobe 2018: A Challenge to Parse the Earth through Satellite Images. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Salt Lake City, UT, USA, 18–22 June 2018; pp. 172–17209.
18. Wei, X.; Rao, L.; Fan, G.; Chen, N. MLFMNet: A Multilevel Feature Mining Network for Semantic Segmentation on Aerial Images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 16165–16179.
19. Yang, Q.; Rao, L.; Fan, G.; Chen, N.; Cheng, S.; Song, X.; Yang, D. WatNet: A High-Precision Water Body Extraction Method in Remote Sensing Images under Complex Backgrounds. J. Appl. Remote Sens. 2024, 18, 11.
20. Yu, Y.; Huang, L.; Lu, W.; Guan, H.; Ma, L.; Jin, S.; Yu, C.; Zhang, Y.; Tang, P.; Liu, Z.; et al. WaterHRNet: A Multibranch Hierarchical Attentive Network for Water Body Extraction with Remote Sensing Images. Int. J. Appl. Earth Obs. Geoinf. 2022, 115, 103103.
21. Vorotyntsev, P.; Gordienko, Y.; Alienin, O.; Rokovyi, O.; Stirenko, S. Satellite Image Segmentation Using Deep Learning for Deforestation Detection. In Proceedings of the 2021 IEEE 3rd Ukraine Conference on Electrical and Computer Engineering (UKRCON), Lviv, Ukraine, 26–28 August 2021; pp. 226–231.
22. Javed, A.; Kim, T.; Lee, C.; Oh, J.; Han, Y. Deep Learning-Based Detection of Urban Forest Cover Change along with Overall Urban Changes Using Very-High-Resolution Satellite Images. Remote Sens. 2023, 15, 4285.
23. John, D.; Zhang, C. An Attention-Based U-Net for Detecting Deforestation within Satellite Sensor Imagery. Int. J. Appl. Earth Obs. Geoinf. 2022, 107, 102685.
24. Ding, Q.; Shao, Z.; Huang, X.; Wang, F.; Wang, M. MLFA-Net: Multi-Level Feature-Aggregated Network for Semantic Change Detection in Remote Sensing Images. Int. J. Digit. Earth 2024, 17, 12.
25. Selvaraj, R.; Nagarajan, S. Chapter 6—Change Detection Techniques for a Remote Sensing Application: An Overview. In Cognitive Systems and Signal Processing in Image Processing; Academic Press: London, UK, 2022; pp. 129–143.
26. Toker, A.; Kondmann, L.; Weber, M.; Eisenberger, M.; Andres, C.; Hu, J.; Hoderlein, A.; Senaras, C.; Davis, T.; Cremers, D.; et al. DynamicEarthNet: Daily Multi-Spectral Satellite Dataset for Semantic Change Segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022.
27. Gerke, M. Use of the Stair Vision Library within the ISPRS 2D Semantic Labeling Benchmark (Vaihingen); University of Twente: Enschede, The Netherlands, 2015.
28. Long, J.; Shelhamer, E.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
29. Yu, F.; Koltun, V. Multi-Scale Context Aggregation by Dilated Convolutions. arXiv 2015, arXiv:1511.07122.
30. Chen, L.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 834–848.
31. Csurka, G.; Volpi, R.; Chidlovskii, B. Unsupervised Domain Adaptation for Semantic Image Segmentation: A Comprehensive Survey. arXiv 2021, arXiv:2112.03241.
32. Kukačka, J.; Golkov, V.; Cremers, D. Regularization for Deep Learning: A Taxonomy. In Proceedings of the 2018 International Conference on Learning Representations (ICLR), Vancouver, BC, Canada, 30 April–3 May 2018.
33. Ioffe, S.; Szegedy, C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. arXiv 2015, arXiv:1502.03167.
34. Arpaci, S.; Varli, S. Semantic Segmentation with the Mixup Data Augmentation Method. In Proceedings of the 2022 30th Signal Processing and Communications Applications Conference (SIU), Safranbolu, Turkey, 18–20 May 2022; pp. 1–4.
35. Fang, F.; Hoang, N.; Xu, Q.; Lim, J. Data Augmentation Using Corner CutMix and an Auxiliary Self-Supervised Loss. In Proceedings of the 2023 IEEE International Conference on Image Processing (ICIP), Kuala Lumpur, Malaysia, 8–11 October 2023; pp. 830–834.
36. Tolstikhin, I.; Houlsby, N.; Kolesnikov, A.; Beyer, L.; Zhai, X.; Unterthiner, T.; Yung, J.; Steiner, A.; Keysers, D.; Uszkoreit, J.; et al. MLP-Mixer: An All-MLP Architecture for Vision. arXiv 2021, arXiv:2105.01601.
37. Xue, Y.; Zhang, L.; Wang, B.; Li, F. Feature Selection Based on the Kullback–Leibler Distance and Its Application on Fault Diagnosis. In Proceedings of the 2019 Seventh International Conference on Advanced Cloud and Big Data (CBD), Suzhou, China, 21–22 September 2019; pp. 246–251.
38. Li, J.; Yu, Z.; Du, Z.; Zhu, L.; Shen, H. A Comprehensive Survey on Source-Free Domain Adaptation. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 5743–5762.
39. CodaLab LoveDA Semantic Segmentation Challenge. Available online: https://codalab.lisn.upsaclay.fr/competitions/424 (accessed on 3 October 2024).
40. Gao, K.; Yu, A.; You, X.; Qiu, C.; Liu, B.; Zhang, F. Cross-Domain Multi-Prototypes with Contradictory Structure Learning for Semi-Supervised Domain Adaptation Segmentation of Remote Sensing Images. Remote Sens. 2023, 15, 3398.
41. Ding, L.; Tang, H.; Bruzzone, L. LANet: Local Attention Embedding to Improve the Semantic Segmentation of Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2021, 59, 426–435.
42. Sun, K.; Xiao, B.; Liu, D.; Wang, J. Deep High-Resolution Representation Learning for Human Pose Estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019.
43. Sun, X.; Wang, P.; Lu, W.; Zhu, Z.; Lu, X.; He, Q.; Li, J.; Rong, X.; Yang, Z.; Chang, H.; et al. RingMo: A Remote Sensing Foundation Model with Masked Image Modeling. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–22.
44. Luo, Y.; Zheng, L.; Guan, T.; Yu, J.; Yang, Y. Taking a Closer Look at Domain Shift: Category-Level Adversaries for Semantics Consistent Domain Adaptation. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 2502–2511.
45. Mei, K.; Zhu, C.; Zou, J.; Zhang, S. Instance Adaptive Self-Training for Unsupervised Domain Adaptation. In Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, UK, 23–28 August 2020.
46. Wu, L.; Lu, M.; Fang, L. Deep Covariance Alignment for Domain Adaptive Remote Sensing Image Segmentation. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–11.
47. Wang, H.; Shen, T.; Zhang, W.; Duan, L.; Mei, T. Classes Matter: A Fine-Grained Adversarial Approach to Cross-Domain Semantic Segmentation. In Proceedings of the Computer Vision—ECCV 2020, Glasgow, UK, 23–28 August 2020; pp. 642–659.
48. Wang, J.; Zhong, Y.; Zheng, Z.; Ma, A.; Zhang, L. RSNet: The Search for Remote Sensing Deep Neural Networks in Recognition Tasks. IEEE Trans. Geosci. Remote Sens. 2021, 59, 2520–2534.
49. Lian, Q.; Duan, L.; Lv, F.; Gong, B. Constructing Self-Motivated Pyramid Curriculums for Cross-Domain Semantic Segmentation: A Non-Adversarial Approach. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6757–6766.
50. Zhao, Q.; Lyu, S.; Zhao, H.; Liu, B.; Chen, L.; Cheng, G. Self-Training Guided Disentangled Adaptation for Cross-Domain Remote Sensing Image Semantic Segmentation. Int. J. Appl. Earth Obs. Geoinf. 2024, 127, 103646.
Stage | Patch Size | Stride | Padding | Channel Number | Number of Blocks |
---|---|---|---|---|---|
Stage 1 | 7 | 4 | 3 | 64 | 3 |
Stage 2 | 3 | 2 | 1 | 128 | 6 |
Stage 3 | 3 | 2 | 1 | 320 | 18 |
Stage 4 | 3 | 2 | 1 | 512 | 24 |
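Read this way, the configuration matches SegFormer's overlapping patch embeddings, where each stage downsamples with a strided convolution. Below is a minimal sketch of the corresponding layers, assuming the third numeric column of the table is the convolution padding (an interpretation consistent with SegFormer's design, since the original header did not survive extraction cleanly):

```python
import torch.nn as nn

# Overlapping patch embeddings per encoder stage; the (patch, stride, padding)
# triples and channel widths follow the table above. Block counts are omitted.
stage_cfg = [
    # (in_ch, out_ch, patch, stride, padding)
    (3,   64,  7, 4, 3),  # Stage 1
    (64,  128, 3, 2, 1),  # Stage 2
    (128, 320, 3, 2, 1),  # Stage 3
    (320, 512, 3, 2, 1),  # Stage 4
]
patch_embeds = nn.ModuleList(
    [nn.Conv2d(c_in, c_out, kernel_size=k, stride=s, padding=p)
     for c_in, c_out, k, s, p in stage_cfg]
)
```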
Class | DDRNet | U-Net | MaskFormer | PIDNet | UPerNet | SegFormer | VIB | VIB+OHEM |
---|---|---|---|---|---|---|---|---|
Background | 14.45 | 1.75 | 15.03 | 6.46 | 11.87 | 6.46 | 6.67 | 6.31 |
Building | 13.70 | 19.73 | 28.39 | 22.49 | 18.00 | 25.42 | 32.07 | 32.49 |
Road | 11.61 | 2.05 | 18.50 | 14.75 | 21.92 | 5.24 | 19.47 | 19.11 |
Water | 52.63 | 45.93 | 36.27 | 56.57 | 51.25 | 19.54 | 58.68 | 60.29 |
Barren | 2.93 | 16.04 | 19.74 | 7.01 | 5.74 | 19.39 | 14.18 | 17.29 |
Forest | 11.63 | 27.42 | 13.59 | 6.27 | 5.74 | 26.64 | 63.71 | 64.83 |
Agricultural | 45.08 | 11.11 | 63.73 | 24.55 | 64.45 | 36.92 | 63.55 | 59.60 |
Mean | 21.72 | 17.72 | 26.61 | 20.95 | 25.57 | 19.94 | 36.90 | 37.13 |
Metric | DDRNet | U-Net | MaskFormer | PIDNet | UPerNet | SegFormer | VIB | VIB+OHEM |
---|---|---|---|---|---|---|---|---|
FLOPs (G) | 71.734 | 815.23 | 293 | 23.733 | 948.12 | 425.02 | 451.12 | 440.26
Params (M) | 20.296 | 28.991 | 63 | 7.718 | 64.044 | 83.672 | 85.261 | 84.126 |
Training time (h) | 19.36 | 30.99 | 38.5 | 19.52 | 25.22 | 40.82 | 43.26 | 42.12 |
Class | LANet | PSPNet | DeepLabv3 | HRNet | MAE+UPerNet | Ours |
---|---|---|---|---|---|---|
Background | 43.99 | 51.59 | 50.21 | 50.25 | 51.09 | 42.86 |
Building | 45.77 | 51.32 | 45.21 | 50.23 | 46.12 | 65.83 |
Road | 49.22 | 53.34 | 46.73 | 53.26 | 50.88 | 61.60 |
Water | 64.96 | 71.07 | 67.06 | 73.20 | 74.93 | 69.78 |
Barren | 29.95 | 24.77 | 29.45 | 28.95 | 33.24 | 33.06 |
Forest | 31.91 | 22.29 | 31.42 | 33.07 | 29.89 | 46.03 |
Agricultural | 24.90 | 32.02 | 31.27 | 23.64 | 37.60 | 43.70 |
Mean | 41.53 | 43.77 | 43.05 | 44.66 | 46.25 | 51.84 |
Method | Background | Building | Road | Water | Barren | Forest | Agricultural | Mean |
---|---|---|---|---|---|---|---|---|
CLAN [44] | 22.93 | 44.78 | 25.99 | 46.81 | 10.54 | 37.21 | 24.45 | 30.39 |
IAST [45] | 29.97 | 49.48 | 28.29 | 64.49 | 2.13 | 33.36 | 61.37 | 38.44 |
FADA [47] | 24.39 | 32.97 | 25.61 | 47.59 | 15.34 | 34.35 | 20.29 | 28.65 |
TransNorm [48] | 19.39 | 36.30 | 22.04 | 36.68 | 14.00 | 40.62 | 3.30 | 24.62 |
PyCDA [49] | 12.36 | 38.11 | 20.45 | 57.16 | 18.32 | 36.71 | 41.90 | 32.14 |
DAFormer [13] | 37.39 | 52.84 | 41.99 | 72.05 | 11.46 | 46.79 | 61.27 | 46.25 |
MIC [15] | 36.20 | 47.84 | 39.23 | 70.05 | 13.27 | 45.52 | 60.74 | 44.89 |
DCA [46] | 36.38 | 55.89 | 40.56 | 62.03 | 22.01 | 38.92 | 60.52 | 45.17 |
DASegNet (DeeplabV3) | 33.79 | 55.95 | 39.69 | 69.28 | 14.19 | 44.79 | 62.16 | 45.69 |
DASegNet (SegFormer) | 36.78 | 59.83 | 43.77 | 73.83 | 19.38 | 49.96 | 67.01 | 50.08 |
VIB | 30.31 | 56.64 | 39.54 | 63.75 | 13.51 | 37.91 | 50.65 | 41.76 |
VIB+OHEM | 31.31 | 54.92 | 43.18 | 63.82 | 22.28 | 38.35 | 53.19 | 43.86 |
SegFormer (dropout = 0.05) | 20.62 | 34.12 | 31.7 | 37.31 | 7.33 | 33.15 | 20.31 | 26.36
SegFormer (dropout = 0.1) | 21.61 | 34.14 | 32.71 | 30.24 | 9.53 | 34.53 | 24.15 | 26.70
SegFormer (dropout = 0.15) | 18.82 | 33.31 | 33.9 | 45.95 | 4.683 | 33.04 | 14.69 | 26.34
Setting | Base | SegFormer+VIB | SegFormer+VIB+OHEM |
---|---|---|---|
LoveDA → Dalat | 90k | 120k | 40k
Urban → Rural | 96k | 110k | 40k