MTCSNet: Mean Teachers Cross-Supervision Network for Semi-Supervised Cloud Detection
Abstract
1. Introduction
- (i) MTCSNet replaces the cross-supervision between two student branches with cross-supervision between teacher and student branches, providing more accurate and stable supervision signals to the student branches. Furthermore, strong data augmentation is applied to the unlabeled images fed to the student branches, which introduces more prior information and enforces consistency between the predictions for the same unlabeled image across different batches. Additionally, the near-infrared band is used instead of the red band as input to assist model training.
- (ii) MTCSNet learns the weights of the supervised and semi-supervised losses from homoscedastic uncertainty to improve model performance. The loss weights are adjusted dynamically at every iteration instead of remaining constant: a loss with higher uncertainty is assigned a smaller weight, which yields a smoother and more effective training process. By avoiding fixed loss weights that hold back model performance, the proposed method benefits more from both the supervised and semi-supervised learning tasks. Details are given in Section 2.3.
- (iii) The effectiveness of the proposed method is verified on two cloud detection datasets. The results show that the proposed method makes better use of unlabeled data and outperforms several state-of-the-art semi-supervised algorithms.
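The teacher update and the uncertainty-based loss weighting in contributions (i) and (ii) can be sketched as follows. This is a minimal NumPy illustration rather than the authors' implementation; the function names, the EMA decay `alpha`, and the log-variance form of the Kendall et al. weighting are our assumptions.

```python
import numpy as np

def ema_update(teacher_params, student_params, alpha=0.99):
    """Mean-teacher update: the teacher is an exponential moving
    average of the student's weights, giving more stable targets."""
    return {name: alpha * teacher_params[name] + (1.0 - alpha) * student_params[name]
            for name in teacher_params}

def weighted_total_loss(loss_sup, loss_semi, log_var_sup, log_var_semi):
    """Homoscedastic-uncertainty weighting in the spirit of Kendall et al.:
    each task loss is scaled by exp(-log_var) and regularised by +log_var,
    so the loss with higher learned uncertainty receives a smaller weight."""
    return (np.exp(-log_var_sup) * loss_sup + log_var_sup
            + np.exp(-log_var_semi) * loss_semi + log_var_semi)
```

In training, `log_var_sup` and `log_var_semi` would be learnable scalars optimised jointly with the network, so the balance between the two losses is re-estimated at every iteration rather than fixed in advance.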
2. The Proposed Methods
2.1. Motivation
2.2. MTCSNet Architecture
2.3. Homoscedastic Uncertainty-Based Loss Weighting (HULW)
2.4. Training Process
3. Results
3.1. Datasets
- SPARCS dataset: The SPARCS dataset is a collection of 80 Landsat-8 satellite images, organized using the WRS2 path/row system and captured between 2013 and 2014. To ensure a diverse representation of different pixel types, 1000 × 1000 pixel sub-images were manually selected from each image; the spatial locations of the sub-images are relatively random. The original annotated images comprise seven categories: “cloud”, “cloud shadow”, “shadow over water”, “snow/ice”, “water”, “land”, and “flooded”. The SPARCS dataset contains 11 bands of imagery data, detailed in Table 1. To efficiently expand the SPARCS dataset and facilitate training, the original images are cropped into 512 × 512 sub-images, with adjacent sub-images overlapping by 12 rows or columns. In addition, random horizontal flips, vertical flips, and random rotation-scaling are used to augment the training data during training. The sub-images are randomly divided into training, validation, and test sets at a ratio of 7:1:2.
- GF1-WHU dataset: The GF1-WHU dataset is a collection of 108 Level-2A scenes obtained from the GaoFen-1 Wide Field of View (WFV) imaging system. The WFV system has a 16 m spatial resolution and captures four multispectral bands across the visible to near-infrared spectral range. The scenes were collected over various global land-cover types and under varying cloud conditions. The associated masks contain two categories, “cloud” and “cloud shadow”. The GF1-WHU dataset contains four bands of data, detailed in Table 2. We first crop the original images into 512 × 512 sub-images without overlap, removing edge sub-images smaller than 512 × 512 and sub-images containing NoValue pixels. We then select the sub-images whose cloud pixel fraction lies between 5% and 70% as a candidate pool, and repeatedly draw a random sub-image from this pool into the result dataset until it contains 1000 sub-images. Finally, the sub-images are randomly divided into training, validation, and test sets at a ratio of 7:1:2.
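The tiling and cloud-fraction filtering described above can be sketched as follows. This is a NumPy illustration under assumptions: the function names are ours, the last window is clamped to the image edge, and the exact crop offsets used by the authors may differ from this simple stride rule.

```python
import numpy as np

def tile_positions(length, tile=512, overlap=12):
    """Top-left offsets of sliding 512x512 windows along one axis.
    The stride is tile - overlap; the final window is clamped to the
    image edge so no pixels are dropped."""
    stride = tile - overlap
    positions = list(range(0, max(length - tile, 0) + 1, stride))
    if length > tile and positions[-1] != length - tile:
        positions.append(length - tile)
    return positions

def keep_tile(mask_tile, lo=0.05, hi=0.70, cloud_value=1):
    """GF1-WHU-style filter: keep a tile only if its cloud pixel
    fraction lies between 5% and 70%."""
    cloud_fraction = np.mean(mask_tile == cloud_value)
    return lo <= cloud_fraction <= hi
```

For a 1000-pixel axis this yields offsets 0 and 488, i.e., two overlapping 512-pixel windows covering the full extent; tiles that are almost cloud-free or almost fully clouded are then discarded by `keep_tile`.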
3.2. Evaluation Metrics and Parameter Setting
3.3. Ablation Experiments
3.4. Comparison Experiments
3.4.1. On the SPARCS Dataset
3.4.2. On the GF1-WHU Dataset
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Table 1. Spectral bands of the Landsat-8 imagery in the SPARCS dataset.
Spectral Band | Wavelength (μm) | Res. (m) |
---|---|---|
Band 1—Ultra Blue | 0.435–0.451 | 30 |
Band 2—Blue | 0.452–0.512 | 30 |
Band 3—Green | 0.533–0.590 | 30 |
Band 4—Red | 0.636–0.673 | 30 |
Band 5—Near-infrared (Nir) | 0.851–0.879 | 30 |
Band 6—Shortwave Infrared 1 | 1.566–1.651 | 30 |
Band 7—Shortwave Infrared 2 | 2.107–2.294 | 30 |
Band 9—Cirrus | 1.363–1.384 | 30 |
Band 10—Thermal Infrared (TIRS) 1 | 10.60–11.19 | 100 |
Band 11—Thermal Infrared (TIRS) 2 | 11.50–12.51 | 100 |
Table 2. Spectral bands of the GaoFen-1 WFV imagery in the GF1-WHU dataset.
Spectral Band | Wavelength (μm) | Res. (m) |
---|---|---|
Band 1—Blue | 0.450–0.520 | 16 |
Band 2—Green | 0.520–0.590 | 16 |
Band 3—Red | 0.630–0.690 | 16 |
Band 4—Near-infrared (NIR) | 0.770–0.890 | 16 |
No. | Methods | NIR | HULW | MTCS | OA | MIoU | F1 | Precision | Recall
---|---|---|---|---|---|---|---|---|---
(1) | Supervised | | | | 94.08 | 80.79 | 81.21 | 86.45 | 76.58
(2) | CPS | | | | 94.77 | 82.34 | 82.81 | 91.87 | 75.38
(3) | MTCSNet | √ | | | 95.47 | 84.49 | 85.19 | 93.71 | 78.09
(4) | MTCSNet | | √ | | 95.58 | 84.84 | 85.56 | * 93.88 | 78.61
(5) | MTCSNet | | | √ | 95.93 | 86.32 | 87.24 | 91.5 | 83.36
(6) | MTCSNet | √ | | √ | 96.12 | 86.89 | 87.83 | 91.98 | 84.04
(7) | MTCSNet | | √ | √ | 96.08 | 86.93 | 87.92 | 90.6 | 85.39
(8) | MTCSNet | √ | √ | √ | * 96.41 | * 87.97 | * 88.97 | 91.35 | * 86.72
Methods | Label Ratio | OA | MIoU | F1 | Precision | Recall
---|---|---|---|---|---|---
DeepLabv3+ | 1/1 | 95.97 | 86.66 | 87.65 | 89.62 | 85.77 |
CPS | 1/8 | 94.95 | 82.77 | 83.26 | * 93.12 | 75.29 |
ours | 1/8 | * 96.41 | * 87.97 | * 88.97 | 91.35 | * 86.72 |
Label Ratio | Methods | OA | MIoU | F1 | Precision | Recall
---|---|---|---|---|---|---
1/8 | DeepLabv3+ | 94.08 | 80.79 | 81.21 | 86.45 | 76.58
1/8 | MeanTeacher | 93.84 | 79.62 | 79.69 | 88.73 | 72.33
1/8 | CCT | 94.67 | 82.42 | 83.02 | 88.83 | 77.92
1/8 | CPS | 94.77 | 82.34 | 82.81 | * 91.87 | 75.38
1/8 | MTCSNet | * 96.41 | * 87.97 | * 88.97 | 91.35 | * 86.72
1/4 | DeepLabv3+ | 94.67 | 82.58 | 83.25 | 87.65 | 79.27
1/4 | MeanTeacher | 94.81 | 82.53 | 83.04 | 91.46 | 76.05
1/4 | CCT | 95.27 | 84.28 | 85.08 | 89.87 | 80.78
1/4 | CPS | 95.50 | 84.84 | 85.63 | 91.91 | 80.15
1/4 | MTCSNet | * 96.51 | * 88.16 | * 89.13 | * 92.75 | * 85.77
1/2 | DeepLabv3+ | 95.20 | 84.52 | 85.47 | 86.47 | 84.49
1/2 | MeanTeacher | 95.64 | 85.36 | 86.21 | 91.51 | 81.49
1/2 | CCT | 95.42 | 85.15 | 86.13 | 87.19 | 85.09
1/2 | CPS | 95.76 | 85.95 | 86.89 | 89.99 | 83.99
1/2 | MTCSNet | * 96.67 | * 88.77 | * 89.77 | * 92.17 | * 87.49
Label Ratio | Methods | OA | MIoU | F1 | Precision | Recall
---|---|---|---|---|---|---
1/8 | DeepLabv3+ | 92.37 | 83.69 | 87.50 | 87.03 | 87.98
1/8 | MeanTeacher | 92.49 | 83.92 | 87.70 | 87.26 | 88.14
1/8 | CCT | 92.72 | 84.44 | 88.19 | 86.90 | * 89.51
1/8 | CPS | 92.93 | 84.67 | 88.25 | 89.13 | 87.38
1/8 | MTCSNet | * 93.19 | * 85.18 | * 88.66 | * 89.80 | 87.55
1/4 | DeepLabv3+ | 92.48 | 83.91 | 87.70 | 87.02 | 88.40
1/4 | MeanTeacher | 92.73 | 84.39 | 88.09 | 87.63 | 88.55
1/4 | CCT | 92.74 | 84.48 | 88.23 | 86.90 | 89.59
1/4 | CPS | 92.98 | 84.92 | 88.57 | 87.54 | * 89.61
1/4 | MTCSNet | * 93.38 | * 85.59 | * 89.04 | * 89.59 | 88.49
1/2 | DeepLabv3+ | 93.04 | 84.90 | 88.46 | 88.99 | 87.93
1/2 | MeanTeacher | 92.84 | 84.46 | 88.03 | 89.37 | 86.74
1/2 | CCT | 93.13 | 85.10 | 88.62 | 89.17 | 88.08
1/2 | CPS | 93.28 | 85.40 | 88.88 | 89.32 | * 88.44
1/2 | MTCSNet | * 93.87 | * 86.49 | * 89.71 | * 91.62 | 87.87
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, Z.; Pan, J.; Zhang, Z.; Wang, M.; Liu, L. MTCSNet: Mean Teachers Cross-Supervision Network for Semi-Supervised Cloud Detection. Remote Sens. 2023, 15, 2040. https://doi.org/10.3390/rs15082040