1. Introduction
In the field of atmospheric and meteorological systems, cloud research has become essential. Clouds have a significant impact on the global atmospheric radiation balance, as they reflect short-wave radiation and absorb and emit long-wave radiation, which in turn affects the Earth’s temperature [1,2]. Accurate and automated cloud classification is used in many climatic, hydrological, and atmospheric applications. The diversity of cloud types is reflected in the structural patterns observed in clouds, and cloud types have a profound influence on the radiative effects of the Earth’s surface–atmosphere system [3,4]. Furthermore, because polar-orbiting satellites have a narrow observation range and lengthy gaps between observations of the same region, radiation scanners on geosynchronous meteorological satellites have become widely used meteorological observation tools. Their ability to provide all-weather, continuous, large-scale observation is crucial for the continuous monitoring of cloud changes [5]. Therefore, developing a fast, precise, and autonomous cloud classification approach based on satellite data is essential [6].
According to the 2017 updated guidelines for cloud observation from the Earth’s surface published by the World Meteorological Organization (WMO) [7], clouds can be grouped into three main categories (low, middle, and high clouds) and into ten genera: cirrus (Ci), cirrocumulus (Cc), cirrostratus (Cs), altocumulus (Ac), altostratus (As), nimbostratus (Ns), stratocumulus (Sc), stratus (St), cumulus (Cu), and cumulonimbus (Cb). However, the accuracy of such classification depends on the observers’ expertise, as they must rely on their professional knowledge to identify the correct cloud types from macrostructure characteristics. Because manual classification is subjective, the classification results can be affected. Research on cloud identification and categorization using satellite data has improved alongside advancements in satellite observation technologies [8,9]. The International Satellite Cloud Climatology Project (ISCCP) has developed a standard for cloud classification based on satellite and ground data, dividing cloud types into nine categories according to cloud top pressure and cloud optical thickness [10,11]. Powerful satellites such as CloudSat and the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) can characterize three-dimensional (latitude, longitude, and altitude) cloud structures [12,13]. The cloud profiling radar (CPR) on CloudSat and the Cloud–Aerosol Lidar with Orthogonal Polarization (CALIOP) on CALIPSO can detect cloud particles at various altitudes, providing the vertical overlap structure of cloud layers [14,15]. CloudSat’s and CALIPSO’s cloud products have been continuously updated for over a decade and are frequently used as benchmarks for validation and comparison due to their widely recognized high quality [16,17]. However, the limited coverage of polar-orbiting satellites over a given area and the long intervals between their observations make it challenging to observe daily cloud variability. Remote sensing investigations benefit from the superior temporal resolution and wider coverage of geostationary meteorological satellites. Bankert et al. analyzed and compared explicit and implicit physical algorithms for cloud type classification using GOES (Geostationary Operational Environmental Satellite) data [18]. The Japan Aerospace Exploration Agency (JAXA) created a cloud classification system based on the ISCCP method for the Himawari-8 geosynchronous satellite, which is highly consistent with MODIS data [19,20]. The National Satellite Meteorological Center of the China Meteorological Administration divides the FY-4A satellite’s cloud classification scheme into six categories: water, cold water, mixed, ice, cirrus, and overlap. However, this cloud type (CLT) product contains fewer cloud types, and an effective classification algorithm based on the ISCCP cloud classification scheme is therefore urgently required.
Currently, there exist four fundamental cloud categorization techniques: threshold approaches, split-window methods, texture-based methods, and statistical methods. The most popular techniques, such as the split-window and threshold approaches, use reflectance, brightness temperature, brightness temperature difference, and underlying surface type to identify the cloud type [21,22]. Nevertheless, these approaches may fail in certain circumstances due to the intricacy of cloud systems, such as solar flare zones and high-latitude deserts, where fluctuations in brightness temperature differences can cause misinterpretations [23]. Because they exploit the information in all available bands, traditional mathematical and statistical methods, such as clustering and histogram methods, can be preferable to threshold methods for cloud classification and detection [24,25]. However, applying them to individual clusters with significant overlap can be challenging. Furthermore, texture-based approaches have been developed to determine the structure of various cloud types, but they fail to exploit long-term continuous observational data [26,27]. Additionally, K-means and support vector machine (SVM) approaches have been applied to cloud classification tasks as a result of technological advancements and have obtained remarkable classification results [28,29]. However, their classification process neglects the cloud’s integrity, which impacts the classification outcome.
Many high-performance image-processing techniques have emerged, making the handling of massive amounts of data simpler and more effective; this development has significantly accelerated the advancement of deep learning. In turn, deep learning-based techniques have significantly improved the accuracy and efficiency of cloud categorization. Artificial neural networks (ANNs) and convolutional neural networks (CNNs) have played vital roles in cloud classification techniques. Taravat et al. [30] achieved remarkable classification results by employing sky camera data and ANNs in their investigation of automatic cloud classification methods. Furthermore, Liu et al. [31] employed ANN models to classify FY-2C (Fengyun-2C) satellite images and compared their results with those obtained using principal component analysis (PCA) and support vector machines (SVMs). It should be noted that the above-mentioned ANN approaches can only consider the information within the current pixel. Zhang et al. [31] applied CNN models to recognize and categorize clouds in ground-based image data. CNN models employ convolution to gather feature information in an image’s spatial domain, but they neglect both spectral channel information and pixel-by-pixel cloud categorization. Jiang et al. [32] proposed an improved network based on U-Net and attained excellent cloud classification results on FY-4A (Fengyun-4A) data. Nevertheless, deficiencies remain, including poor classification performance for the dispersed and fragmented cloud morphologies of the Ci, Ac, and other types, and confusion caused by the similar morphologies of Cu, Sc, and other clouds. Hence, there is still room for further improvement in this field.
To solve the aforementioned problems, a densely connected hybrid convolutional network (DCHCN) for cloud classification is proposed in this paper. Considering the characteristics of the FY-4A satellite data, three-dimensional (3D) and two-dimensional (2D) convolutional layers are used to mine the spatial–spectral features of the data, and a hybrid convolution structure combining the advantages of both convolutions is constructed to better extract spatial and spectral features. Additionally, dense connections create shortcuts between the front and back layers, which facilitates the back-propagation of the gradient during training. By concatenating features along the channel dimension, the dense connection technique enables feature reuse and further improves classification accuracy. The long time series of FY-4A observations are modeled, and the FY-4A L1-level data are classified pixel by pixel into eight categories following the ISCCP cloud classification scheme. The cloud classification products obtained from CloudSat are used to evaluate the classification results.
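The hybrid 3D→2D convolution with dense connections described above can be sketched in PyTorch as follows. This is a minimal illustration of the idea only: the layer counts, channel widths, kernel sizes, and the assumption of 14 spectral input channels are illustrative choices, not the paper's exact architecture.

```python
import torch
import torch.nn as nn


class DenseHybridBlock(nn.Module):
    """Sketch of a densely connected hybrid (3D + 2D) convolution block."""

    def __init__(self, bands=14, growth=16):
        super().__init__()
        # 3D convolution mines joint spectral-spatial features first.
        self.conv3d = nn.Conv3d(1, 8, kernel_size=(7, 3, 3), padding=(0, 1, 1))
        d = bands - 7 + 1          # spectral dimension of the 3D output
        in_ch = 8 * d              # channels after flattening spectral dim
        # 2D convolutions with dense connections: each layer sees the
        # concatenation of all preceding feature maps.
        self.conv2d_1 = nn.Conv2d(in_ch, growth, 3, padding=1)
        self.conv2d_2 = nn.Conv2d(in_ch + growth, growth, 3, padding=1)
        self.conv2d_3 = nn.Conv2d(in_ch + 2 * growth, growth, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):          # x: (B, bands, H, W)
        x = x.unsqueeze(1)         # add a channel dim: (B, 1, bands, H, W)
        x = self.relu(self.conv3d(x))
        b, c, d, h, w = x.shape
        x = x.reshape(b, c * d, h, w)   # fold spectral dim into channels
        f1 = self.relu(self.conv2d_1(x))
        f2 = self.relu(self.conv2d_2(torch.cat([x, f1], dim=1)))
        f3 = self.relu(self.conv2d_3(torch.cat([x, f1, f2], dim=1)))
        # Dense output: reuse all intermediate features.
        return torch.cat([x, f1, f2, f3], dim=1)
```

A classification head (fully connected layers over the center pixel's features) would follow this block; with the illustrative sizes above, a (2, 14, 43, 43) input yields a (2, 112, 43, 43) feature map.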
The remaining sections of this article are organized as follows. The FY-4A data, the CloudSat data used for validation, and the data preprocessing are introduced in Section 2. The CNN, dense connections, and the proposed DCHCN method are described in Section 3. The classification results and their analysis are presented in Section 4. The article is concluded in Section 5.
4. Results and Analysis
This section analyzes the effects of two main parameters of the proposed DCHCN on its classification performance: the spatial size of the input block and the spectral channel dimension of the 3D-CNN output feature map. The experimental results of the proposed method and several state-of-the-art methods are then presented and analyzed. Three metrics are used to evaluate model performance: overall accuracy (OA), average accuracy (AA), and the kappa coefficient (Kappa). OA is the ratio of the number of samples correctly predicted by the model to the total number of samples; AA is the average of the model’s per-class prediction accuracies; and Kappa measures the agreement between the model’s predictions and the actual classification while accounting for chance agreement.
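The three metrics can be computed directly from a confusion matrix, as in the following sketch (the function name is illustrative):

```python
import numpy as np


def classification_metrics(cm):
    """Compute OA, AA, and Kappa from a confusion matrix.

    cm[i, j] = number of samples of true class i predicted as class j.
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    # OA: correctly predicted samples over all samples.
    oa = np.trace(cm) / total
    # AA: mean of the per-class accuracies (diagonal over row sums).
    aa = np.mean(np.diag(cm) / cm.sum(axis=1))
    # Kappa: agreement corrected for the chance agreement pe.
    pe = np.sum(cm.sum(axis=0) * cm.sum(axis=1)) / total ** 2
    kappa = (oa - pe) / (1 - pe)
    return oa, aa, kappa
```

For example, a two-class confusion matrix [[50, 10], [5, 35]] gives OA = 0.85, AA ≈ 0.854, and Kappa ≈ 0.694.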
4.1. Experimental Configuration
All tests were conducted under the Windows 10 operating system on a computer with a 12th Gen Intel(R) Core(TM) i5-12400F CPU and an Nvidia GeForce RTX 3060 Ti GPU, using the PyTorch 1.2.0 deep learning framework and Python 3.9. The number and distribution of training and test samples are presented in Table 5.
4.2. Analysis of Parameter Effects on Model Performance
4.2.1. Spatial Size Effect
The spatial size of the input block defines the amount of spatial feature information in the input block, and, in CNNs, the effective feature information can significantly affect the classification performance. Therefore, in this study, the spatial size of the input block was analyzed experimentally by setting it to {31 × 31, 37 × 37, 43 × 43, 49 × 49, 55 × 55}. The experimental results are presented in Figure 11a.
As shown in Figure 11a, the OA value gradually increased as the spatial size increased from 31 × 31 to 43 × 43, but decreased as the spatial size increased further. Increasing the spatial size can, to a certain extent, provide the network with more spatial feature information, but an overly large spatial size can reduce the classification performance of the network, as it introduces pixels of other types and even noise. Therefore, to achieve the best classification results, a size of 43 × 43 was selected for the input block of the network.
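Pixel-wise classification from a surrounding spatial block can be sketched as follows; the function name and the use of reflect padding at image borders are illustrative assumptions, not the authors' exact preprocessing:

```python
import numpy as np


def extract_patch(image, row, col, size):
    """Return the size x size spatial block centered on pixel (row, col).

    image: (H, W, C) array of spectral channels; borders are handled
    with reflect padding so every pixel gets a full-sized block.
    """
    half = size // 2
    padded = np.pad(image, ((half, half), (half, half), (0, 0)),
                    mode="reflect")
    # After padding, original pixel (row, col) sits at (row + half,
    # col + half), so this slice is centered on it.
    return padded[row:row + size, col:col + size, :]
```

Each such block (e.g., 43 × 43 here) is fed to the network to predict the class of its center pixel.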
4.2.2. Spectral Dimension of the 3D-CNN Output Feature Map
Before the data entered the hybrid convolution layers, the input data were first processed by the 3D-CNN to extract spectral–spatial features. The output feature map of the 3D-CNN affects the final feature information obtained by the hybrid convolution and thus the classification results. Therefore, the effect of the spectral channel dimension of the 3D-CNN output feature map was examined experimentally by changing the 3D-CNN convolution kernel size. In this experiment, the spectral channel dimension of the 3D-CNN output feature map was set to {4, 6, 8, 10, 12}. The experimental results are presented in Figure 11b.
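The link between kernel size and output spectral dimension follows the standard convolution output formula: with stride 1 and no padding along the spectral axis, an input of depth D and a kernel of depth k yield an output depth of D − k + 1. Assuming the 14 spectral channels of FY-4A/AGRI, the tested dimensions {4, 6, 8, 10, 12} would correspond to kernel depths 11, 9, 7, 5, and 3 (an inference from the formula, not a configuration stated in the text):

```python
def spectral_out_dim(in_depth, kernel_depth, stride=1, padding=0):
    """Output length along the spectral axis of a 3D convolution."""
    return (in_depth - kernel_depth + 2 * padding) // stride + 1


# Kernel depths mapping 14 input channels to the tested output dimensions.
dims = [spectral_out_dim(14, k) for k in (11, 9, 7, 5, 3)]
print(dims)  # → [4, 6, 8, 10, 12]
```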
4.3. Results, Analysis, and Comparison of Different Cloud Classification Models
A variety of deep learning networks were selected for the comparison experiments: the proposed DCHCN model was compared with the 2D-CNN, 3D-CNN, HybridSN, UNet, and U2Net models. All models were trained and tested on the same training and test datasets. In addition, all models were trained for 120 epochs with the same optimizer type, loss function, and learning rate to obtain the final classification models. The eight cloud types are labeled in the following order: clear sky, Ci, As, Ac, Sc, Cu, Ns, and Dc. The performance of the six models in the cloud classification task was tested separately, and the results are presented in Table 6.
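A shared training loop of this kind might look as follows. The specific choices below (Adam optimizer, cross-entropy loss, learning rate 1e-3) are illustrative assumptions for the sketch; the text states only that identical settings were used across models, not which settings:

```python
import torch
import torch.nn as nn


def train(model, loader, epochs=120, lr=1e-3, device="cpu"):
    """Train a model with settings shared across all compared models."""
    model = model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # assumed
    criterion = nn.CrossEntropyLoss()                         # assumed
    model.train()
    for _ in range(epochs):
        for patches, labels in loader:
            patches, labels = patches.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(patches), labels)
            loss.backward()
            optimizer.step()
    return model
```

Running each compared architecture through the same loop keeps the comparison fair: only the network structure varies between runs.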
As shown in Table 6, where the numbers in bold indicate the best result in each category, the proposed DCHCN model obtained the highest OA, AA, and Kappa values among all models. Although the 2D-CNN, 3D-CNN, HybridSN, UNet, and U2Net models were designed with different network structures to obtain high classification performance, the proposed model benefited from the combination of 2D and 3D convolutions, which resulted in better classification performance. The proposed method combined spatial and spectral feature information and used dense connections to better reuse features, and could thus achieve better results at a lower computational cost.
Specifically, the 2D-CNN had the worst classification results among all models for all classes except clear sky; its classification accuracy was below 60% for some classes. This model also had the smallest OA, AA, and Kappa values because no spectral feature information was available for classification. The 3D-CNN model achieved better classification results than the 2D-CNN model but required more time to obtain them. The HybridSN model used a hybrid convolutional architecture similar to that of the proposed method, which provided the high accuracy of 3D convolution while minimizing the number of model parameters. The UNet and U2Net models used deeper 2D convolutions and could extract feature information better through their U-shaped structures. However, purely 2D convolution, even at increased depth, was still not enough to compete with the proposed method.
Moreover, in most cases, the proposed method outperformed the other methods in classification accuracy for all cloud types. Specifically, the cloud types Ac, Sc, and Dc, which exhibit more structural variation, can easily be confused with other cloud types and are therefore more difficult to classify accurately. Still, the proposed method performed better than the other methods on these classes. This was mainly because the proposed model used dense connections between the layers, whereas the HybridSN model used only hybrid convolution. By building a dense connection module, hidden spectral information as well as spatial information could be explored better, and useful feature information from the other layers could be retained, thus enabling accurate classification of cloud types with inconspicuous features. In addition, compared to the UNet and U2Net models, the proposed method did not use relatively deep convolutional layers and thus improved classification accuracy without introducing additional time overhead; the proposed method had obvious advantages in time performance compared to the other models.
Figure 12 shows the original satellite cloud image, and Figure 13 depicts the corresponding classification results of each classification method. The satellite cloud map data used for this presentation were collected at 5:45 a.m. on 21 May 2018, a time at which all cloud types are represented in the satellite cloud images.
The analysis of the classification results showed that the classification maps obtained using the 2D-CNN and 3D-CNN included more misclassifications than those of the other models. The HybridSN, UNet, and U2Net classification maps showed more fragmented clouds and could not distinguish cloud categories Ac, As, Cu, and Ci well, because these models could not exploit spectral information as accurately as the proposed model. Finally, the classification map obtained using the proposed method preserved the integrity of cloud clusters and relatively smooth boundaries, delivering high performance and accurate classification results.
The cloud map data used for the verification of the proposed classification model correspond to the spring season, when the probability of single-layer clouds over mainland China is high, especially in the southern region. According to previous macroscopic studies on cloud types in China, the probability of high clouds is highest in the tropical regions, which is reflected in the large portion of high clouds, such as Ci and As, at the bottom of the classification map. Within mainland China, however, the probability of high clouds in the north, especially the northeast and north of China, is higher than that in the south, which is reflected in the large number of high clouds, such as Ci and As, at the top of the classification map. In addition, in the tropical region, deep convective activity is more frequent and water vapor is more abundant, resulting in a higher probability of highly unstable deep convective clouds of the Dc type and extensive rain clouds of the Ns type. The probability of middle clouds in the southwest region is higher than 50%, which is demonstrated in the classification map mainly as the Sc and Ac cloud types in this region. However, over the sea to the south, at 20°N, the probability of middle clouds is very low and is thus almost absent from the classification map. Due to the influence of water vapor, large cumulus (Cu) clouds also appear in the classification map. Thus, the classification map obtained using the proposed method is consistent with the findings of macroscopic studies of the Chinese region.
5. Conclusions
This paper proposes a densely connected hybrid convolutional network (DCHCN) for cloud classification using FY-4A satellite data. The proposed DCHCN combines 3D and 2D convolutions to integrate spectral and spatial feature information, which effectively improves cloud classification accuracy. Moreover, it employs dense connections and fully connected layers to further enhance classification performance. Experimental results demonstrate that the proposed method outperforms the comparison models and attains the highest classification accuracy, confirming its efficiency and advantages in cloud classification using FY-4A satellite data.
However, some aspects still require further improvement. Although the proposed method combines convolutional layers to exploit spatial and spectral features more effectively, the parameter analysis experiments showed that network performance varies significantly with the spatial size and spectral dimension settings. Furthermore, the optimal parameters found in these experiments may not be applicable to other datasets, indicating the need for further research on the spatial and spectral properties of clouds to improve network robustness. In terms of the dataset, the training process is limited by CloudSat’s orbit and operating time, which makes it challenging to incorporate more accurate cloud classification data. Additionally, the inherent nature of satellite data may cause CloudSat’s data to deviate from the real situation. As CloudSat does not provide data labels over a continuous time span, this study did not consider the temporal dimension. In addition, the cloud type St has a low cloud top height and a scattered, fragmented distribution, and suffers from a severe lack of samples; this cloud type was therefore not included in this study. In future studies, the dataset will be enhanced with measured data over a continuous time span to improve accuracy and aid analysis, and a classification scheme for the St type will be added to further improve cloud classification accuracy. Further improvements in the classification accuracy of specific cloud types could eventually lead to overall improvements in the cloud type classification of the proposed model.