2.2.1. Algorithm Overview
The workflow of the algorithm is illustrated in Figure 1. This study draws upon the algorithmic criteria proposed by Zhuge and Zou [25] (2018) and Mecikalski and Bedka [15] (2006) to extract the indicators outlined in Table 4 from satellite channels.
In the following section, the physical significance of the indicators is outlined.
REFGHI denotes the normalized reflectance of the 250 m resolution GHI data within the tile area. This criterion retains regions with higher reflectance in cloud imagery. Typically, cumulus clouds exhibit higher reflectance, while stratus and cirrus clouds show lower reflectance, so the criterion helps identify convective clouds. Satellite-received reflectance varies significantly throughout the day. Zhao Fengmei [28] studied cirrus reflectance using MODIS satellite data and found that cirrus reflectance is influenced not only by cloud thickness but also by the solar zenith angle. To extend the usable timeframe of the GHI, reflectance data undergo solar altitude angle correction.
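The exact form of the solar altitude angle correction is not given here. A minimal sketch, assuming the common cosine normalization (observed reflectance divided by the cosine of the solar zenith angle, which equals the sine of the solar altitude angle), might look as follows. The function name and the 80° cutoff are illustrative assumptions, not values from this study.

```python
import math


def normalize_reflectance(reflectance, solar_zenith_deg, max_zenith_deg=80.0):
    """Divide observed reflectance by cos(solar zenith angle).

    Returns None when the sun is too low for the correction to be
    meaningful. max_zenith_deg is an illustrative cutoff (assumption).
    """
    if solar_zenith_deg >= max_zenith_deg:
        return None
    return reflectance / math.cos(math.radians(solar_zenith_deg))
```

With this form, a reflectance of 0.3 observed at a solar zenith angle of 60° is scaled to about 0.6, which is what allows a single reflectance threshold to be applied over a longer part of the day.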
TBB13 is the blackbody brightness temperature at 10.8 μm. The brightness temperature near 10.8 μm mainly reflects the cloud-top height, and it decreases as clouds grow upward.
TBB09-TBB13 is the brightness temperature difference between the water vapor channel and the infrared window channel. It is generally used to assess cloud-top height relative to the tropopause. Positive values indicate that the maximum cloud height is above or close to the tropopause, and negative values show areas where convection may start. The criterion increases with the volume of the cloud, so large cumulus clouds usually have a higher value.
TBB14-TBB13, referred to as the “split-window” difference, is often used to measure the optical thickness of the cloud and, in general, to distinguish cirrus clouds from deep convective clouds. Areas near zero are cumulus clouds, and areas above zero may be high cirrus clouds.
Delta TBBGHI mainly describes the cooling rate of cloud tops, and it is the most direct measure of convection initiation. It exploits the high temporal resolution of GHI data by using the brightness temperatures observed 3 min and 6 min prior to the AGRI scan. So that the GHI and FY-4B AGRI products refer to the same timescale, these data are converted into an equivalent 15-min brightness temperature change, which further characterizes the rapid evolution of the cloud top as it ascends during convection initiation. In most cases, values lower than zero indicate cloud ascent, and faster development is associated with higher absolute values.
Delta (TBB09-TBB13) and Delta (TBB14-TBB13) are both calculated as the difference between the field observed at the current time and the field simulated by optical flow. These two metrics characterize the temporal variation in the cloud-top height relative to the tropopause and the temporal variation in cloud thickness, respectively. Like Delta TBB13, they serve as criteria for describing cumulus cloud development.
Most convective initiation algorithms treat the cloud-top cooling rate as a key parameter. For instance, the MB06 algorithm assigns a score of 2 to the infrared cooling rate criterion during scoring, while other criteria receive a score of 1. Zhuge and Zou [25] opted to exclude the cloud-top cooling rate as a final criterion in algorithmic scoring. In this study, the cloud-top cooling rate serves as a necessary condition: convective initiation is identified only when the cooling rate threshold is satisfied and four or more of the remaining six criteria are simultaneously fulfilled. Because convective initiation detection inevitably involves some spatial error, this study considers the algorithm’s prediction accurate when the distance between the location where the radar combined reflectivity first reaches the 35 dBZ threshold and the satellite-predicted location is less than 10 km.
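The 10 km matching rule can be illustrated with a short sketch. The paper does not state how the distance is computed, so the great-circle (haversine) formula and a mean Earth radius of 6371 km are assumptions here, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption for this sketch)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_hit(radar_latlon, satellite_latlon, max_km=10.0):
    """True when the satellite-predicted location lies within max_km of the
    point where radar reflectivity first reached the 35 dBZ threshold."""
    return great_circle_km(*radar_latlon, *satellite_latlon) < max_km
```

For example, two points on the same meridian that are 0.04° of latitude apart (about 4.4 km) count as a hit, whereas points 0.2° apart (about 22 km) do not.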
The following section details the data processing methods employed in this study, along with the logic for determining the criterion thresholds and the minimum number of criteria that must be satisfied for the algorithm in the Sichuan region.
2.2.2. Data Processing
The AGRI and GHI data are first resampled to a common resolution by bilinear interpolation for use in later steps. Then, a cumulus mask is employed to identify immature cumulus, which is the main target for CI detection. The cumulus mask is defined using the following criteria: REFGHI > 0.4, TBB13 > 253.15 K, and TBB09-TBB13 > −10 K. This procedure effectively selects developing cumulus clouds while excluding non-convective cloud types and mature convective systems. Reflectance is primarily used to identify cloud presence; however, cloud-type discrimination is achieved through thermal infrared constraints. In particular, brightness temperature and brightness temperature differences help distinguish optically thin cirrus clouds from deeper convective clouds, as cirrus clouds typically exhibit distinct thermal signatures associated with higher cloud tops and lower optical thickness.
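The three mask conditions can be written as a simple per-pixel test. The sketch below uses the thresholds quoted above and assumes that all inputs are already on a common grid, with brightness temperatures in kelvin. The function names and the list-of-rows grid layout are illustrative choices.

```python
def is_immature_cumulus(ref_ghi, tbb13_k, tbb09_k):
    """Per-pixel cumulus mask test using the thresholds quoted in the text."""
    bright_enough = ref_ghi > 0.4                 # reflective cloud present
    not_yet_mature = tbb13_k > 253.15             # cloud top still warm
    deep_enough = (tbb09_k - tbb13_k) > -10.0     # water vapor minus window
    return bright_enough and not_yet_mature and deep_enough


def cumulus_mask(ref_ghi, tbb13_k, tbb09_k):
    """Apply the pixel test to equally shaped 2-D grids (lists of rows)."""
    return [
        [is_immature_cumulus(r, t13, t09)
         for r, t13, t09 in zip(ref_row, t13_row, t09_row)]
        for ref_row, t13_row, t09_row in zip(ref_ghi, tbb13_k, tbb09_k)
    ]
```

A pixel with reflectance 0.5, TBB13 = 270 K, and TBB09 = 265 K passes all three tests, while the same pixel with TBB13 = 240 K is rejected as already mature.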
Mature convective clouds are excluded from brightness temperature calculations because they typically have much lower cloud-top temperatures than at the beginning of the process.
Aerosols and dust may enhance visible reflectance; however, they typically do not satisfy the thermal characteristics associated with convective cloud development, such as low brightness temperature and significant cooling signals. Therefore, the combined use of reflectance and thermal criteria effectively reduces false identification caused by aerosols, dust, and optically thin clouds.
Traditional CI detection methods often rely on spatial overlap between cloud regions at consecutive time steps. However, when cloud displacement is significant, such overlap-based approaches may lead to mismatches and erroneous estimation of cloud-top cooling. The Farneback optical flow method is therefore employed to compensate for horizontal cloud motion between successive satellite images. This procedure reduces the number of apparent cloud-top cooling signals induced by advection rather than vertical development. Similar approaches have been shown to improve CI detection performance in previous studies [29,30,31]. Two TBB13 images taken 15 min apart are first converted into grayscale images, from which the Farneback optical flow field is computed. Based on the obtained optical flow information, the earlier brightness temperature field is advected along the optical flow vectors to the current time. This simulates the distribution of the cloud body at that moment under the assumption that it undergoes only horizontal translation, with no internal development or dissipation occurring.
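The advection step can be sketched independently of how the flow field is obtained. The example below assumes a per-pixel displacement field in pixels per 15 min is already available (for instance from OpenCV's `calcOpticalFlowFarneback`), and it uses a simple nearest-neighbour backward lookup, which is a simplification rather than the interpolation used in the study. The function names and the sign convention (current minus advected, so that negative values mean cooling) are assumptions.

```python
def advect_field(prev_field, flow_u, flow_v):
    """Shift prev_field along a per-pixel flow given in pixels per time step.

    Backward lookup: each output pixel takes the value found one
    displacement upstream. Pixels whose source lies off-grid become None.
    """
    n_rows, n_cols = len(prev_field), len(prev_field[0])
    advected = [[None] * n_cols for _ in range(n_rows)]
    for y in range(n_rows):
        for x in range(n_cols):
            src_y = y - round(flow_v[y][x])
            src_x = x - round(flow_u[y][x])
            if 0 <= src_y < n_rows and 0 <= src_x < n_cols:
                advected[y][x] = prev_field[src_y][src_x]
    return advected


def change_after_motion(current_field, advected_prev):
    """Current minus advected previous field; negative values mean cooling."""
    return [
        [None if a is None else c - a for c, a in zip(cur_row, adv_row)]
        for cur_row, adv_row in zip(current_field, advected_prev)
    ]
```

Because the earlier field is moved before differencing, a cold cloud that simply drifts by one pixel produces a change of zero, whereas a cloud top that has genuinely cooled produces a negative value at its new position.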
2.2.3. Algorithm Criterion Threshold
To determine the threshold values for the convective initiation criteria, 10 CI cases within the radar coverage of Mianyang Airport were manually selected as development samples. Only cases without significant cloud splitting or merging during their evolution were considered in order to ensure clear identification of the cloud objects associated with CI. Thus, the validation dataset is spatially independent from the development dataset. After obtaining the coordinates of the radar echo target area, a 10 km × 10 km region centered on these coordinates was established. The closest cloud object within this region was designated as the target for convective initiation, and its various criteria were statistically analyzed.
Figure 2 displays box plots of the distributions of the selected criteria at 60 min, 45 min, 30 min, 15 min, and 0 min prior to the convective initiation event. The 0 min mark represents the start time of the AGRI scan preceding event occurrence.
This study also adopted the processing method for representative brightness temperatures. Simply put, the average brightness temperature of the coldest 25% of TBB13 pixels within a cloud object was used to represent the brightness temperature of that cloud object.
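A minimal sketch of this representative value, assuming the pixels of one cloud object are supplied as a flat list of brightness temperatures in kelvin, is shown below. Rounding the pixel count up so that at least one pixel is always used is an illustrative choice.

```python
import math


def representative_tbb(pixel_tbb_k, fraction=0.25):
    """Mean brightness temperature of the coldest `fraction` of pixels."""
    if not pixel_tbb_k:
        raise ValueError("cloud object contains no pixels")
    coldest_first = sorted(pixel_tbb_k)
    # Round up so that small objects still contribute at least one pixel.
    n_cold = max(1, math.ceil(fraction * len(coldest_first)))
    coldest = coldest_first[:n_cold]
    return sum(coldest) / len(coldest)
```

For an object with eight pixels, the two coldest pixels are averaged, so the result tracks the most vigorous part of the cloud top rather than its warmer edges.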
The nowcast results were evaluated using the POD (probability of detection), FAR (false alarm ratio), and CSI (critical success index). To ensure that the average nowcast lead time was about 30 min, thresholds were selected from the criterion values 30 min prior to CI occurrence. A full evaluation was performed to guide the selection of the algorithm, and a sensitivity analysis was conducted to evaluate the impact of threshold selection. We tested algorithm sets whose thresholds were set at the 10th, 25th, and 50th percentiles of several classic criteria, namely REFGHI, TBB13, TBB09-TBB13, TBB14-TBB13, delta TBB09-TBB13, and delta TBB14-TBB13. TBB11-TBB13 is similar to TBB09-TBB13, so it was not used in this section. The AGRI tri-spectral difference performs better when there are fewer cloud types but may perform poorly under complex cloud-type conditions. Because the algorithm targets Sichuan, which has complex cloud types, this variable was not included in the present analysis.
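Deriving the candidate threshold sets from the development samples can be sketched as follows. The percentile definition (linear interpolation between ordered samples) is an assumption, since the paper does not specify one, and the function names and input layout are hypothetical.

```python
def percentile(values, pct):
    """Percentile with linear interpolation between ordered samples."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no samples")
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    weight = rank - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


def candidate_thresholds(samples_by_criterion, percentiles=(10, 25, 50)):
    """Build one threshold set per percentile from the development samples.

    samples_by_criterion maps a criterion name to the values observed
    30 min before CI in the development cases.
    """
    return {
        p: {name: percentile(vals, p)
            for name, vals in samples_by_criterion.items()}
        for p in percentiles
    }
```

Each resulting set (10th, 25th, 50th) can then be run through the detection and scored, which is the comparison summarized by the sensitivity analysis.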
To evaluate the performance of the algorithm sets, each set’s POD, FAR, CSI, and F1 scores were calculated. The F1 score is the harmonic mean of precision and recall, which can be regarded as the accuracy and coverage of the algorithm, respectively. The closer it is to 1, the better the algorithm detects targets while avoiding false positives. PR curves were employed to evaluate the performance of each algorithm set. The results are shown in Figure 3. The PR curves corresponding to different percentile thresholds exhibit a high degree of overlap, suggesting that the algorithm shows relatively low sensitivity to threshold variations. The 25th percentile provides the best balance between detection and false alarms and yields the highest F1 score of 0.765.
At the 50th percentile, precision tends to decrease as recall increases because of the high risk of false alarms. The 10th percentile, however, is more restrictive and may miss weaker or early-stage convection.
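For reference, the four scores follow from the standard contingency counts of hits, misses, and false alarms. The sketch below uses the usual definitions, in which recall equals the POD and precision equals 1 − FAR. The function name and the example counts are illustrative and are not results from this study.

```python
def verification_scores(hits, misses, false_alarms):
    """POD, FAR, CSI and F1 from contingency-table counts."""
    if hits + misses == 0 or hits + false_alarms == 0:
        raise ValueError("scores are undefined without events and detections")
    pod = hits / (hits + misses)                  # recall
    far = false_alarms / (hits + false_alarms)
    csi = hits / (hits + misses + false_alarms)
    precision = 1.0 - far
    denom = precision + pod
    f1 = 2 * precision * pod / denom if denom > 0 else 0.0
    return {"POD": pod, "FAR": far, "CSI": csi, "F1": f1}
```

With 8 hits, 2 misses, and 2 false alarms, this gives POD = 0.8, FAR = 0.2, CSI ≈ 0.67, and F1 = 0.8.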
The coefficient of variation across threshold selections was employed to analyze the sensitivity of the POD, FAR, and CSI in Figure 4. For a score of 4, the CSI showed low sensitivity, the POD showed moderate sensitivity, and the FAR displayed high sensitivity. The FAR was highly sensitive because the number of false positives (FP) is sensitive to threshold selection. The CSI remained low in sensitivity, meaning that the algorithm was robust within the studied climatic regime.
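The coefficient of variation of a score across threshold sets is its standard deviation divided by its mean. A minimal sketch is given below. The use of the population standard deviation is an assumption, as the paper does not state which estimator was used.

```python
import statistics


def coefficient_of_variation(scores):
    """Standard deviation divided by the mean of a score across threshold sets."""
    mean = statistics.fmean(scores)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for zero mean")
    # Population standard deviation (assumption for this sketch).
    return statistics.pstdev(scores) / mean
```

A score that is identical for every threshold set has a coefficient of variation of zero, and larger values indicate stronger dependence on the threshold choice.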
Based on the results outlined above, the algorithm is formulated as follows: convective initiation is identified in a region when Delta TBBGHI < −2 K/15 min and four or more of the remaining six criteria are satisfied. Typically, cumulonimbus clouds will be cloudy. Sichuan typically shows overcast conditions with warm-sector convection in the summer. For that reason, when radar echoes exceed 35 dBZ, TBB13 might still be high and the cooling weak. The criteria are therefore relaxed to better detect these convection cases.
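The decision rule can be summarized in a few lines. In this sketch the six non-cooling criteria are assumed to have been evaluated already and are passed as booleans. The function name and argument layout are illustrative.

```python
def detect_ci(cooling_k_per_15min, criteria_met,
              cooling_threshold=-2.0, min_criteria=4):
    """Cooling is a necessary condition; the other criteria are counted."""
    criteria_met = list(criteria_met)
    if len(criteria_met) != 6:
        raise ValueError("expected the six non-cooling criteria")
    cooling_ok = cooling_k_per_15min < cooling_threshold
    return cooling_ok and sum(criteria_met) >= min_criteria
```

A cloud object cooling by 3 K per 15 min with four criteria satisfied is flagged, whereas an object that satisfies all six criteria but cools by only 1 K per 15 min is not.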
The proposed CI detection algorithm is based on physically interpretable satellite observations such as visible reflectance, brightness temperature (TBB), brightness temperature differences (BTDs), and cloud-top cooling, all of which are closely related to the microphysical and dynamical processes of convective clouds. The cumulus cloud mask is important for reducing the effects of non-convective clouds such as thin clouds. By combining reflectance thresholds and texture properties, the mask retains actively developing cumulus clouds and can reduce contamination from optically thin cloud layers. Environmental factors such as water vapor inhomogeneity and condensed moisture are not directly described, but they are implicitly reflected in the satellite observations. For instance, the brightness temperature and its temporal variation are sensitive to cloud-top height and cooling, which are strongly affected by atmospheric moisture and latent heat release, and the brightness temperature difference between infrared channels relates indirectly to the cloud optical thickness and microphysical properties. We also compare the results with those of the algorithm without GHI data and the original SATCAST algorithm.