**4. Discussion**

The literature review presented in Section 2.1 discussed recent advances in deep learning nowcasting. Still, research on the short-term prediction of radar products' values remains limited, as most related work has focused on the precipitation nowcasting problem.

To answer our third research question (RQ3), the proposed *NeXtNow* model was further compared to *XNow* [19], a convolutional architecture previously proposed in the nowcasting literature for the short-term prediction of radar data, with a goal similar to ours. *XNow* is an Xception-based deep learning model trained on radar data collected at time *t* − 1 for a specific geographic area to predict one time step into the future (i.e., the radar data at time *t*).

For a fair comparison between the *NeXtNow* and *XNow* models, *XNow* was evaluated using the same methodology employed for *NeXtNow*: the experiments were repeated three times using three different training–validation–testing splits, and the values of each performance metric described in Section 2.3.4 were averaged over the three runs.
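The per-run aggregation described above can be sketched as follows; the metric names and per-run values are purely illustrative placeholders, not results from the paper.

```python
from statistics import mean, stdev

# Hypothetical per-run metric values from three train/validation/test
# splits; the aggregation mirrors the "averaged over the three runs" step.
runs = {"RMSE": [2.41, 2.45, 2.47], "MAE": [1.10, 1.08, 1.12]}

# For each metric, report the mean and sample standard deviation
# across the three experimental runs, rounded to three decimals.
summary = {m: (round(mean(v), 3), round(stdev(v), 3)) for m, v in runs.items()}
print(summary)
```

Reporting the standard deviation alongside the mean is what allows the stability comparison between the two models discussed below.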

Tables 6 and 7 present the regression and classification results for the *NeXtNow* and *XNow* models, both trained using one previous time step (*k* = 1) to predict one time step into the future on the NMA and MET datasets. The classification metrics in Table 7 were evaluated at various values of the threshold *τ*. The means and standard deviations computed across the three runs are also shown in the tables, with the best values highlighted.

The comparative results in Tables 6 and 7 show that, for both the NMA and MET datasets, *NeXtNow* outperformed *XNow* on most evaluation metrics and at most of the considered thresholds. In all of these cases, the standard deviation was also lower for *NeXtNow*, indicating that it was the more stable of the two models. For the NMA dataset, *XNow* outperformed *NeXtNow* only in terms of *FAR*, at all thresholds; this suggests that *NeXtNow* forecasted more events than *XNow* but also erroneously flagged a slightly higher number of normal weather conditions as events. For the MET dataset, on the other hand, there were only four cases in which *NeXtNow* was slightly outperformed by *XNow*.

The improvement of *NeXtNow* over *XNow* was statistically significant at a significance level of *α* = 0.01, as shown by a one-tailed paired Wilcoxon signed-rank test [43,44] applied to the results in Tables 6 and 7: the obtained *p*-value was less than 0.00001.
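A one-tailed paired Wilcoxon signed-rank test of this kind can be sketched with `scipy.stats.wilcoxon`; the paired metric values below are illustrative stand-ins, not the figures from Tables 6 and 7.

```python
from scipy.stats import wilcoxon

# Hypothetical paired metric values for the two models across ten
# evaluation settings (same metric, same threshold, same dataset pairing).
nextnow = [0.71, 0.68, 0.74, 0.69, 0.73, 0.70, 0.72, 0.75, 0.67, 0.76]
xnow    = [0.65, 0.61, 0.70, 0.63, 0.66, 0.64, 0.67, 0.69, 0.62, 0.71]

# One-tailed paired Wilcoxon signed-rank test; the alternative hypothesis
# is that the first sample's values tend to be greater than the second's.
stat, p = wilcoxon(nextnow, xnow, alternative="greater")
print(p < 0.01)
```

Because the test is paired and one-tailed, a small *p*-value here supports the directional claim that one model improves on the other, not merely that they differ.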

The performance of *NeXtNow* could not be precisely compared to that of other approaches in the literature that predict the values of radar products, as the datasets used for their evaluation differed from ours (in terms of the radar products employed, i.e., reflectivity, velocity and composite reflectivity in our case) and their learning tasks were not formulated exactly as in this paper. Looking only at the magnitudes of the performance metrics reported in the literature and disregarding the underlying datasets, we noted the following: *RMSE* values ranging from 0.97 to 4.7 [22], *CSI* values ranging from 0.36 [21] to 0.81 [31], a *POD* value of 0.61 [21] and a *FAR* value of 0.52. The experimental results presented in Section 3.3 revealed that *NeXtNow* compared favorably to these: a maximum *RMSE* of 2.442 for the regression task and, for the classification task when predicting one step ahead (with *τ* = 5), *CSI* values of 0.683 and 0.735 (for the two case studies), *POD* values ranging from 0.673 to 0.809 and a maximum *FAR* of 0.134.
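The categorical scores compared above (*CSI*, *POD*, *FAR*) follow standard contingency-table definitions; the sketch below computes them from predicted and observed radar values binarised at a threshold *τ*, using illustrative data rather than the paper's fields.

```python
def categorical_scores(pred, obs, tau):
    """Compute CSI, POD and FAR from paired predicted/observed values,
    treating a value >= tau as an 'event'."""
    tp = sum(p >= tau and o >= tau for p, o in zip(pred, obs))  # hits
    fp = sum(p >= tau and o < tau for p, o in zip(pred, obs))   # false alarms
    fn = sum(p < tau and o >= tau for p, o in zip(pred, obs))   # misses
    csi = tp / (tp + fp + fn)  # critical success index
    pod = tp / (tp + fn)       # probability of detection
    far = fp / (tp + fp)       # false alarm ratio
    return csi, pod, far

# Illustrative predicted and observed values at six grid points.
pred = [6.1, 4.2, 7.5, 5.3, 2.0, 6.8]
obs  = [5.9, 5.1, 7.0, 4.8, 1.5, 6.2]
csi, pod, far = categorical_scores(pred, obs, tau=5)
print(csi, pod, far)
```

Note that *FAR* is the only score where lower is better, which is why *XNow*'s lower *FAR* on the NMA dataset counts in its favor even as *NeXtNow* leads on *CSI* and *POD*.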

**Table 6.** The results for the regression metrics for the *NeXtNow* and *XNow* models, trained using one previous time step (*k* = 1) to predict one time step into the future on the NMA and MET datasets. The means and standard deviations computed across the three experimental runs are shown. The best value for each performance metric is marked in bold and highlighted in yellow (NMA case study) or blue (MET case study).


**Table 7.** The results for the classification metrics for the *NeXtNow* and *XNow* models, trained using one previous time step (*k* = 1) to predict one time step into the future on the NMA and MET datasets. The means and standard deviations computed across the three experimental runs are shown. The best value for each performance metric is marked in bold and highlighted in yellow (NMA case study) or blue (MET case study).

