1. Introduction
This work aims to exploit recent advances in deep learning classification technologies to develop a convective/stratiform (C/S) classification algorithm specific to the WIVERN (Wind Velocity Radar Nephoscope) mission. The mission’s conical scanning Doppler radar provides a 2D curtain of reflectivity and Doppler velocity data, as well as 1D-collocated measurements of brightness temperatures. The goal is to identify pixels that are convective, i.e., those with vertical velocities exceeding 1 m/s, with downdrafts and updrafts combined into a single probabilistic score. Our main contribution is the first deep learning framework trained on end-to-end WIVERN simulations that fuses Doppler velocity, reflectivity, and brightness temperature measurements within a U-Net adapted to the conically scanned W-band geometry and regresses a continuous, physically interpretable convective–stratiform index suitable for direct ingestion by the mission’s Level-2 retrieval chain. The WIVERN mission concept, one of the two candidate missions competing for selection for the Earth Explorer 11 mission within the European Space Agency’s FutureEO programme, promises to revolutionize the study of clouds, with its 800 km swath fast conically scanning 94 GHz Doppler radar at an incidence angle of about 42 degrees [
1,
2,
3]. This configuration allows WIVERN to measure in-cloud winds at the native horizontal resolution of 1 km along track, with approximately 600 m vertical resolution. WIVERN Doppler velocity measurements provide information on the motion of the cloud and precipitation particles along the line of sight (LoS). Because WIVERN observes the atmosphere at a slant incidence angle, the Doppler signal represents a combination of both horizontal and vertical air motions, as well as the hydrometeors’ sedimentation velocity. In regions where vertical motions are negligible and the hydrometeor fall speed can be accurately estimated, it is possible to retrieve the horizontal wind component projected along the horizontal line of sight ($v_{HLoS}$). This information can be used, for example, in data assimilation systems to improve numerical weather prediction [4, 5]. To derive this wind product, it is essential to distinguish between atmospheric regimes where the vertical velocity ($w$) can be considered negligible (defined as $|w| < 1$ m/s), referred to as stratiform, and those where $w$ is significant, known as convective. In stratiform regions, $v_{HLoS}$ can be directly derived under the assumption that vertical motion is minimal. Conversely, in convective regions, if $v_{HLoS}$ can be reconstructed from nearby stratiform regions, it may then be possible to estimate the vertical wind component. These unique in-cloud wind products will then further provide the following:
Full vector wind estimates within clouds over the 800 km swath, by combining the forward and backward radar looks, offering an unparalleled perspective on cloud dynamics [
6,
7].
Insights into convective organisation and anvil morphology, by combining the LoS winds with radar reflectivity to derive convective mass fluxes and assess radiative impacts [
3].
Advanced understanding of the processes governing the formation, organisation, and intensification of mesoscale convective systems, tropical cyclones, and mid-latitude windstorms [
8].
The convective/stratiform (C/S) classification is also of general interest for scientific purposes as stratiform and convective regimes differ in two fundamental ways, as follows:
- 1.
Formation mechanisms: convective precipitation is associated with strong vertical motions and the growth of hydrometeors via coalescence and/or riming, whereas stratiform precipitation occurs in regions with much weaker vertical motion, dominated by vapour deposition and aggregation.
- 2.
The distinct microphysical processes associated with each regime result in differing diabatic heating structures, which, in turn, influence large-scale atmospheric circulation in different ways [
9].
Over the past 30 years, many methodologies have been developed to separate convective and stratiform regimes. Drawing on data from missions such as TRMM, GPM, CloudSat, and EarthCARE (for details on atmospheric radars, see Battaglia et al. [
10]), the scientific community has gained considerable experience in classifying deep hydrometeor layers as either stratiform or convective using only the reflectivity measured by spaceborne radars, typically through echo–object classification schemes. For low-frequency radars (such as the Ku-band radar on board the TRMM and GPM observatories), the fundamental concept is that, in stratiform conditions, there is a smooth transition from the solid to the liquid phase at the freezing level, which is marked by a bright band in the radar reflectivity [
11]. In contrast, convective profiles are characterized by high reflectivities often exceeding 50 dBZ, and extend to altitudes above 10 km without any evidence of a transition region at the freezing level [
12].
Radars operating at the WIVERN frequency (94 GHz) are subject to return signal saturation due to non-Rayleigh effects (a 94 GHz radar rarely detects echoes above +20 dBZ [13]), significant signal attenuation [14], and multiple scattering effects that can distort the signal [15]. Despite these challenges, several studies use the CloudSat CPR reflectivity profile features near the cloud top to identify convective cores [
16,
17]. The underlying rationale is that high radar reflectivities near overshooting cloud tops indicate larger particles lofted to high altitudes, which is only possible in the presence of strong updrafts. A key limitation of the spaceborne C/S classifications proposed so far is that they are mainly based on vertical profiles of reflectivity. These approaches generally lack direct dynamical information such as Doppler velocities, and make limited use of the spatial texture of the reflectivity field.
In addition to spaceborne approaches, substantial efforts have been made to classify convective and stratiform (C/S) regions using ground-based radar observations. Early attempts relied on rule-based heuristics, using radar reflectivity thresholds and pattern recognition techniques. The Steiner–Houze–Yuter (SHY95) algorithm identifies convective cores based on peak reflectivity and local neighbourhood contrast [
18], while later fuzzy-logic methods extend the idea to three-dimensional volumes [
19]. These methods are simple and fast, but struggle with bright-band artifacts and varying radar geometries.
In recent years, supervised models have gradually replaced fine-tuned threshold methods. Ref. [
20] trained a
k-nearest-neighbour classifier on WSR-98D Doppler fields and achieved a 10–15% skill boost over SHY95. Neural networks are now frequently used for C/S classification from geostationary observations. For example, ref. [
21] built a convolutional neural network (CNN) that detects overshooting tops linked to severe convection and is able to discriminate between intense and ordinary convection. Ref. [
22] showed that gradient-boosted trees fed with spectral visible and infrared data outperform traditional texture metrics for convective region detection.
For passive microwave observations, ref. [
23] showed that a suite of machine-learning models trained on GPM Microwave Imager (GMI) brightness temperatures can already separate convective, stratiform, and mixed precipitation with 90–94% global accuracy, while [
24] applied a Bayesian ResNet to GPM–GMI microwave brightness temperatures, achieving >90% accuracy in distinguishing convective and stratiform precipitation while also providing per-pixel uncertainty estimates, underscoring the value of data-driven approaches for precipitation-type retrievals.
The U-Net encoder–decoder backbone has become the de facto standard for pixel-level classification tasks. For example, Hoeller et al. [
25] applied a vanilla U-Net to identify convective cold pools, while Han et al. [
26] reported similar improvements when using a U-Net-based nowcasting model to forecast 30-min radar precipitation. More recently, Zhang and He [
27] proposed an ensemble of lightweight U-Nets for processing FY-4B geostationary satellite imagery, achieving inference latencies below 100 ms per frame while maintaining a probability of detection (POD) greater than 0.70.
Beyond convective/stratiform (C/S) discrimination, the U-Net architecture has been successfully adapted for a variety of geophysical classification tasks, including cloud typing, land-cover mapping, and severe weather prediction. For instance, the 1D-CloudNet, a one-dimensional nested U-Net, combines Himawari-8 radiance data with CloudSat-derived labels to classify nine cloud categories at the nadir [
28].
Hybrid encoder designs have also enhanced the accuracy of land-cover segmentation in multispectral imagery [
29], while multi-feature fusion U-Nets have improved overall classification performance across diverse surface imaging scenes [
30].
Remarkably, U-Net variants have even been applied to generate spatiotemporal tornado risk maps, by leveraging multivariate fields from numerical weather prediction (NWP) models [
31]. A U-Net backbone has also been adopted to retrieve tropical cyclone inner-core wind fields from combined microwave and infrared imagery, achieving aircraft-like skill in reconstructing inner-core winds [
32]. Additionally, Cao et al. [
33] proposed
Nowcastformer, a Transformer-augmented U-Net that exploits multi-resolution radar and satellite inputs to enhance precipitation nowcasting, while Zhang et al. [
34] integrated residual and attention mechanisms into a lightweight U-Net to deliver real-time nowcasts.
2. Simulations of WIVERN Observables
The WIVERN instrument, with its conically scanning wide-swath radar (
Figure 1a), represents a major technological innovation, unifying the following three advanced satellite sensing capabilities into a single system: range-resolved Doppler velocity, reflectivity measurements, and passive microwave observations. These are integrated through a unique radar–radiometer concept, enabling co-located active and passive measurements to maximize scientific synergy.
During the ESA Phase-0 and Phase-A studies, an instrument simulator was developed that simulates all three WIVERN observables from both atmospheric and surface targets, based on successive refinements [
2,
35,
36] of the backbone simulator proposed in [
37]. In brief, the simulator takes output from cloud-resolving models, which provide 3D fields of winds, hydrometeors, temperature, and water vapor, and translates it into 94 GHz stimuli (i.e., scattering properties such as 94 GHz extinction, scattering and backscattering coefficients, single scattering albedo and asymmetry parameters). Then, each scene is illuminated by the WIVERN antenna and scanning pattern for any given orbit. The radar observables are simulated accounting for the sampling rate, the sensitivity, and the specific pulse scheme of the instrument (details in [
2]).
WIVERN’s most innovative measurement will be the LoS Doppler velocity ($v_{LoS}$). Because of the slant angle of observation this quantity will be affected by:
- 1.
The horizontal line of sight (HLoS) wind velocity ($v_{HLoS}$), i.e., the horizontal wind along the horizontally-projected LoS direction;
- 2.
The vertical wind velocity, $w$;
- 3.
The radar reflectivity weighted terminal velocity of the hydrometeors ($v_T$) [38].
While the latter contribution can be generally estimated based on the temperature and strength of the radar backscattered signal, the first two contributions are generally entangled. The WIVERN fundamental equation linking the line of sight (LoS) Doppler velocity ($v_{LoS}$) with the other three variables is given by
$$v_{LoS} = v_{HLoS}\,\sin\theta_i + (w - v_T)\,\cos\theta_i, \qquad (1)$$
where $\theta_i$ is the incidence angle (about 42°), with $w$ taken positive upward and $v_T$ positive for falling hydrometeors. The geometry is illustrated in
Figure 1b.
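As a minimal sketch of the geometry described above, the following Python snippet composes the LoS Doppler velocity from its three contributions and inverts it for the HLoS wind where vertical air motion is negligible. The sign convention (w positive upward, fall speed positive downward), the use of 42° as the default incidence angle, and the function names are assumptions made for illustration; this is not the simulator's implementation.

```python
import math

INCIDENCE_DEG = 42.0  # nominal WIVERN incidence angle quoted in the text


def los_doppler(v_hlos, w, v_t, incidence_deg=INCIDENCE_DEG):
    """Project horizontal wind, vertical wind and fall speed onto the LoS.

    Assumed convention: w > 0 upward, v_t > 0 for falling hydrometeors.
    """
    theta = math.radians(incidence_deg)
    return v_hlos * math.sin(theta) + (w - v_t) * math.cos(theta)


def hlos_from_los(v_los, v_t, incidence_deg=INCIDENCE_DEG):
    """Retrieve the HLoS wind assuming negligible vertical air motion (w = 0)."""
    theta = math.radians(incidence_deg)
    return (v_los + v_t * math.cos(theta)) / math.sin(theta)


# Stratiform example: 20 m/s HLoS wind, no vertical motion, 1 m/s fall speed.
v_los = los_doppler(v_hlos=20.0, w=0.0, v_t=1.0)
v_hlos_retrieved = hlos_from_los(v_los, v_t=1.0)

# Convective example: a neglected 3 m/s updraft biases the HLoS retrieval.
v_los_conv = los_doppler(v_hlos=20.0, w=3.0, v_t=1.0)
bias = hlos_from_los(v_los_conv, v_t=1.0) - 20.0
```

Under these assumptions, an unaccounted vertical velocity w maps into an HLoS error of w / tan(42°), which is why separating stratiform from convective pixels matters for the wind product.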
A Dataset of Tropical Cyclone Simulations
The training data used in this study were generated using the WIVERN end-to-end simulator. Simulations were driven by atmospheric conditions derived from a mesoscale numerical weather prediction system. Specifically, a WRF (Weather Research and Forecasting) model simulation of Hurricane Milton was used to create the dataset. These data include the complete three-dimensional thermodynamic state of the atmosphere (profiles of temperature, pressure, water vapor, and winds) and the three-dimensional structure of the mass contents of the different hydrometeor species (snow, rain, graupel, hail, cloud, and ice). These variables are converted into radar stimuli (backscattering coefficient, extinction coefficient, and asymmetry parameter) via pre-computed 94 GHz scattering look-up tables (details in [
37]). The WRF dataset spans the period from 6 October 2024 at 10:00 UTC to 8 October 2024 at 00:00 UTC, with output intervals of one hour (39 hourly snapshots in total).
During this time, the cyclone evolved from a tropical storm to a Category 5 hurricane. Each hourly snapshot captures a domain of approximately 1250 × 1250 × 20 km³ centred around the cyclone eye, with a horizontal resolution of roughly 1.5 km and a vertical resolution of approximately 500 m (finer below 3 km). For each snapshot, 200 s simulations (equivalent to 40 full antenna rotations) were performed. The time window was centred on the moment when the satellite’s ground track passed closest to the hurricane eye. For each snapshot, overpasses were placed by translating an ascending ground track passing exactly over the eye from −6° to +6° in longitude in steps of 1° (13 tracks in total).
Figure 2 shows a representation of the WIVERN sampling strategy in Hurricane Milton. The final dataset was produced by randomly selecting combinations of these 13 tracks across the 39 hourly snapshots, yielding approximately 80 simulation runs in total. This approach ensured a rich diversity of atmospheric scenarios across a wide range of hurricane intensities.
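The random selection of overpasses can be pictured with the short Python sketch below. It assumes that each run is one distinct (snapshot, longitude offset) pair drawn without replacement; the text does not specify the sampling scheme, so the function name, the seed, and the without-replacement choice are illustrative assumptions only.

```python
import random

N_SNAPSHOTS = 39  # hourly WRF outputs
N_RUNS = 80       # approximate number of simulation runs quoted in the text


def sample_runs(n_runs=N_RUNS, seed=0):
    """Randomly pick distinct (snapshot index, longitude offset in deg) pairs."""
    rng = random.Random(seed)
    offsets_deg = range(-6, 7)  # 13 tracks, from -6 to +6 degrees
    candidates = [(hour, lon) for hour in range(N_SNAPSHOTS) for lon in offsets_deg]
    return rng.sample(candidates, n_runs)


runs = sample_runs()
```

With 13 tracks and 39 snapshots there are 507 possible combinations, so about 80 runs sample roughly one sixth of them.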
From the simulation output, the key variables that compose the model inputs are extracted: vertical-polarisation brightness temperature $T_B^V$ (one-dimensional), horizontal-polarisation brightness temperature $T_B^H$ (one-dimensional), Doppler velocity $v_{LoS}$ (two-dimensional), and reflectivity $Z$ (two-dimensional), with a horizontal spacing of 1 km and a vertical spacing of roughly 70 m. In addition, the simulator outputs the “true” vertical component of the wind field, $w_{LoS}$, projected along the instrument LoS. To obtain the physical vertical velocity, $w_{LoS}$ is divided by $\cos(42^\circ) \approx 0.74$, reflecting the nominal incidence angle of the conically scanning beam (
Figure 1a).
An example of a simulated overpass is illustrated in
Figure 3 and
Figure 4.
Figure 3 is a zoomed version that highlights the WIVERN dense sampling strategy. The total water path (TWP) is indicative of areas with deep hydrometeor layers and the presence of vertically extended liquid water columns due to strong vertical air motions. Superimposed is the WIVERN scan, color-coded by the H-channel brightness temperature $T_B^H$. Magenta contours highlight the actual convective regions, where the maximum vertical wind speed along the column exceeds 3 m/s.
Figure 2.
Representation of an overpass of WIVERN over Hurricane Milton in the Gulf of Mexico on 7 October 2024 at 23:52 UTC. The track of Hurricane Milton is shown, color-coded by storm intensity, from 5 October 2024 at 18:00 UTC to 10 October 2024 at 00:00 UTC. Cloud cover, shown in white, is taken from geostationary infrared data for 8 October 2024 at 00:00 UTC. The square marker indicates the closest position of the satellite ground track with respect to the cyclone eye (diamond marker). The cyan dashed line is the satellite ground track, while the gray dotted line represents the conical scan (the whole swath is highlighted by the cyan shaded region). The curtains of the relevant quantities output by the simulator along the sector highlighted in yellow are plotted in
Figure 4.
Figure 4 shows the three main radar products ($Z$, $v_{LoS}$, and $T_B$) and corresponding relevant quantities (TWC, $v_{HLoS}$, and $w$) along a segment of the WIVERN slanted vertical cross-section. In
Figure 4a,b, the eye and eyewall of the hurricane are clearly identifiable at the centre of the plot. The eye is characterised by a column of clear air with near-zero wind velocity, while the surrounding eyewall is marked by high TWC, which leads to signal extinction in the $Z$ field. Generally, $Z$ correlates well with TWC: above the freezing level (approximately 5 km altitude), large anvil clouds containing high ice water content are visible, while rain bands with abundant low-level hydrometeors are evident at lower altitudes near the eyewall regions. Even within the eye, low clouds occasionally contribute to weak radar returns at lower levels.
Figure 4c,d show that the Doppler velocity signal is dominated by horizontal winds, displaying the characteristic dipole pattern of cyclonic circulation—winds of opposite sign appear on either side of the eye. In
Figure 4f, convective vertical motions are apparent in the eyewall region, while below approximately 5 km, the vertical velocity is enhanced by precipitation fall speeds. Finally,
Figure 4e reveals that highly convective regions are typically associated with lower brightness temperatures, consistent with deep, cold cloud tops.
In total, the dataset spans a distance of nearly $5.5 \times 10^6$ km along the satellite ground track. For storage and mini-batching purposes, the data curtain is divided into 10,912 non-overlapping segments, each covering 500 km along-track. All variables are stored in NetCDF4 format using 32-bit floating-point precision.
Figure 4 illustrates an example of a single chunk extracted from the full dataset.
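The segmentation step can be sketched in a few lines of Python. The example assumes one profile per along-track kilometre (the 1 km spacing quoted earlier) and discards an incomplete tail; how the original pipeline handles the tail is not stated, and the function and variable names are hypothetical. Plain lists are used to keep the sketch self-contained; an array library would be the natural choice in practice.

```python
SEGMENT_KM = 500  # along-track length of each segment, at 1 km sample spacing


def split_curtain(curtain, segment_len=SEGMENT_KM):
    """Split a curtain (one vertical profile per along-track km) into
    non-overlapping segments, discarding any incomplete tail."""
    n_segments = len(curtain) // segment_len
    return [curtain[i * segment_len:(i + 1) * segment_len] for i in range(n_segments)]


# Toy curtain: 1250 along-track profiles with 4 range gates each.
toy_curtain = [[float(i)] * 4 for i in range(1250)]
segments = split_curtain(toy_curtain)

# Consistency check on the figures quoted in the text.
total_km = 10912 * SEGMENT_KM
```

The last line reproduces the quoted totals: 10,912 segments of 500 km amount to 5,456,000 km, i.e., nearly 5.5 million km.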
Although the dataset, based on simulations of Hurricane Milton, provides a meteorologically rich test bed with prominent convective activity, convective regions remain a minority class within the full 5.5-million-kilometre dataset. Across the entire record, only 0.62% of the data correspond to areas with vertical wind speeds exceeding 3 m/s, with an additional 3.67% falling within the range of 1–3 m/s. This reflects the well-documented sparsity of strong updraughts and downdraughts compared with widespread stratiform regions. To counter the resulting class imbalance, we adopt a weighted binary cross-entropy loss, assigning each convective pixel ten times the weight of a stratiform pixel. A complementary probability-threshold calibration at inference further balances detection and false-alarm rates (see
Section 3.5).
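A weighted binary cross-entropy of this kind can be sketched as follows. The snippet is a plain-Python illustration, assuming that the factor of ten multiplies the positive (convective) term of the loss and that soft targets in [0, 1] are weighted in proportion to their value; the exact weighting used in training is not detailed in the text, and the function name is hypothetical.

```python
import math

CONVECTIVE_WEIGHT = 10.0  # weight of convective relative to stratiform pixels


def weighted_bce(probs, targets, pos_weight=CONVECTIVE_WEIGHT, eps=1e-7):
    """Mean binary cross-entropy with the convective term up-weighted."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(pos_weight * t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(probs)


# A missed convective pixel (p = 0.1, target 1) versus an equally confident
# stratiform false alarm (p = 0.9, target 0).
miss_cost = weighted_bce([0.1], [1.0])
false_alarm_cost = weighted_bce([0.9], [0.0])
```

In this formulation a missed convective pixel costs ten times as much as a false alarm of the same confidence, which pushes the network towards higher detection rates on the minority class.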
3. Methodology
3.1. Link Between WIVERN Observables and Convective Identification
All three WIVERN observables (94 GHz reflectivities, line-of-sight Doppler velocities and 94 GHz brightness temperatures) contain valuable information related to the presence of convection. This statement is supported by insights gained from previous missions such as CloudSat and ongoing missions like EarthCARE, both of which employ cloud radars operating at the same frequency.
94 GHz Reflectivity Profile. Numerous studies have demonstrated that high radar reflectivity values near cloud tops observed by the CloudSat CPR (e.g., values exceeding 10 dBZ above 10 km altitude) are effective indicators of convective cores [
16,
17]. Within CloudSat data, three commonly applied criteria are used to identify deep convection:
- 1.
The CPR cloud mask (2B-GEOPROF product) must exceed a value of 20.
- 2.
There must be a continuous radar echo extending from below 2 km to above 10 km in altitude.
- 3.
The echo-top height of the 10 dBZ reflectivity contour must exceed 10 km. A reflectivity of 10 dBZ is typically considered a proxy for the presence of precipitation-sized particles in convective clouds [
39]. The extent to which such large particles are lifted towards the cloud top serves as an indirect measure of updraught intensity [
40].
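The three criteria above can be expressed as a simple profile test, sketched below in Python. The input layout (three equal-length lists per profile), the treatment of gates failing the mask test as echo-free, and the function names are assumptions for illustration and do not reproduce any operational CloudSat code.

```python
def contiguous_layers(heights_km, echo):
    """Return (base, top) heights for each contiguous run of echo gates."""
    layers, start, prev = [], None, None
    for h, e in sorted(zip(heights_km, echo)):
        if e:
            if start is None:
                start = h
            prev = h
        elif start is not None:
            layers.append((start, prev))
            start = None
    if start is not None:
        layers.append((start, prev))
    return layers


def is_deep_convective(heights_km, reflectivity_dbz, cloud_mask):
    """Apply the three deep-convection criteria to a single CPR profile."""
    # Criterion 1: only gates whose cloud-mask value exceeds 20 count as echo.
    echo = [m > 20 for m in cloud_mask]
    # Criterion 2: one continuous echo layer from below 2 km to above 10 km.
    spans = any(base < 2.0 and top > 10.0
                for base, top in contiguous_layers(heights_km, echo))
    # Criterion 3: the 10 dBZ echo top lies above 10 km.
    tops = [h for h, z, e in zip(heights_km, reflectivity_dbz, echo) if e and z >= 10.0]
    return spans and bool(tops) and max(tops) > 10.0


heights = [0.5 * k for k in range(1, 31)]               # 0.5 to 15 km
mask = [30 if h <= 14.0 else 0 for h in heights]        # echo up to 14 km
deep_z = [12.0 if h <= 11.0 else -5.0 for h in heights]
shallow_z = [12.0 if h <= 8.0 else -5.0 for h in heights]
broken_mask = [0 if h == 5.0 else m for h, m in zip(heights, mask)]
```

Here the first synthetic profile satisfies all three criteria, while the second fails the 10 dBZ echo-top test and the third fails the continuity test.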
An example extracted from CloudSat data is shown in
Figure 5. The black dots indicate locations along the CPR track that meet the deep convection criteria. The CPR profiles at two such locations are shown in
Figure 5c (dashed lines). There are distinct differences between these profiles: the blue one shows a clear signature of multiple scattering [15], with no evident transition between the solid and liquid phase at the melting layer, whereas the cyan one has a sharp transition at about 4 km and a large reflectivity gradient between 1 and 4 km, a signature of rain attenuation [41]. Profiles in more stratiform regions (red and magenta continuous lines), on the other hand, show a strong positive vertical gradient of reflectivity below the freezing level.
94 GHz Doppler Velocity. Observations from nadir-looking airborne radars (e.g., Heymsfield et al. [
42]) and the recently launched EarthCARE mission [
43]) have revealed increased variability in vertical Doppler velocities in the presence of convective motions. In such environments, strong updraughts and downdraughts often occur in close proximity, resulting in significant spatial variability in the Doppler velocity measurements. The EarthCARE Doppler radar, which is nadir-pointing, provides direct measurements of the vertical Doppler velocity ($w - v_T$ in Equation (1)). Recent findings by Galfione et al. [
43] confirm that this variability is a reliable indicator of convective activity.
In contrast, WIVERN performs conical scanning at an incidence angle of 42°. Although the vertical component of the wind is attenuated by a factor of 0.74 due to the projection onto the line of sight (as described in Equation (
1)), the LoS Doppler velocities from WIVERN will still be sensitive to the rapid fluctuations in vertical velocity (
w) commonly found in convective regions.
94 GHz Brightness Temperature. Previous studies employing microwave radiometers have demonstrated that the presence of precipitation-sized ice particles leads to a depression in brightness temperatures ($T_B$) at higher frequencies (≥37 GHz), relative to the warmer background [
44,
45,
46]. Among the various types of ice particles, graupel play a key role in causing this depression at 94 GHz. Their presence in the atmospheric column is linked to the riming process, which is typically intensified by strong updraughts [
47,
48].
This characteristic is evident in CloudSat $T_B$ observations:
Figure 5a clearly shows a substantial drop in brightness temperature to below 200 K in the vicinity of the convective core. While 94 GHz $T_B$ measurements from CloudSat have been used to advance understanding of ice microphysics [49], it is surprising that they have not been widely exploited in studies of deep convection. Our simulations further support these findings, with $T_B$ depressions reaching values below 100 K in intense convective cores.
Taken together, these results highlight the significant potential of all three WIVERN observables—94 GHz radar reflectivity, Doppler velocity, and brightness temperature—for convection identification. A distinct advantage in WIVERN’s case is that all three measurements are beam-matched, ensuring spatial and angular consistency in their retrievals.
3.2. Convective/Stratiform Mask
In our simulation framework, the first step involves estimating the WIVERN sampling volume-averaged vertical air motion. This is achieved by applying the antenna pattern weighting function to the modelled vertical velocities within the radar’s sampling volume. The resulting averaged vertical air motion serves as the reference truth for subsequent analysis.
Importantly, our models are not trained to reproduce the exact magnitude of the vertical velocity. Instead, the reference vertical velocity is first transformed into a smoothed convection–stratiform mask, which serves as the training target. The models are then trained to learn this classification structure rather than predict precise velocity values.
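For each pixel value $w$ of the reference vertical velocity matrix, a corresponding target mask value $m$ is obtained by mapping it to $[0, 1]$ with a linear rule as follows:
$$m = \begin{cases} 0, & |w| \le 1\ \text{m/s} \\ \dfrac{|w| - 1}{2}, & 1 < |w| < 3\ \text{m/s} \\ 1, & |w| \ge 3\ \text{m/s} \end{cases}$$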
The choice of these threshold values is due to 1 m/s being the threshold commonly used in the literature to separate stratiform and convective motions [
50], while 3 m/s can be considered a value that corresponds to moderate convection. Thus, the region in between can be considered a region of transition between the two regimes. Note that absolute values of
w are considered, hence ignoring the vertical motion direction (i.e., no distinction between updrafts and downdrafts).
m is set to
in regions that produce reflectivities below the WIVERN sensitivity (−25 dBZ). The resulting mask is a floating-point image with values between 0 and 1, where values close to 1 mark vigorous convection.
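The linear mapping from vertical velocity to the soft mask can be sketched as a clipped ramp between the two thresholds given in the text (1 and 3 m/s). The Python function below is an illustration with a hypothetical name; it deliberately leaves out the special handling of pixels below the radar sensitivity.

```python
W_STRATIFORM = 1.0  # m/s, at or below this magnitude the pixel is fully stratiform
W_CONVECTIVE = 3.0  # m/s, at or above this magnitude the pixel is fully convective


def cs_mask_value(w):
    """Map a vertical velocity (m/s) to a soft convective index in [0, 1].

    The absolute value is used, so updrafts and downdrafts are not distinguished.
    """
    ramp = (abs(w) - W_STRATIFORM) / (W_CONVECTIVE - W_STRATIFORM)
    return min(1.0, max(0.0, ramp))


examples = [cs_mask_value(w) for w in (0.5, -2.0, 2.5, -4.0)]
```

For instance, a 2 m/s downdraft sits halfway through the transition zone and receives an index of 0.5.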
3.3. Pipeline Overview and Network Architectures
Each 500 km segment of data is stored as an individual NetCDF file, which contains the four input channels—vertical and horizontal brightness temperatures, reflectivity, and Doppler velocity—along with the computed target mask.
To ensure consistency in data representation, the brightness temperature variables are tiled into 2D tensors, providing a common spatial shape across all input channels.
The dataset is divided into a training set and a cross-validation set using a 9:1 ratio. Reflectivity values below −25 dBZ are capped at −25 dBZ, and min–max normalization is applied across all variables to standardize the input range for model training (
Figure 6).
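The preprocessing steps listed above (tiling, reflectivity capping, min–max normalization, and the 9:1 split) can be sketched in plain Python as follows. The (range gate × along-track) layout, the normalization bounds used in the example, the shuffling before the split, and all function names are assumptions for illustration; the actual pipeline may differ.

```python
import random

Z_FLOOR_DBZ = -25.0  # WIVERN sensitivity, used as the lower cap on reflectivity


def tile_1d(values, n_gates):
    """Repeat a 1D along-track series over n_gates range gates (rows)."""
    return [list(values) for _ in range(n_gates)]


def cap_reflectivity(field, floor=Z_FLOOR_DBZ):
    """Replace reflectivities below the sensitivity floor by the floor value."""
    return [[max(z, floor) for z in row] for row in field]


def min_max(field, lo, hi):
    """Scale a 2D field to [0, 1] given dataset-wide bounds lo and hi."""
    return [[(x - lo) / (hi - lo) for x in row] for row in field]


def split_train_val(items, val_fraction=0.1, seed=0):
    """Shuffle and split items into training and cross-validation subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


tb_2d = tile_1d([250.0, 180.0, 120.0], n_gates=2)
z_capped = cap_reflectivity([[-40.0, 5.0], [10.0, -30.0]])
z_norm = min_max(z_capped, lo=Z_FLOOR_DBZ, hi=20.0)  # hi = 20 dBZ is a placeholder
train_ids, val_ids = split_train_val(range(10912))
```

Tiling the one-dimensional brightness temperatures lets all four inputs share one 2D shape, so they can be stacked as channels of a single image-like tensor.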
Three encoder–decoder U-Net variants have been implemented. Details of their architectures are provided in
Table 1.
All models employ bilinear up-sampling, skip concatenations, a 1 × 1 output convolution, and optional sigmoid activation for inference. Dropout is injected at the bottleneck to mitigate overfitting.
3.4. Training Setup
Overall, the dataset used for training and cross-validation consisted of 10,912 samples, for a total of approximately 5.5 million km and 45 GB of simulated track data. The learning rate was set to 0.0001. With a batch size of 20 on a single NVIDIA A40 GPU, training takes 3 to 4 h for the Mini configurations and up to 24 h for the large ones.
3.5. Inference and Evaluation Metrics
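During evaluation, the same cleaning and normalization steps are applied. The network is then executed in sigmoid mode; the raw logits $z$ from its final $1 \times 1$ convolution are passed through the sigmoid function as follows:
$$\sigma(z) = \frac{1}{1 + e^{-z}},$$
turning every pixel into a calibrated probability in the interval $(0, 1)$. Running the model in sigmoid mode produces true probabilities that can be directly compared with the continuous target mask or thresholded for ETS, POD, FAR and $F_1$. For measuring performance, non-thresholded metrics were employed (Mean Absolute Error (MAE), Mean Squared Error (MSE), and Binary Cross Entropy Loss (BCE)), defined as:
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} \left|\hat{m}_i - m_i\right|, \qquad \mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N} \left(\hat{m}_i - m_i\right)^2,$$
$$\mathrm{BCE} = -\frac{1}{N}\sum_{i=1}^{N} \left[m_i \log \hat{m}_i + (1 - m_i)\log\left(1 - \hat{m}_i\right)\right],$$
where $N$ is the number of valid pixels in each image, $\hat{m}_i$ is the network output, and $m_i$ the target mask.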
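The sigmoid step and the three non-thresholded metrics can be sketched in plain Python as below. Marking invalid pixels with `None` and clipping probabilities by a small epsilon inside the logarithms are assumptions made for this illustration, as are the function names.

```python
import math


def sigmoid(logit):
    """Numerically stable logistic function."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


def soft_metrics(pred, target, eps=1e-7):
    """Return (MAE, MSE, BCE) over valid pixels; a target of None is skipped."""
    pairs = [(min(max(p, eps), 1.0 - eps), t, p)
             for p, t in zip(pred, target) if t is not None]
    n = len(pairs)
    mae = sum(abs(p - t) for _, t, p in pairs) / n
    mse = sum((p - t) ** 2 for _, t, p in pairs) / n
    bce = -sum(t * math.log(pc) + (1.0 - t) * math.log(1.0 - pc)
               for pc, t, _ in pairs) / n
    return mae, mse, bce


logits = [-2.0, 0.0, 3.0]
targets = [0.0, 0.5, 1.0]
probs = [sigmoid(z) for z in logits]
mae, mse, bce = soft_metrics(probs, targets)
```

Because the target mask is continuous, the BCE here acts as a soft-label cross-entropy: it is minimised when the predicted probability equals the mask value, not only at 0 or 1.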
For further performance assessment, our task is reformulated as a binary classification problem, and the following four thresholded metrics are computed: Probability of Detection (POD), False Alarm Rate (FAR), Equitable Threat Score (ETS), and F1-score.
A lightweight postprocessing routine is applied: for every pixel, the average of the mask values inside a window ( wide, tall) is computed. If that local mean exceeds the cross-validated threshold , the pixel is flagged as convective; otherwise, it is labelled as stratiform, to return a discrete representation in which each pixel has been assigned a value of either 0 or 1 (or ). After thresholding, a confusion matrix with true positives(TP), false positives (FP), false negatives (FN), and true negatives (TN) is obtained. All four metrics above are built from these counts.
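The post-processing and scoring chain can be sketched as follows. The window half-sizes and the threshold are placeholder values, not those used in this study, and FAR is computed here as FP/(TP + FP), which is one common convention; the border handling and function names are likewise assumptions made for illustration.

```python
def window_mean_threshold(mask, half_w=1, half_h=1, threshold=0.5):
    """Binarise a 2D soft mask by thresholding its local window mean.

    half_w, half_h and threshold are placeholders, not the paper's values.
    Windows are truncated at the image borders.
    """
    n_rows, n_cols = len(mask), len(mask[0])
    out = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            vals = [mask[rr][cc]
                    for rr in range(max(0, r - half_h), min(n_rows, r + half_h + 1))
                    for cc in range(max(0, c - half_w), min(n_cols, c + half_w + 1))]
            row.append(1 if sum(vals) / len(vals) > threshold else 0)
        out.append(row)
    return out


def scores(pred, truth):
    """Return (POD, FAR, ETS, F1) from two binary 2D masks.

    Assumes at least one predicted and one true convective pixel.
    """
    pairs = [(p, t) for pr, tr in zip(pred, truth) for p, t in zip(pr, tr)]
    tp = sum(1 for p, t in pairs if p and t)
    fp = sum(1 for p, t in pairs if p and not t)
    fn = sum(1 for p, t in pairs if not p and t)
    pod = tp / (tp + fn)
    far = fp / (tp + fp)
    tp_random = (tp + fn) * (tp + fp) / len(pairs)  # hits expected by chance
    ets = (tp - tp_random) / (tp + fp + fn - tp_random)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return pod, far, ets, f1


truth = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
pred = [[0, 0, 0, 0], [0, 1, 1, 1], [0, 1, 1, 0], [0, 0, 0, 0]]
pod, far, ets, f1 = scores(pred, truth)
speckle = window_mean_threshold([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]])
```

In the toy example, one false positive next to a four-pixel core gives POD = 1 and FAR = 0.2, and the isolated speckle is removed by the window average.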
4. Case Studies
The performance of the model is illustrated through three case studies.
Figure 7 presents a 500 km slice through the simulated Hurricane Milton, capturing a broad stratiform shield with embedded convection between approximately 150 km and 320 km along-track. The brightness temperature field (upper-left panel) exhibits sharp drops only in narrow bands, suggesting the presence of isolated deep convective towers embedded within an otherwise extensive anvil. This structure is corroborated by the reflectivity panel (centre-left), which reveals a broad layer of 0–15 dBZ reflectivities spanning altitudes of 6–15 km. The Doppler velocity panel (bottom-left) displays a characteristically noisy pattern, yet clear upward motion signatures (yellow-red) can still be identified near the convective tower cores. The C/S mask (middle-right) translates these dynamical indicators into a continuous convective index, which peaks at unity in regions where $|w| \ge 3$ m/s.
The U-Net reconstruction (bottom-right) captures both the location and vertical extent of these convective cores with high fidelity. Notably, the major updraughts at 180–200 km and 240–270 km along-track are recovered with near pixel-perfect accuracy. Some minor discrepancies remain, however, as the network exhibits a slight over-dilation of the convective areas.
Figure 8 illustrates the result of transforming the soft convection index into a binary classification field. The impact of this operation is evident in the upper-left panel: fine filaments visible in the raw mask (see
Figure 7) are eliminated, while the principal convective core is consolidated into a solid, contiguous structure. Applying the same thresholding process to the U-Net output (upper-right panel) yields a similarly coherent reconstruction.
The lower panel combines the two post-processed masks into a four-colour confusion map. True positives (red) dominate the convective core, indicating that the network not only identifies the convective region correctly, but also captures its full vertical extent. True negatives (blue) are prevalent across the stratiform canopy, confirming that the model exhibits a low false alarm rate. Most classification errors manifest as a narrow yellow halo of false positives surrounding the edges of the convective towers. False negatives (green) are absent in this particular example and were observed only rarely across the entire test set.
Figure 9 presents case study #2, which features a more fragmented convective structure compared to case study #1. The scene includes a chain of convective bursts embedded within a broad stratiform shield. The brightness temperature trace (upper-left panel) shows repeated dips between 130 km and 320 km, indicating the presence of multiple overshooting tops rather than a single, well-defined eyewall. The reflectivity field confirms this pattern, revealing narrow columns exceeding 15 dBZ embedded within an expansive 5–15 dBZ stratiform layer, which deepens from around 5 km on the left to over 15 km at 400 km along-track.
The Doppler velocity panel displays corresponding streaks of intense upward motion (yellow–red), flanked by weaker downdraughts—typical signatures of pulse-type convection. The C/S mask successfully isolates the convective cores, assigning the surrounding ice clouds to the stratiform category. The U-Net reconstruction accurately retrieves all major convective cores and even captures the wispy overshooting feature near 150 km. However, it also introduces several small “satellite” blobs that remain below the threshold in the ground truth.
After applying the sliding-window post-processing filter (Figure 10, top row), the predicted convective canopy appears smoother, and many of the spurious speckles disappear. The confusion map (bottom row) shows large true-positive regions (red) along the main convective towers, reflecting excellent recall. False positives (yellow) tend to appear around tower flanks and some mid-level anvil regions, while false negatives (green) are concentrated in a few narrow vertical spires, suggesting that the model occasionally underestimates the extent of very slender cores. Nevertheless, the prevalence of true positives and true negatives across the scene confirms that overall precision remains high, despite the scene’s structural complexity.
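To illustrate how a sliding-window filter can remove isolated speckles while filling small gaps inside a core, the sketch below thresholds a continuous index and then applies a majority vote in a small window. It is only one plausible form of such a filter: the threshold value, the 3 × 3 window, the majority rule, and the function names are assumptions made for the example and are not the settings used in this study.

```python
def threshold(index, thr=0.5):
    """Binarize a continuous index (list of rows); thr=0.5 is an assumed value."""
    return [[1 if value >= thr else 0 for value in row] for row in index]


def majority_filter(mask, half_width=1):
    """Sliding-window majority vote over a binary mask.

    Each output pixel is 1 when more than half of the pixels in the
    (2 * half_width + 1)-wide window around it are 1. The window is
    clipped at the scene edges. Window size and rule are assumptions.
    """
    n_rows = len(mask)
    n_cols = len(mask[0]) if n_rows else 0
    out = [[0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            total = 0
            ones = 0
            for ii in range(max(0, i - half_width), min(n_rows, i + half_width + 1)):
                for jj in range(max(0, j - half_width), min(n_cols, j + half_width + 1)):
                    total += 1
                    ones += mask[ii][jj]
            out[i][j] = 1 if 2 * ones > total else 0
    return out


# Toy index: a small core with a one-pixel hole at (2, 2)
# and an isolated speckle at (2, 5).
index = [
    [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.2, 0.9, 0.1, 0.8, 0.1],
    [0.1, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
]

raw_mask = threshold(index)
filtered = majority_filter(raw_mask)
```

In this toy example the speckle is suppressed and the hole in the core is filled. A majority vote also erodes the corners of small structures, which is one reason the window size would need tuning against the expected width of real convective cores.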
Finally, Figure 11 and Figure 12 present an additional case (case study #3), sampling a broad stratiform shield interrupted by a single intense convective tower, in contrast to the chain of smaller cells seen in case study #2. The brightness temperature panel reveals a sharp, V-shaped plunge of nearly 200 K centred around 70 km along-track. The reflectivity field confirms the presence of a narrow convective column extending above 17 km, with significant attenuation beneath it. The Doppler velocity panel supports this scenario, displaying a distinct needle-like vertical structure at the same location.
The U-Net reconstruction accurately predicts both the along-track position and the vertical extent of the core, while correctly identifying the downwind anvil as stratiform. Minor artefacts appear as faint streaks above 14 km, likely reflecting overconfident predictions of weaker convective activity. After post-processing, the filtered output maps are visually almost indistinguishable; however, the confusion image reveals subtle differences. The convective column is classified almost entirely as true positive (red). A thin halo of false positives (yellow) surrounds the top of the tower, indicating that the network is slightly more inclusive than the ground truth in classifying the anvil fringe. False negatives (green) are absent in this case.
Overall, the three case studies demonstrate the strong performance of the U-Net architecture in C/S classification, effectively handling both isolated and embedded convection scenarios.
For completeness, Figure 13 and Figure 14 illustrate an additional case study, evaluated across all the published model sizes: CONSTRAINN-Mini, CONSTRAINN-Medium, and CONSTRAINN-Large. While all models exhibit comparable performance, increasing the model size results in a gradual reduction in reconstruction error.
6. Conclusions
This study introduces CONSTRAINN, a family of U-Net models trained on simulated data replicating the expected measurements from the WIVERN mission, to deliver a continuous, physically interpretable index of convective activity. By converting simulated vertical winds into a continuous convective/stratiform mask and by fusing Doppler velocity, reflectivity and brightness temperature information, the approach offers a reliable methodology to estimate vertical wind speed, as required by the mission’s Level-2 retrieval chain. On the Hurricane Milton benchmark, a mean squared error of 0.38% is achieved, with an equitable threat score (ETS) of 60%, a probability of detection (POD) of 98% and a false alarm ratio (FAR) of 18%. It is worth noting that, because convective pixels exceeding 1 m/s make up only about 3.6% of our data, this FAR mostly reflects the network dilating real convective cores by a few pixels rather than fabricating artificial convective regions, an error that is considered acceptable within our application domain.
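For readers less familiar with these scores, the sketch below computes POD, FAR and ETS from the four confusion-matrix counts using the standard forecast-verification definitions. It assumes that FAR denotes the false alarm ratio FP/(TP + FP); the exact definitions and aggregation used for the benchmark may differ. The function name and the counts are purely illustrative and are not the benchmark results.

```python
def skill_scores(tp, fp, fn, tn):
    """Return POD, FAR and ETS from confusion-matrix counts.

    Standard forecast-verification definitions are assumed:
      POD = TP / (TP + FN)
      FAR = FP / (TP + FP)              (false alarm ratio)
      ETS = (TP - TP_r) / (TP + FP + FN - TP_r)
      TP_r = (TP + FN) * (TP + FP) / N  (hits expected by chance)
    """
    n = tp + fp + fn + tn
    if tp + fn == 0 or tp + fp == 0:
        raise ValueError("scores are undefined without observed and predicted positives")
    pod = tp / (tp + fn)
    far = fp / (tp + fp)
    tp_random = (tp + fn) * (tp + fp) / n
    ets = (tp - tp_random) / (tp + fp + fn - tp_random)
    return {"POD": pod, "FAR": far, "ETS": ets}


# Illustrative counts only (not taken from the benchmark).
scores = skill_scores(tp=90, fp=30, fn=10, tn=870)
```

Because convective pixels are rare, even a thin ring of false positives around each core adds a noticeable number of counts to the FAR denominator's false-positive share, while true negatives do not enter POD or FAR at all.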
The current models are hurricane-focused and do not distinguish between downdrafts and updrafts. Future work might include generalizing the presented models to retrieve vertical wind velocity directly, including its sign. To broaden the range of applicable scenarios, a natural next step is to test the current architecture on the outputs of different storm-resolving models (e.g., ICON, RAMS) for tropical cyclones, and to extend it to observational scenarios such as mid-latitude systems, mesoscale convective systems, frontal systems, or polar lows. Furthermore, the performance of the U-Net-based CONSTRAINN models might be evaluated against other deep learning techniques.
A further step might involve applying CONSTRAINN to real satellite observations, including those from EarthCARE and, in the near future, INCUS. Anticipated challenges in this transition include managing instrument noise, calibration uncertainties and footprint mismatches, and ensuring robust domain adaptation from simulated to actual measurements. Successfully addressing these challenges would significantly advance the creation of a unified convection classifier for next-generation spaceborne atmospheric radars.