**1. Introduction**

The thickness and spatial extent of sea ice are key geophysical parameters, whose retrieval by remote sensors (i.e., L-band passive microwave radiometers) has been carried out a number of times since the late 1970s [1,2]. Thin sea ice is important for climate change due to its dominance over the ocean-atmosphere heat exchange in the Arctic region [3]. The Sea Ice Thickness (SIT) can be indirectly retrieved by measuring the freeboard using laser and radar altimeters [4–7]. The thin SIT can also be estimated using ice surface temperatures from thermal infrared imagery with some limitations [8]. Low frequency (e.g., 0.6 GHz) passive radiometry was first proposed for SIT retrievals by [9], and more recently with L-band passive radiometry [10–13].

SIT retrievals based on freeboard measurements show their merits for developed ice (>1 m) while having limitations for thin ice and reduced spatial and temporal resolutions [6]. The passive microwave radiometric approach at low frequency is better suited for thin ice (<0.5 m) and can produce daily maps of the Arctic regions [11–13]. Nonetheless, passive microwave radiometry observations are limited to the cold season since melting ice produces artefacts in the retrievals [14]. Therefore, these two types of retrieval methodologies are complementary [10].

There are currently two operating L-band passive microwave radiometer satellite missions: the Soil Moisture and Ocean Salinity (SMOS) and the Soil Moisture Active Passive (SMAP) missions. The SMOS satellite was launched in November 2009, carrying a synthetic aperture passive microwave radiometer operating in the L-band at 1.4 GHz (*λ* = 21 cm) [15]. The SMAP mission was launched in 2015, carrying a radar and an L-band radiometer sharing a 6 m antenna reflector [16]. SMOS provides multi-angular observations, with a spatial resolution of ∼30 km × 30 km at nadir and a swath width of approximately 1200 km [15]. SMAP provides observations at fixed incidence angle of 40◦, with a spatial resolution of 36 km × 47 km and a swath width of 1000 km [16]. These missions were originally designed for inferring soil moisture content and sea surface salinity from the surface emissivity [17–19]. However, [10] described the sensitivity of passive L-band brightness temperature measurements to sea ice properties, mainly to SIT, but also to sea ice temperature (*Tice*) and salinity (*Sice*).

There are two types of Passive Microwave Radiometry (PMR) retrieval methodologies for SIT: on the one hand, the methods based on a combination of thermodynamic and radiative transfer models (e.g., University of Hamburg SIT), which account for variations of ice salinity and temperature [12,20]; and on the other hand, empirical algorithms, such as the one developed by the University of Bremen, calibrated using training data from ice growth models [13,21].

The available ground truth limited the assessment and validation of current PMR thickness products [14,22]. Currently used validation datasets are acquired during the late winter and early spring periods [14,23]. Furthermore, they are restricted to specific acquisition dates in different locations, thus limiting the possibility of performing a comprehensive quality assessment of the product. This is due to the very restrictive operational constraints of helicopter-based campaigns. As such, the surveyed ice mostly corresponds to well-developed late-winter (thick) sea ice, not suitable for L-band SIT retrievals. Furthermore, late-winter SIT presents a more disperse distribution as compared with autumn-forming ice. Therefore, at scale of the satellite resolution, the late-winter in-situ SIT measurements are little representative of the satellite-derived SIT scales. The lack of suitable in-situ ground truth data for validation purposes stressed the need for more field campaigns [14]. Authors from [22] and [14] performed comparisons between satellite and airborne Electromagnetic Induction (EM) systems. However, the already mentioned sampling problems (i.e., late-winter acquisitions and lack of a time-continuous record) are still present in all the validation datasets.

The moored Upward Looking Sonar (ULS) from the Beaufort Gyre Exploration Project (BGEP) is considered to be the reference method for SIT estimation [7]. Its year-round sampling of SIT in a fixed location allows for a continuous record of the SIT during the freeze-up period provided that there is enough sea ice drift [24]. The available ULS data, which can be collocated with 7 years of SMOS-coincident measurements, have better temporal coverage than SMOS although with very limited spatial extent (that of the place where the buoy is moored). However, thanks to the drifting sea ice above the mooring, the ULS is actually sampling a wide variety of sea ice conditions, much more representative of sea ice geophysics than the other auxiliary datasets. This enables a spatio-temporal re-sampling of the validation dataset, which makes the ULS data resolution equivalent to that of the satellites. As such, the ULS turns out to be an excellent tool for the assessment of the L-band satellite-derived thin SIT products [12,13].

To the best of our knowledge, this is the first study aiming at comparing ULS with PMR SIT. Due to the limited amount of moorings, the ULS data are insufficient to carry out a thorough analysis of current PMR SIT products. However, available data can be used to verify the agreement between in-situ data (ULS) and modeled data (Cumulative Freezing Degree Days (CFDD)) during the sea ice growth season. Then, the calculated CFDD SIT is used to carrying out a more in-depth analysis of L-band spaceborne SIT products.

This analysis sheds light on the performance and limitations of current PMR ice thickness retrieval algorithms.

The structure of the paper is as follows: in Section 2 the data used in this study are described. Then, in Section 3 the retrieval algorithms for generating SIT from L-band satellite data are presented and discussed. In Section 4 the methodology for the characterization of the different SIT datasets is introduced, while the analysis results are discussed in Section 5. Finally, the concluding remarks can be found in Section 6.

#### **2. Data**

#### *2.1. SMOS Data*

In this study, the data from the official SMOS Level 1B product version 504 acquired north of 60◦ between 2010 and 2017, are analyzed. The L1B dataset contains the Fourier components of the Brightness Temperatures (TB) at the antenna reference frame. By applying an inverse Fourier transform, we obtain TB snapshots (i.e., an interferometric TB image) [25]. TBs are referenced using an Equal-Area Scalable Earth (EASE) Northern Hemisphere grid of 25 km resolution. The SMOS TB radiometric uncertainty is ∼2 K at boresight, although it degrades on the extended alias-free field of view [26].

The TB measurements are corrected for standard contributions such as atmospheric attenuation and geomagnetic and ionospheric rotation [17]. The galactic reflection correction, not significant at high latitudes, was not applied. A 3-*σ* filtering is applied using the radiometric uncertainty as *σ* at all the points in the antenna plane. TB measurements are also filtered in regions of the field of view that are known to have low accuracy due to Sun reflections, Sun tails and aliasing effects [27].

The TB measurements from ascending and descending orbits are averaged over periods of 3 days to reduce the noise level. The acquisitions are also averaged by incidence angle bins of 2◦. The SMOS geometry and the presence of interferences can cause missing observations at some incidence angles. A cubic polynomial fit to interpolate TB measurements is used to obtain TB samples over the full range of incidence angles at each grid point [25].

#### *2.2. SMAP Data*

The SMAP platform is equipped with an active (Synthetic Aperture Radar or SAR) and a passive (radiometer) microwave system, operating at L-band. The SAR system aimed at improving the quality and resolution of the radiometric signal. However, the radar high-power amplifier had a problem on 7 July 2015 causing a halt in data transmission [28]. Currently, the SMAP products are based on the low-resolution radiometer data [28]. The satellite Equator crossing times are at 6:00 p.m. and 6 a.m., for the ascending and descending nodes respectively, with a revisit of 2–3 days [16]. The global daily SMAP Level 3 V5 gridded TB product on EASE2 grid (36 km resolution) is used in this study. The SMAP *L*3\_*SM*\_*P* product is downloaded from the National Snow and Ice Data Center (NSIDC) [19]. The TBs within the SMAP L3 product are given at the surface level. Therefore, they are corrected, using near surface information, for atmospheric and sky radiation contributions [29].

#### *2.3. Moored Upward Looking Sonar*

SIT measurements from in-situ moored ULS are available since the early 1990s [24]. The measuring unit from the BGEP consists of several instruments: an Acoustic Doppler Current Profiler (ADCP), an ice-profiling sonar (IPS) with a pendulum and pressure sensors. The ADCP is an echo sounder

that measures motion by using the Doppler shift of a target [30]. The IPS measures the distance to the ice-water interfaces. The sonar is made up of an amount of transducers that shape a fine acoustic beam of 2◦ at −3 dB [24]. This sharp beam permits sampling a target at 50 m distance with a ∼2 m resolution. The pendulum and pressure sensors allow us to estimate the tilt of the beam and the depth of the moored ULS, respectively. Inferring the depth from the pressure implies the knowledge of the atmospheric pressure which must be subtracted (pressure from a nearby weather station often suffices) [24]. The sampling rate is 0.5 Hz, generating files with more than 40,000 samples per day. The uncertainty of the SIT measurement is of 5–10 cm, and depends on a number of factors [24].

Three in-situ moorings located in the Beaufort Sea (see Figure 1) are used in this study. Data collected between 2010 and 2017 during the freeze-up period of the year; i.e., October–January, are analyzed. The total surveyed track length, assuming an ice drift of ∼8 km/day on average, is ∼20,000 km. This constitutes an unprecedented source of information for validation of current PMR thickness retrievals. Thickness is measured in a Eulerian context (i.e., fixed moorings), thus enabling the analysis of ice thickness development during the freeze-up period. This type of retrievals allow a year-round cost-effective acquisition of ice thickness as opposed to helicopter-based campaigns [14].

#### *2.4. Ancillary Data*

Sea Ice Concentration (SIC) maps from the database of the Ocean and Sea Ice Satellite Application Facility (OSI SAF) of the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) are used in this study. SIC is estimated from TB observations from the Special Sensor Microwave Imager/Sounder (SSM/I and SSMIS) at 19 and 37 GHz which are corrected for atmospheric effects using the European Centre for Medium-Range Weather Forecasts (ECMWF) model output [31].

Sea Ice drift data from the Polar Pathfinder Sea Ice Motion Vectors [32] available at the NASA NSIDC are also used in this analysis, in particular, Version 4, which includes an improved filtering of the SSMI inputs and updates, i.e., input buoy data and input motion vectors. Ice motion estimates are derived from a number of satellite sensors such as the Advanced Very High Resolution Radiometer (AVHRR), the Advanced Microwave Scanning Radiometer—Earth Observing System (AMSR-E), the Scanning Multi-channel Microwave Radiometer (SMMR), SSMI and SSMI/S sensors, the International Arctic Buoy Programme (IABP) and NCEP/NCAR Reanalysis forecasts [32].

Finally, NCEP/NCAR Reanalysis 1 surface air temperatures [33] at 2.5 degrees resolution in latitude and longitude, with a mean absolute error of ∼0.25 and 1.25 degrees Celsius in summer and winter, respectively [34], are used in this analysis. NCEP data together with the empirical Cumulative Freezing Degree Days (CFDD) model are used to calculate SIT [35,36].

$$SIT[m] = 1.33 \star (CFDD[^{\circ}C])^{0.58},\tag{1}$$

The following procedure is applied in order to obtain SITs from surface temperatures. Sea ice freezing temperature is subtracted from the available NCEP/NCAR surface air temperatures. We apply the modulus function to the resulting temperatures. The resulting value is added from the beginning of the freezing period (i.e., end of September-beginning of October) to the date under consideration. This cumulative quantity is used as an input in Equation (1) to obtain the desired SITs.

#### **3. L-band Passive Radiometer Ice Thickness**

Two algorithms for SIT retrievals have been proposed in the literature: one empirical and another one based on the thermodynamic/radiative-transfer model. The empirical methodology was firstly proposed by [11]. It consists of an exponential function that relates the measured intensity or First Stokes parameter (i.e., the average of the horizontally and vertically polarized TBs) with SIT. The intensity is used because of its better quality and signal preservation (e.g., less affected by Faraday rotation) as compared to the co-polarized TBs.

The L-band SIT empirical retrieval is based on Equation (2):

$$\begin{aligned} T\_{obs} &= T\_1 - (T\_1 - T\_0) \exp(-\gamma d) \\ d \pm \delta d &= -\frac{1}{\gamma} \ln(\frac{T\_1 - T\_{obs} \pm \delta T\_{obs}}{T\_1 - T\_0}) \end{aligned} \tag{2}$$

where *d* is the inverted ice thickness, *γ* the attenuation factor, *T*<sup>1</sup> and *T*<sup>0</sup> the saturated thick ice and the open water signals respectively, and *Tobs* the observed brightness temperature.

This algorithm has been further developed by [13], including a new fitted function describing the Polarization Difference (PD) dependence on SIT. Also the incidence angle used is different: while [12] uses the incidence angle range of 0–40◦, [13] uses the observations retrieved at 40◦. Additionally, [13] relies on ice growth models for generating the training dataset. More recently, a method that combines the SMOS and SMAP TBs to produce a state-of-the-art SIT product has been developed [21]. The empirically-based product used in this study is produced and supported by the University of Bremen. They currently distribute a version ingesting the SMOS TBs observations. The retrieval is limited to 50 cm ice thickness [13,21]. Allegedly, the SIT error increases with ice thickness and it could amount up to 50% of the retrieval values [21].

The thermodynamic/radiative-transfer model methodology was first developed by the University of Hamburg [12]. The ice thickness is estimated using an iterative approach due to the dependence of SIT on temperature and ice salinity [12]. The ice thickness estimated by [11], which is calculated with average values of *Tice* and *Sice* for the Arctic, is used as a first guess in the initial iteration. The temperature of the ice pack is assumed to be the average of the sea water freezing temperature and the surface air temperature from atmospheric reanalysis data [12]. Sea ice salinity, also required by the radiation model, is estimated with the empirical function defined by [37] using the sea surface salinity based on climatology as input. The thickness retrieval convergence is based on the comparison between the observed intensity at nadir (from 0 to 40◦ incidence angle) and the modeled intensity from an adaptation of the radiation model by [9]. The obtained thickness field is further refined by applying a lognormal distribution [38] that takes into account the effect of subpixel-scale heterogeneity at SMOS resolution [12]. The limits of this type of retrieval are variable depending on the temperature and salinity of the ice pack. It can reach up to 1.5 m under very special conditions, i.e., low temperatures and low salinities, only attainable at closed seas or water masses with large amounts of fresh water supplies (e.g., Baltic Sea) [12]. The SIT uncertainty increases with thickness and it can be larger than the estimated thickness [12].

The presence of snow over sea ice modulates the L-band signature. The wet snow acts as an absorbent at this frequency, thus suppressing the sea ice thickness sensitivity when wet snow is present. This is the reason why the SMOS SIT maps are not produced from mid April to mid October. On the other hand, the dry snow produces a bias in the TB, especially in the H-polarization TB as already stated in [39].

#### **4. Methodology**

The ULS data shows its suitability as SIT reference dataset or ground truth when checking its consistency for the continuous sampling of the ice pack during the freeze-up period. The ground truth data acquisition relies on the drifting nature of the sea ice [24]. The ULS thickness reference is re-sampled into the EASE grid for comparison against L-band radiometer thickness products and CFDD SIT based on NCEP/NCAR surface reanalysis data. To produce a suitable ground truth for comparison against satellite measurements, the ULS data are time averaged over consecutive 24-h spans, i.e., the same time resolution of the daily L-band thickness products.

The total amount of measured samples (which equals the number of measured days during the freeze-up period) was ∼2500 which multiplied by the average daily sea ice drift gives a figure of approximately 20,000 km of sampled ice thickness. We imposed a filtering criterion over the time averaged ULS ground truth. The filtering was performed in a two step approach. Firstly, leads were discarded from the statistics whenever their presence in the daily sample was above 5%. Secondly, the daily standard deviation of the ULS dataset with a threshold of 0.2 m was used as a filtering criterion. The latter implied the filtering of approximately ∼ 30% of the ULS measured ice thickness. The filtered ULS ice thickness showed its suitability as ground truth given its accuracy, consistency and continuous record of the ice thickness.

To ensure the statistical representativeness of the derived ground truth, we perform a quality control of the ULS data based on three consistency criteria. First, the ice profile sample (dependent on the amount of ice drift) must be long enough; we imposed a minimum of ∼100 days of sampling which translated into ∼800 km of sampled ice thickness. Second and third, the variability of the SIT profile and SIT field in the perpendicular direction to the profile, respectively, must be low enough; here again, we imposed a threshold of ∼0.2 m in the ice thickness standard deviation. A mean ice drift of ∼7–8 km day−<sup>1</sup> for the entire sampling period was estimated on the location of the moorings using both the in-situ ADCP [24] and the NSIDC sea ice velocity record [32]. The variability of the temporally-averaged thickness was calculated with the daily standard deviation of the dataset. Finally, the latter value was used as a proxy for estimating the variability of the ice in the perpendicular direction of the sampled profile.

Figure 1 shows the Buoy C and the CFDD SIT evolution over the freeze-up period in 2015. There is a good match between the ULS thickness data and the CFDD modeled SIT. The temporal evolution of the SIT ground truth shows an increasing standard deviation of the ice thickness daily mean while spring approaches. The thickness values stop increasing at the beginning of spring. As soon as this happens, the ice thickness daily mean begins to change more rapidly. Note also that the SIC increases rapidly at the beginning of the freeze-up period. The former facts indicate that the most suitable time for carrying out a calibration or assessment of L-band ice thickness is precise during the freeze-up period, after the SIC reaches nominal values around 95%, and before the in-situ ice thickness loses its representativeness of satellite resolution scales (i.e., beginning of spring).

We performed the comparison between the daily SIT ground truth and the University of Hamburg (UH) and University of Bremen (UB) SIT products. The UH product provides an uncertainty and a saturation field to inform the users about the quality of the product. As such, the UH SIT data were filtered by using these two parameters following [2]. The filtering was performed using a 1-m uncertainty and a 90% saturation ratio (i.e., the ratio of the estimated thickness to the maximum theoretical retrieval). This filtering procedure reduces the number of ground truth match-ups to approximately one third of its original size, thus triggering the need to extend the validation dataset using model-based SIT data. The comparison between the ground truth and the CFDD SIT (see Equation (1)) shows that the latter is in a first approximation a good estimate of the former as a reference dataset.

The use of CFDD model-based SIT facilitated the extension of the validation. However, this extension was grounded on the homogeneity hypothesis (i.e., the sea ice growth conditions in other regions are considered similar to those found at the buoys' locations). This hypothesis can only be valid during the freeze-up period when the ice growth is fast and the impact of the ice drift is lower on the CFDD modeled ice thickness.

The scatter plot between the ULS-derived ground truth collected over 7 years during the freeze-up periods (from 2010 to 2016) and the co-located CFDD SIT is shown in Figure 2. The scatter indicates that CFDD SIT is a good proxy of the ground truth having only small deviations from the identity line, the regression factor is ∼0.99 ± 0.01 at 95% confidence level. Indeed, the CFDD SIT represents quite well ice thickness up to 0.3 m. For larger SIT values, a slight overestimation of the CFDD data is noticeable. We attribute this overestimation to the inability of the model to represent events of strong

ice drift that inhibit ice growth for several days. This good correlation between both datasets does not hold for the late winter-early spring months (see Figure 1). Therefore, we only use the CFDD SIT as proxy of the ULS ground truth during the freezing season (i.e., from October to January), and in regions with thin first-year sea ice.

**Figure 1.** (**a**) Location of Woods Hole Oceanographic Institution (WHOI) buoys in the Beaufort Sea. The map uses the Arctic Polar Stereographic projection (EPSG: 3995) [40]. (**b**) Temporal evolution of the ice thickness measured by Buoy C (red dots) and modeled by Cumulative Freezing Degree Days (CFDD) (black line) over the freeze-up period in 2015. (**c**) Comparison between the modeled CFDD and Upward Looking Sonar (ULS) ground truth gathered during the first three months of the freeze-up period.

**Figure 2.** Comparison of the modeled CFDD Sea Ice Thickness (SIT) and the ULS ground truth. Contour lines in scatter plots herein describe the normalized density surface. Contours show a 20% step increase in density in all cases. Pearson's correlation coefficient used herein is defined as the covariance of two variables divided by the product of their standard deviations.

#### **5. Results and Discussion**
