1. Introduction
Atmospheric water vapor plays an important role in atmospheric processes, including the radiation balance, the water cycle, the transfer of energy, and the formation of clouds and precipitation [1,2,3]. Water vapor is also the most abundant greenhouse gas and a critical driver of global weather and climate change [4,5,6]. Integrated precipitable water vapor (IPWV, or PW for short) is the total water vapor contained in an air column from the Earth’s surface to the top of the atmosphere. About 45–65% of the PW lies between the surface and the 850 hPa level (roughly 1500 m), and its distribution is highly variable in time and space [7,8]. Accurate, real-time estimates of the PW at high spatial and temporal resolution are now needed for severe weather forecasting and climate warming studies [9,10].
Many meteorological techniques that have been used for decades to retrieve PW, such as radiosondes, water vapor radiometers, and infrared sounders, are expensive and may give poor data quality and poor spatial coverage [11,12,13]. Moreover, these traditional techniques measure the PW distribution only at coarse scales. These limitations form the main source of error in short-term precipitation forecasts [14]. In contrast, PW estimates have been routinely derived from Global Positioning System (GPS) measurements over the last twenty years, with high internal repeatability, all-weather capability, and low cost [15,16]. The spatial resolution of GPS-derived PW depends on the number of GPS receivers deployed on the Earth’s surface [17,18]. With the ever-increasing density of GPS sites, ground-based GPS PW detection is being strongly promoted in the meteorological field [19]. Many studies have focused on PW estimates obtained with the GPS technique. Ohtani et al. (2000) investigated the accuracy and features of PW estimates obtained from Japanese GPS networks and compared them with radiosonde observations [20]. Kwon et al. (2008) compared GPS PW estimates derived from a permanent ground-based GPS network on the Korean Peninsula with radiosonde measurements [21]. Choy et al. (2015) compared PW derived from ground-based GPS receivers with traditional radiosonde measurements in Australia [22].
GPS radio signals are delayed when they pass through the ionosphere (the upper part of the atmosphere) and the neutral atmosphere. Part of this propagation delay is caused by atmospheric water vapor, which is mostly concentrated in the troposphere, and this delay can be estimated as a function of the temperature, pressure, and water vapor content of the atmospheric layers [23,24]. The problem essentially reduces to knowing the zenith total delay (ZTD), itself divided into two components: the zenith hydrostatic delay (ZHD) and the zenith wet (non-hydrostatic) delay (ZWD) [25,26]. The GPS ZWD can be converted to a PW estimate by a conversion coefficient, which is related to the weighted mean temperature of water vapor (Tm) [27,28,29]. The standard processing is to compute Tm, then the conversion coefficient, and finally PW from the ZWD, i.e., ZTD minus ZHD. From a mathematical point of view, this standard processing defines an iterative procedure, because Tm itself depends on the water vapor content of the atmosphere, for which an a priori guess is needed to determine the PW estimate. In practice, Tm is expressed as a linear function of surface temperature (Ts).
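The processing chain described above (Tm, then the conversion coefficient, then PW from ZTD minus ZHD) can be sketched as follows. This is a minimal illustration, not the processing used in this paper: the function names are hypothetical, and the refractivity constants and the default slope and intercept are the values commonly quoted from Bevis et al., taken here as assumptions.

```python
# Sketch only: constants are commonly quoted literature values (assumed here),
# not necessarily those used by the authors.
RHO_W = 1000.0    # density of liquid water, kg m^-3
R_V = 461.5       # specific gas constant of water vapor, J kg^-1 K^-1
K2_PRIME = 0.221  # refractivity constant k2', K Pa^-1 (22.1 K hPa^-1)
K3 = 3739.0       # refractivity constant k3, K^2 Pa^-1 (3.739e5 K^2 hPa^-1)


def tm_from_ts(ts_kelvin, slope=0.72, intercept=70.2):
    """Linear Tm(Ts) model; the defaults are assumed Bevis-type coefficients."""
    return intercept + slope * ts_kelvin


def conversion_coefficient(tm_kelvin):
    """Dimensionless factor that maps ZWD to PW (same length unit for both)."""
    return 1.0e6 / (RHO_W * R_V * (K3 / tm_kelvin + K2_PRIME))


def pw_from_ztd(ztd_mm, zhd_mm, ts_kelvin):
    """PW (mm) from zenith total and hydrostatic delays (mm) and Ts (K)."""
    zwd_mm = ztd_mm - zhd_mm  # wet delay is what remains after removing ZHD
    return conversion_coefficient(tm_from_ts(ts_kelvin)) * zwd_mm
```

With these assumed constants the conversion coefficient is roughly 0.16 for Ts = 300 K, so a ZWD of 300 mm maps to a PW of a little under 50 mm.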
The ultimate goal is to make the slope and intercept of this linear function almost site-independent. This linear relationship has been explored by many scientists, both globally and regionally, using a priori knowledge of the water vapor content of the atmosphere from radio soundings. In 1992, Bevis et al. [30] came up with Tm = 70.2 + 0.72 Ts (in kelvin) from the analysis of 8718 radiosonde profiles at 13 stations in the US over two years. This relation has been widely used in Northern Hemisphere mid-latitude regions between 27°N and 65°N, and was revised in 1994 by Bevis et al. [31], from a nearly global distribution of about 250,000 radiosonde profiles. In 1998, Mendes et al. [32] proposed another linear Tm model by using about 32,500 radiosonde profiles over one year at 50 sites between 62°S and 83°N. A dedicated model is available for high latitudes [33]. Liou et al. [34], Bai [35], Wang et al. [36], Raju et al. [37], and Emardson et al. [38] also established regional and global linear relationships between Tm and Ts [39,40,41,42].
Of course, if we have a good on-site estimate of the PW from an external source (essentially radio soundings, but also possibly absorption lines of water vapor in the atmosphere or radiometric observations), together with the ZWD from a collocated GPS receiver, then the conversion coefficient and Tm can be derived directly. If, furthermore, a collocated weather station gives us Ts, then a linear regression over a time series of Tm and Ts yields site-dependent values of the slope and intercept. This is basically the approach we applied in this paper, as we had the luxury of a radio sounding station close to the Geodesy Observatory of Tahiti (OGT), which hosts an International GNSS Service (IGS)-grade GPS receiver with weather data.
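The site-specific approach just described can be sketched in two steps: invert the PW/ZWD ratio for Tm, then fit a straight line of Tm against Ts. The sketch below is illustrative only. It assumes the commonly quoted Bevis-type expression for the conversion coefficient and the same refractivity constants as in that literature, and the function names are hypothetical.

```python
# Sketch only: assumed constants, as commonly quoted in the literature.
RHO_W = 1000.0    # density of liquid water, kg m^-3
R_V = 461.5       # specific gas constant of water vapor, J kg^-1 K^-1
K2_PRIME = 0.221  # refractivity constant k2', K Pa^-1
K3 = 3739.0       # refractivity constant k3, K^2 Pa^-1


def tm_from_pw_and_zwd(pw_mm, zwd_mm):
    """Invert PW = coefficient * ZWD for Tm, given collocated PW and ZWD."""
    coefficient = pw_mm / zwd_mm
    return K3 / (1.0e6 / (RHO_W * R_V * coefficient) - K2_PRIME)


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

In use, each radiosonde launch would contribute one (Ts, Tm) pair, and `fit_line` applied to the full time series would return the site-dependent slope and intercept.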
In this paper, the dataset and methodology are described in Section 2. The comparisons of GPS-derived results with the Center for Orbit Determination in Europe (CODE) products [43], as well as with results obtained by using the Saastamoinen model, are given in Section 3. In Section 4, we compare PW values downloaded from the IGRA website with PW values computed from raw balloon data. Our all-seasons Tm model and our season-specific Tm models are compared with the Bevis Tm model in Section 5. In Section 6, we compare and analyze the GPS-derived PW estimates based on our new Tm models against the corresponding radiosonde (RS) PW estimates. Conclusions are presented in Section 7.
4. Comparison of Level 1 RS PW Values with Level 2 RS PW Values
We now assess the internal consistency of the RS data. For this purpose, we recalculate in this section the local PW data from raw balloon measurements (level 2 RS PW in the jargon of the meteorological community) and compare them with the corresponding archived PW data from the IGRA website (level 1 RS PW). We split the level 2 data into two subsets: (a) level 2-A RS PW values, from the surface to the 500 hPa level, and (b) level 2-B RS PW values, from the surface to the maximum altitude of the balloon, roughly 25,500 m on average. We emphasize that the IGRA archive contains only the level 1 data, which are highly compressed by the archive builders, who retained only the “characteristic” points of the raw data profiles, in the sense that the data are treated as varying linearly between these points. Moreover, the archive builders limited, by construction, the data used to compute the archived PW to the layer from the surface to the 500 hPa level. In other words, they assumed that no PW is present above this level, or that this PW is irrelevant for meteorological studies. Put differently, the level 1 data should at least be consistent with the level 2-A data if the compression algorithm is correct, and consistent with the level 2-B data if the assumption that no PW is present above the 500 hPa level is correct. This is the topic of the following sub-sections.
4.1. Comparison of Level 1 RS PW Values with Level 2-A RS PW Values (from the Surface to the 500 hPa Level)
In this section, we compare the archived level 1 RS PW values, which cover the layer from the surface to the 500 hPa level, with the level 2-A RS PW values we computed from the raw balloon data. We only used the 2014–2016 period, as the instrumentation was changed at the end of 2013 at the Météo-France weather station. We recall that the level 1 RS PW values were downloaded from the IGRA archive, and that the level 2-B RS PW values cover the layer from the surface to the maximum altitude of the balloon, far above the tropopause. We recomputed the level 2 data from the raw balloon data stored in the Météo-France archive, but here only up to the 500 hPa level (so-called level 2-A), to assess the reliability of the archived level 1 PW data with regard to this threshold altitude.
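Recomputing PW from a balloon profile up to a chosen pressure level can be sketched as a vertical integration of specific humidity over pressure. The code below is a hedged illustration, not the Météo-France or IGRA processing: it assumes the humidity has already been converted to specific humidity, uses a simple trapezoidal rule, and the function name and the `p_top_pa` parameter are hypothetical.

```python
G = 9.80665  # standard gravity, m s^-2 (assumed constant with height)


def precipitable_water_mm(pressure_pa, specific_humidity, p_top_pa=0.0):
    """Trapezoidal PW (mm) from the surface up to the level p_top_pa.

    pressure_pa must decrease monotonically (surface first);
    specific_humidity is in kg/kg at the same levels.
    """
    column = 0.0  # kg m^-2, numerically equal to mm of liquid water
    levels = list(zip(pressure_pa, specific_humidity))
    for (p0, q0), (p1, q1) in zip(levels, levels[1:]):
        if p0 <= p_top_pa:
            break  # this layer lies entirely above the cut-off level
        if p1 < p_top_pa:
            # Interpolate q linearly to the cut-off pressure.
            q1 = q0 + (q1 - q0) * (p0 - p_top_pa) / (p0 - p1)
            p1 = p_top_pa
        column += 0.5 * (q0 + q1) * (p0 - p1) / G
    return column
```

Calling the function once with `p_top_pa=50000.0` and once with the default gives the analogue of a level 2-A and a level 2-B value for the same launch.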
Figure 4a shows the time series of level 1 RS PW and level 2-A RS PW and the corresponding monthly averaged values from 2014 to 2017. Figure 4b shows the relationship between level 1 RS PW and level 2-A RS PW; their least-squares fit is level_1 = 0.95 × level_2-A − 2.29, with R² = 94.62%. Figure 4a and Table 3 show clearly that the two sets of values are consistent, with a small bias (−0.10 mm) and an RMS of 1.30 mm. This means that the compression algorithm used to save space in the IGRA archive (storage of characteristic points of data profiles) works, but with only a 95% “performance” in our particular case.
4.2. Comparison of Level 2-A RS PW Values (from the Surface to the 500 hPa Level) with Level 2-B RS PW Values (from the Surface to the Maximum Altitude of the Balloon)
Now, we compare the same level 2-A data, taken from the ground to about 5500 m, with the level 2-B data, i.e., data taken from the ground to the maximum altitude of the balloon (over 25,500 m). Figure 5a maps level 2-A versus level 2-B RS PW values over three years, with basically two balloon launches per day. The level 2-B RS PW values are always larger than the corresponding level 2-A values. The statistics of their differences can be found in Table 4: the bias is 2.75 mm, the STD is 2.32 mm, and the RMS is 3.60 mm. To better understand the relationship between the balloon altitude and the PW, we map in Figure 5b the level 2 data from a characteristic balloon launch at the Météo-France site, close to our observatory. The balloon reaches its maximum altitude, quasi-linearly with time, in about one hour. It is also clear from Figure 5b that the PW content, integrated from the surface to the current altitude of the balloon, reaches a horizontal asymptote, and that this asymptote is reached close to, but not exactly at, the 500 hPa level. About 95% of the maximum value of the PW is acquired after 15 min of ascent, and the last 5% in the remaining part of the ascent.
Figure 5c shows the relationship between level 2-A RS PW values and level 2-B RS PW values. Their least-squares linear fit is level_2-B = 1.09 × level_2-A − 1.35, with R² = 94.46%. Figure 5d shows the relationship between level 1 RS PW values and level 2-B RS PW values; their least-squares linear fit is level_2-B = 1.14 × level_1 − 3.29, with R² = 98.46%.
Figure 5c and its linear fit indicate that, on average, the level 2-A values, in our case, underestimate the PW by about 9% with respect to the level 2-B values. One can argue that this difference is caused by the lateral drift of the balloon, which can reach tens of kilometers by the time the balloon bursts. But the direction of the drift is highly variable, depending on the wind conditions during the ascent, so this drift is essentially the cause of the scatter of the data points (about 2000 balloon launches) in Figure 5c rather than of the systematic difference. Therefore, our conclusion remains valid in the mean. In Figure 5d we compare level 1 with level 2-B. The conclusion is the same, but in this case we add the “noise” coming from the compression algorithm used to build the level 1 data. We emphasize that all these conclusions hold only for our site, which presents high values of PW in the atmosphere, up to 60 mm, and sometimes even more. The level of 500 hPa was fixed decades ago by meteorologists for weather forecasting in temperate areas, and it is simply too low for tropical areas, contrary to the opinion of [57]. In tropical and equatorial areas, the water vapor layer reaches higher altitudes than in mid-latitude areas, with appreciable quantities up to 8 km [58].
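The comparison statistics used in this section (bias, STD, RMS, and the R² of a linear fit) can be computed as in the sketch below. It is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, and the STD is taken as the population standard deviation of the differences.

```python
import math


def difference_stats(a, b):
    """Bias, standard deviation and RMS of the differences a - b."""
    d = [ai - bi for ai, bi in zip(a, b)]
    n = len(d)
    bias = sum(d) / n
    # Population standard deviation (divides by n); an assumption here.
    std = math.sqrt(sum((di - bias) ** 2 for di in d) / n)
    rms = math.sqrt(sum(di ** 2 for di in d) / n)
    return bias, std, rms


def r_squared(x, y):
    """Coefficient of determination of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy * sxy / (sxx * syy)
```

With this definition, RMS² = bias² + STD². The values quoted above for the level 2-A versus level 2-B comparison are consistent with it, since √(2.75² + 2.32²) ≈ 3.60 mm.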
7. Conclusions and Outlook
In this research, we investigated, from a metrological point of view, the reliable retrieval of Integrated Precipitable Water Vapor (IPWV, or PW for short) from GPS data on the tropical island of Tahiti. For this purpose, we cross-checked our GPS-derived estimates against the radio sounding estimates from a nearby site, and also checked the internal consistency of both the radio sounding estimates and the GPS estimates, in contrast to many studies that assume that the radio sounding data taken from the IGRA archive are “the” reference. In a first step, we checked the internal consistency of the modeling of the ZTD and ZWD estimates from our GPS data processing with respect to the CODE products and the Saastamoinen model. In a second step, we assessed the internal reliability of the RS data. The first dataset we used in this second step is the archived PW data from the IGRA website (level 1 RS PW, from the surface to the 500 hPa level). The second dataset, from the Météo-France archive, is the PW data from raw balloon measurements close to our site, sub-divided into two subsets: (a) level 2-A RS PW, from the surface to the 500 hPa level, and (b) level 2-B RS PW, from the surface to the maximum altitude of the balloon. Our analysis clearly indicates that taking the IGRA archive as a reference can lead to an underestimation of the water vapor present in the atmosphere in tropical areas (up to 9% in our case), and that only the RS level 2-B data can be compared with the GPS-derived estimates in a consistent way. In a third step, we developed new models of the mean temperature Tm of the atmosphere with regard to its water vapor content, based on a combination of level 2-B RS PW values and GPS ZWD values over Tahiti. The new all-seasons and season-specific (dry and wet) Tm models were derived by using meteorological data from the gridded VMF1 files. The results show that our all-seasons Tm model and season-specific (dry and wet) Tm models are more reliable at our site (Tahiti) than the Bevis et al. model. In a fourth and last step, we compared our GPS PW estimates based on our new Tm models with level 2-B RS PW values. The results show that the PW derived from GPS data with a “good” Tm model is highly accurate and, as a final consequence, confirm that the threshold of 500 hPa (5500 m) is the main source of bias between the GPS PW estimates and the RS PW estimates taken from the IGRA archive in tropical areas.
Historically, the algorithms that are now widely employed to derive PW estimates from GNSS propagation delays can be traced back to the work of Owens in 1967 [61] on the optical refractive index of air, then Thayer in 1974 [62] on improved formulas, followed by Davis et al. in 1985 [27], who introduced the concept of the mean temperature Tm of the atmosphere with respect to its water vapor content. The last essential brick of the edifice was laid by Bevis et al. in 1992 [30], in the form of a linear relationship between the surface temperature Ts at the GNSS receiver site and the mean temperature Tm of the overlying column of the atmosphere, together with the introduction of the proportionality constant between the PW estimate and the ZWD estimate. On the GNSS side, Marini in 1972 [63] pioneered the introduction of mapping functions, which now culminate in mapping functions tailored to each site and each epoch, with the VMF1 family of mapping functions and the online vmf1_g; the introduction of PPP orbits and precise clock information greatly reduced these sources of uncertainty in the propagation delays [48,50]. A survey of mapping functions can be found in [64]. Nevertheless, the analysis done in this paper reveals some drawbacks, which can be corrected and/or investigated further.
The first one, as mentioned above, is linked to the use, by most authors, of the IGRA database, which assumes an air column limited to the 500 hPa level, i.e., roughly 5500 m. As the scale height of water vapor is about 2000 m worldwide [65], compared with a scale height of 8000 m for the whole atmosphere, this sounds reasonable, but this measurement threshold is probably too low in tropical and equatorial areas, where most of the water vapor exchanges between the ocean and the atmosphere take place, as evidenced by our analysis in Section 4. Secondly, the definition of Tm, as given by Davis et al. [27], is also derived under the hypothesis that the water vapor content is not too high [27]. In the paper of Bevis et al. [30], it is also apparent in their plot of Tm versus Ts that the scatter of these pairs of values is larger for higher values of Ts [30]. This is supported by the study done in this paper in Section 5, where we found different linear relationships between Tm and Ts, with a good correspondence with the radio soundings taken up to the maximum altitude of the balloons, always roughly around 25,500 m. The paper of Bevis et al. gives no indication of the altitude threshold of the measurements used to determine the PW estimates, but it was likely also 5500 m. We have to stress that our linear relationships were determined only over a three-year period and, moreover, that they cover only an 8 K range between the temperature extrema (294 K to 302 K) found on the small island of Tahiti, which is surrounded by the thermal buffer of the deep South Pacific Ocean, compared with the 80 K range of Ts explored by Bevis et al. This nevertheless points to the fact that the whole chain of state equations of the atmosphere and its optical properties must be revisited for high water vapor content, not only by adding more GNSS and radio sounding observations in tropical areas, but also by going back to the laboratory to check again the behavior of the atmosphere for high water vapor content, and maybe by redefining the concept of Tm. This is of course far beyond the scope of this study. It is our opinion that trying to derive better empirical-only relations between Tm and Ts, for example by adding a seasonal variation (probably related to the lack of fit at the ends of the QQ plot line in Figure 8), as done by Yao et al. [29] and Bai [35], is going nowhere without a firm understanding of the physics of the atmosphere for high water vapor content.
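The scale-height argument made above about the 500 hPa threshold can be made concrete with a short calculation. The sketch below assumes an idealized exponential decrease of water vapor density with height, which is a simplification introduced here for illustration and not a model used in this paper. The function names are hypothetical.

```python
import math


def fraction_above(height_m, scale_height_m):
    """Fraction of the column water vapor lying above height_m,
    assuming density proportional to exp(-z / scale_height_m)."""
    return math.exp(-height_m / scale_height_m)


def scale_height_for_fraction(height_m, fraction):
    """Scale height (m) for which the given fraction lies above height_m."""
    return height_m / math.log(1.0 / fraction)
```

Under this idealization, a 2000 m scale height leaves about 6% of the column above 5500 m, and that fraction grows quickly as the scale height increases, which is the qualitative point made for tropical and equatorial areas.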