#### 2.1.1. Savitzky–Golay Filter

This filter is named after Abraham Savitzky and Marcel Golay, who introduced it as a way to smooth noise in data from a chemical spectrum analyzer. It belongs to the category of low-pass time-domain filters that smooth out high data variability [16] and is used in many applications, such as electrocardiogram denoising [17], vegetation monitoring, and the analysis of GNSS-TEC changes. The filter operates by convolution: a polynomial is fitted by least squares to successive subsets of the data within a given time window [18]. The formulae and a detailed explanation are given in the original work of Savitzky, et al. [19].
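To make the convolution view concrete, the following is a minimal sketch and not the implementation used in this study. It assumes NumPy, uniformly sampled data, an odd window length, and edge padding at both ends of the series; the function names are illustrative.

```python
import numpy as np

def savgol_coefficients(window_length, order):
    """Weights that return the least-squares polynomial fit at the window centre."""
    if window_length % 2 == 0 or window_length <= order:
        raise ValueError("window_length must be odd and larger than order")
    half = window_length // 2
    offsets = np.arange(-half, half + 1, dtype=float)
    # Design matrix with columns 1, k, k^2, ..., k^order for each offset k.
    design = np.vander(offsets, order + 1, increasing=True)
    # Row 0 of the pseudo-inverse yields the fitted constant term,
    # which is the value of the fitted polynomial at k = 0.
    return np.linalg.pinv(design)[0]

def savgol_smooth(values, window_length, order):
    """Smooth a uniformly sampled series by convolving it with the weights."""
    weights = savgol_coefficients(window_length, order)
    half = window_length // 2
    # Assumption: repeat the end values so the output keeps the input length.
    padded = np.pad(np.asarray(values, dtype=float), half, mode="edge")
    # np.convolve flips its kernel, so the reversed weights are passed.
    return np.convolve(padded, weights[::-1], mode="valid")
```

For a 5-point window and a quadratic fit, `savgol_coefficients(5, 2)` reproduces the classic weights (−3, 12, 17, 12, −3)/35. Away from the padded ends, any polynomial of degree no higher than `order` passes through the filter unchanged.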

#### 2.1.2. Polynomial Filtering

This kind of filter first approximates the entire data set with a single polynomial of a given order. Given a time series F(t) measured at the points x1, x2, x3, ..., xi, the polynomial *P* is given by Equation (1):

$$P_n(X) = q_1 X^n + q_2 X^{n-1} + \dots + q_n X + q_{n+1} \tag{1}$$

where the *q* terms are the coefficients of *P*, estimated by least squares, and *n* is the order. The residuals, that is, the differences between F(t) and *P*(*X*), are then computed to remove gross effects from F(t), leaving variations from which useful information can be obtained. An example using TEC time series can be found in Rahmani, et al. [11].
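The fit-and-subtract step can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the cited works: it relies on NumPy's `polyfit` and `polyval`, and it rescales the time axis to [−1, 1] before fitting, a conditioning choice that is not taken from the text. The function name is hypothetical.

```python
import numpy as np

def polynomial_detrend(t, f, order):
    """Fit a polynomial of the given order to f(t) and return (residuals, trend)."""
    t = np.asarray(t, dtype=float)
    f = np.asarray(f, dtype=float)
    # Assumption: map t onto [-1, 1] so that high orders stay well conditioned.
    x = 2.0 * (t - t.min()) / (t.max() - t.min()) - 1.0
    # q[0] multiplies x**order, matching q_1 in Equation (1).
    q = np.polyfit(x, f, order)
    trend = np.polyval(q, x)
    return f - trend, trend
```

Calling `polynomial_detrend(t, f, 3)` on a series that is itself a cubic returns residuals that are zero to within rounding error, which is a convenient sanity check before applying higher orders.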

### *2.2. GNSS Data*

The local GNSS network was used to characterize the TEC changes in Hong Kong. The network, referred to as Hong Kong Satellite Reference (HK SatRef), covers the entire Hong Kong area. Information on the network is given in the work of Ji, et al. [20]. More details can be obtained from the Hong Kong Survey Department website (https://www.geodetic.gov.hk/en/rinex/downv.aspx (accessed on 14 June 2019)). The American Global Positioning System (GPS) constellation, comprising thirty-two satellites, was used. Figure 1 shows the study area and GNSS network.

**Figure 1.** Study area showing the Hong Kong GNSS network. GNSS receivers are shown as colored triangles and circles, with their names written beside them. The image was obtained from the Hong Kong Geodetic Survey Department website (https://www.geodetic.gov.hk/en/satref/satref.htm, accessed on 14 June 2019).

In order to detect ionospheric irregularities caused by lightning, the well-known geometry-free linear combination of pseudorange and carrier-phase observations was first used to compute the slant TEC (STEC) from observations at a sampling interval of 30 s. An elevation cutoff angle of 15° was set to mitigate the multipath effect [9]. The computed STEC was converted to vertical TEC (VTEC) by applying the mapping function in Equation (2) below, where *Re* is the Earth's radius, *θ* is the elevation angle at the ionosphere pierce point (IPP) of the satellite–receiver path, and *hi* is the height of the ionospheric single layer, approximated at 350 km.

$$\text{VTEC} = \sqrt{1 - \left(\frac{R_e \cos \theta}{R_e + h_i}\right)^2} \; \text{STEC} \tag{2}$$
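As an illustration of Equation (2) together with the elevation cutoff, a minimal sketch follows. The Earth radius of 6371 km is an assumed mean value that the text does not state; the 350 km shell height and the 15° cutoff are taken from the text. The function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0      # assumption: mean Earth radius, not given in the text
SHELL_HEIGHT_KM = 350.0       # single-layer height from the text
ELEVATION_CUTOFF_DEG = 15.0   # elevation cutoff from the text

def mapping_factor(elevation_deg, re_km=EARTH_RADIUS_KM, hi_km=SHELL_HEIGHT_KM):
    """Square-root factor of Equation (2) for a given elevation angle in degrees."""
    ratio = re_km * math.cos(math.radians(elevation_deg)) / (re_km + hi_km)
    return math.sqrt(1.0 - ratio**2)

def vtec_from_stec(stec_tecu, elevation_deg):
    """Convert STEC to VTEC; observations below the cutoff are rejected (None)."""
    if elevation_deg < ELEVATION_CUTOFF_DEG:
        return None
    return mapping_factor(elevation_deg) * stec_tecu
```

At the zenith the factor equals 1, so VTEC equals STEC. At the 15° cutoff the factor falls to roughly 0.4 under the assumed radius, so low-elevation STEC values are scaled down the most.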

VTEC was then detrended using the two detrending methods stated above to get detrended TEC (DTEC), using Equation (3):

$$\text{DTEC}_{model} = \text{VTEC}_{org} - \text{VTEC}_{model} \tag{3}$$

where VTEC*org* is the original VTEC, VTEC*model* is the VTEC obtained from the fitting model, and DTEC*model* is the resulting detrended TEC. VTEC and DTEC are expressed in Total Electron Content Units (TECU; 1 TECU = 10¹⁶ e/m²). Orders of 3 and 6 and time window lengths of 30, 60, 90 and 120 min [13,21,22] were selected for the Savitzky–Golay filter, and orders of 3, 5 [23,24], 6 [1,12,25], and 10 [11] were used for polynomial fitting. The selected parameters for detrending are summarized in Table 1.
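The parameter choices above and Equation (3) can be written down as a small configuration sketch. The orders, window lengths, and 30 s sampling interval come from the text; rounding each window up to an odd number of samples is an assumption made here so that a Savitzky–Golay window has a central point, and the names are illustrative.

```python
from itertools import product

SAMPLING_INTERVAL_S = 30             # sampling interval from the text
SG_ORDERS = (3, 6)                   # Savitzky-Golay orders from the text
SG_WINDOWS_MIN = (30, 60, 90, 120)   # Savitzky-Golay window lengths in minutes
POLY_ORDERS = (3, 5, 6, 10)          # polynomial orders from the text

def window_samples(window_min, sampling_s=SAMPLING_INTERVAL_S):
    """Window length in samples, rounded up to an odd count (assumption)."""
    n = window_min * 60 // sampling_s
    return n if n % 2 == 1 else n + 1

# One entry per detrending configuration: (method, order, window in samples).
configs = [("savitzky-golay", order, window_samples(w))
           for order, w in product(SG_ORDERS, SG_WINDOWS_MIN)]
configs += [("polynomial", order, None) for order in POLY_ORDERS]

def detrend(vtec_org, vtec_model):
    """Equation (3): DTEC = VTEC_org - VTEC_model, sample by sample."""
    return [org - model for org, model in zip(vtec_org, vtec_model)]
```

Under these assumptions a 30 min window corresponds to 61 samples, and the list holds twelve configurations in total: eight for the Savitzky–Golay filter and four for polynomial fitting.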

### *2.3. Thunderstorm/Lightning Data*

At very low frequency and low frequency (VLF/LF), lightning discharges produce electric current in the lower D layer of the ionosphere [26]. Lightning data were obtained from a local VLF/LF network in the low-latitude region of southern China. The total current generated is strongly correlated with lightning activity [27]; a day with a lightning count greater than 10,000 was deemed a "lightning day".


**Table 1.** Detrending methods and their selected parameters.

### *2.4. Selection Criteria*

A total of nine days in July and August 2015 were used in this study. The days were grouped into three sets of three. The first (9th to 11th July) and third (1st to 3rd August) sets each comprise three consecutive non-lightning days, before and after the second set (17th to 19th July) of lightning days, respectively. The lightning counts for the days are as follows: 319, 1277 and 91 for 9th to 11th July; 200,435, 75,078, and 33,709 for 17th to 19th July; and 6875, 1775 and 1589 for 1st to 3rd August. All days were free of geomagnetic storms and elevated solar activity: the disturbance storm time (Dst) index remained above −30 nT [28] and the solar activity index (F10.7) remained below 150 sfu [8]. Figure 2 shows the Dst and F10.7 indices for the set of days.

**Figure 2.** Dst (panels **a**–**c**) and F10.7 (panels **d**–**f**) indices for the set of days. The Dst index is greater than −30 nT and the F10.7 index is less than 150 sfu, indicating that the days were free of geomagnetic storms and elevated solar activity.
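The day selection can be expressed as a short sketch. The lightning counts and the three thresholds are taken from the text; the function names and the ISO date keys are illustrative, and no Dst or F10.7 values are included because the text does not list them.

```python
LIGHTNING_COUNT_THRESHOLD = 10_000   # counts above this define a lightning day
DST_QUIET_MIN_NT = -30.0             # Dst must stay above this value
F107_QUIET_MAX_SFU = 150.0           # F10.7 must stay below this value

def is_lightning_day(count):
    """True when the daily lightning count exceeds the threshold."""
    return count > LIGHTNING_COUNT_THRESHOLD

def is_quiet(dst_nt, f107_sfu):
    """True when neither a geomagnetic storm nor high solar activity is indicated."""
    return dst_nt > DST_QUIET_MIN_NT and f107_sfu < F107_QUIET_MAX_SFU

# Daily lightning counts reported in the text.
lightning_counts = {
    "2015-07-09": 319, "2015-07-10": 1277, "2015-07-11": 91,
    "2015-07-17": 200_435, "2015-07-18": 75_078, "2015-07-19": 33_709,
    "2015-08-01": 6875, "2015-08-02": 1775, "2015-08-03": 1589,
}

lightning_days = [day for day, count in lightning_counts.items()
                  if is_lightning_day(count)]
```

With the reported counts, only 17th to 19th July exceed the threshold; 1st August, at 6875 counts, remains a non-lightning day.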

In determining which detrending method was most suitable for detecting lightning and distinguishing lightning days from non-lightning days, the detection and distinguishing conditions (2DC) approach was used. For the detection condition, because the electrical discharge from lightning takes about three hours to travel to higher ionospheric heights, a change in DTEC amplitude, mostly an increase, may be observed about 3 h after lightning occurrence and last 1–2 h. A detrending method that shows this change is considered able to detect lightning activity.

For the distinguishing condition, the behaviour of the DTEC amplitude on non-lightning days was checked against that on lightning days. A non-lightning day is expected to show a roughly constant absolute maximum DTEC amplitude throughout, as there is little or no lightning, and no other space weather event, to cause changes. This constant value is then set as the threshold against which lightning days are assessed. As indicated under the detection condition, lightning is expected to change the DTEC amplitude. The absolute maximum value on a lightning day is compared with the threshold from a non-lightning day: if it exceeds the threshold, the lightning day is distinguished from the non-lightning days; otherwise it is not. A detrending method that achieves this is said to have both detected lightning activity and distinguished lightning days from non-lightning days.

The 2DC approach can be illustrated with the following example. Day 1 is a non-lightning day, Day 2 is a lightning day, and the detrending method is DM. First, DM is used to detrend TEC on Day 1; the resulting DTEC lies mostly within ±0.5 TECU. Next, TEC on Day 2 is detrended; DTEC is initially within ±0.1 TECU but increases to ±0.5 TECU at the time of lightning. At this point, DM has met the detection condition, and hence is able to detect lightning, but not the distinguishing condition, as the DTEC maxima for both days are the same. If the Day 2 DTEC at the time of lightning had instead increased to, for instance, ±1 TECU, DM would have successfully detected lightning and distinguished Day 2 from Day 1. On the other hand, should the Day 2 DTEC remain within ±0.1 TECU throughout the entire period, DM could neither detect lightning nor distinguish lightning days from non-lightning days. Figure 3 shows the flow chart of 2DC.

**Figure 3.** Flow chart showing the detection and distinguishing conditions (2DC).
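The two conditions can be sketched in a few lines. This is one possible reading of the 2DC logic, not the authors' implementation: it assumes that the lightning-day DTEC has already been split into samples before and after the expected response time, and that each amplitude is summarized by its absolute maximum. The function name and arguments are hypothetical.

```python
def two_dc(dtec_non_lightning, dtec_before, dtec_after):
    """Return (detected, distinguished) for one detrending method.

    dtec_non_lightning: DTEC samples from a non-lightning day (sets the threshold)
    dtec_before:        lightning-day DTEC before the expected response
    dtec_after:         lightning-day DTEC during the expected response
    """
    threshold = max(abs(v) for v in dtec_non_lightning)
    baseline = max(abs(v) for v in dtec_before)
    peak = max(abs(v) for v in dtec_after)
    # Detection condition: the amplitude increases after lightning.
    detected = peak > baseline
    # Distinguishing condition: the increased amplitude exceeds the threshold.
    distinguished = detected and peak > threshold
    return detected, distinguished
```

Using the amplitudes from the example in the text, a quiet day within ±0.5 TECU and a lightning day rising from ±0.1 to ±0.5 TECU gives `(True, False)`, a rise to ±1 TECU gives `(True, True)`, and a lightning day that stays within ±0.1 TECU gives `(False, False)`.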

The evaluation was done on the basis of each satellite–receiver pair rather than an average of TEC over a station, in order to obtain greater detail. Satellites passing from the time of lightning occurrence to about 3 h afterwards were investigated.
