*2.5. Basic Principle of Wavelet Transform*

During time-series data acquisition, observation error, systematic error, and other factors introduce noise into the data, and this noise can seriously affect the results of data processing. Therefore, in the data preprocessing stage, a denoising method should be selected according to the type of noise. Common denoising methods include the Fourier transform [67] and the wavelet transform [68].

The Fourier transform is a widely used analysis method in the field of signal processing. It converts a time-domain signal into a frequency-domain signal by decomposing it into a superposition of sine waves with different frequencies. However, the Fourier transform has important disadvantages. The traditional Fourier transform only realizes a global transformation between the time domain and the frequency domain, so it cannot indicate when a given frequency component occurs. It is therefore suitable only for stationary signals; most real signals vary over time, which significantly limits the application of the Fourier transform.
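The loss of time information can be illustrated with a minimal sketch. It assumes a naive discrete Fourier transform written in pure Python; the helper names `dft_magnitudes` and `burst` and the test signal are hypothetical and are not taken from this study. A short burst placed early or late in the record yields the same magnitude spectrum.

```python
import cmath
import math

def dft_magnitudes(x):
    """Naive O(M^2) discrete Fourier transform; returns |X[k]| for every frequency bin."""
    m = len(x)
    mags = []
    for k in range(m):
        s = sum(x[n] * cmath.exp(-2j * math.pi * k * n / m) for n in range(m))
        mags.append(abs(s))
    return mags

def burst(start, length=8, total=64):
    """A short oscillation that begins at sample `start` and is zero elsewhere."""
    return [
        math.sin(2 * math.pi * (n - start) / 4) if start <= n < start + length else 0.0
        for n in range(total)
    ]

early = burst(start=4)   # event near the beginning of the record
late = burst(start=40)   # the same event, much later

mag_early = dft_magnitudes(early)
mag_late = dft_magnitudes(late)

# The magnitude spectra coincide up to rounding error, so the spectrum
# alone cannot tell when the burst happened.
max_difference = max(abs(a - b) for a, b in zip(mag_early, mag_late))
print(max_difference)
```

The timing of the burst is encoded only in the phase of the transform, which is why a magnitude spectrum is of limited use for signals whose content changes over time.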

The basic idea of the wavelet transform is to adaptively adjust the time-frequency window according to the signal: by stretching and translating a wavelet, the original signal is decomposed into a series of sub-band signals with different spatial resolutions, frequency characteristics, and directional characteristics. These sub-bands have good local characteristics in both the time and frequency domains and can therefore be used to represent the local characteristics of the original signal, thus enabling the localization of the signal in time and frequency. This method can overcome the limitations of Fourier analysis in dealing with non-stationary signals and complex images.

The mathematical definition of a wavelet is as follows. Let *ψ* ∈ *L*<sup>2</sup>(*R*) ∩ *L*<sup>1</sup>(*R*), with Fourier transform

$$\hat{\psi}(\omega) = \frac{1}{\sqrt{2\pi}} \int\_{-\infty}^{+\infty} \psi(t) e^{-i\omega t} dt$$

If $\hat{\psi}$ satisfies the admissibility condition

$$C\_\psi = \int\_{-\infty}^{+\infty} \frac{|\hat{\psi}(\omega)|^2}{|\omega|} d\omega < +\infty$$

(which requires $\hat{\psi}(0) = 0$), then *ψ* is a wavelet. The fast wavelet transform also scales better than the fast Fourier transform. When the signal length is *M*, the computational complexity of the fast Fourier transform is *O*(*M* log<sub>2</sub> *M*), whereas that of the fast wavelet transform is *O*(*M*).
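The difference between the two growth rates can be made concrete with a rough operation count. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the filter length of 10 taps is chosen arbitrarily, and the counts are idealized multiply-add and butterfly totals rather than measured run times.

```python
import math

def fwt_operations(signal_length, filter_length, levels):
    """Count multiply-adds in a pyramid (fast wavelet) decomposition.

    Each level filters the current approximation with a low-pass and a
    high-pass filter and keeps every second output, so the input to the
    next level is half as long.
    """
    operations = 0
    length = signal_length
    for _ in range(levels):
        outputs_per_filter = length // 2
        operations += 2 * outputs_per_filter * filter_length
        length = outputs_per_filter
    return operations

def fft_operations(signal_length):
    """Idealized radix-2 FFT cost: (M / 2) * log2(M) butterflies."""
    return (signal_length // 2) * int(math.log2(signal_length))

FILTER_LENGTH = 10  # hypothetical number of filter taps, for illustration only

per_sample_costs = []
for exponent in (10, 14, 18):
    m = 2 ** exponent
    fwt = fwt_operations(m, FILTER_LENGTH, levels=exponent)
    fft = fft_operations(m)
    per_sample_costs.append((m, fwt / m, fft / m))
    print(m, round(fwt / m, 2), fft / m)
```

The per-sample cost of the wavelet decomposition stays below twice the filter length however long the signal is, while the per-sample cost of the FFT keeps growing with log<sub>2</sub> *M*. For short signals the constant factor can still make the FFT count smaller, so the comparison concerns growth rate rather than absolute speed.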

Wavelet transform can be divided into continuous wavelet transform (CWT) and discrete wavelet transform (DWT).

The formula of continuous wavelet transform is:

$$\mathcal{W}\_f(a,b) = \frac{1}{\sqrt{a}} \int\_{-\infty}^{+\infty} f(t) \overline{\psi(\frac{t-b}{a})} dt \tag{2}$$

where *W<sub>f</sub>*(*a*,*b*) is the continuous wavelet coefficient, *a* is the scaling factor, *b* is the translation factor, $\overline{\psi(\frac{t-b}{a})}$ is the complex conjugate of $\psi(\frac{t-b}{a})$, and *f*(*t*) represents the original data. Adjusting *a* changes the scale (width) of the wavelet and adjusting *b* changes its position along the time axis, which realizes the adaptive time-frequency analysis of the signal.
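Equation (2) can be approximated numerically for a single pair (*a*, *b*). The following is a minimal sketch, assuming a real-valued Mexican hat wavelet (not the wavelet used in this study), a hypothetical test signal, and a simple midpoint Riemann sum over a finite interval.

```python
import math

def mexican_hat(t):
    """Real-valued Mexican hat (Ricker) wavelet, up to a normalization constant."""
    return (1.0 - t * t) * math.exp(-t * t / 2.0)

def cwt_coefficient(f, a, b, t_min=-20.0, t_max=20.0, steps=4000):
    """Approximate W_f(a, b) from Equation (2) with a midpoint Riemann sum.

    The wavelet is real, so its complex conjugate equals the wavelet itself.
    """
    dt = (t_max - t_min) / steps
    total = 0.0
    for i in range(steps):
        t = t_min + (i + 0.5) * dt
        total += f(t) * mexican_hat((t - b) / a)
    return total * dt / math.sqrt(a)

def signal(t):
    """Hypothetical test signal: a single bump centred at t = 3."""
    return math.exp(-((t - 3.0) ** 2) / 2.0)

aligned = cwt_coefficient(signal, a=1.0, b=3.0)   # wavelet sits on the bump
shifted = cwt_coefficient(signal, a=1.0, b=8.0)   # wavelet is far from the bump
print(aligned, shifted)
```

The coefficient is large in magnitude when the translated wavelet overlaps the bump and close to zero when it does not, which is the localization property that the translation factor *b* provides.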

The discrete wavelet transform formula is:

$$\mathcal{W}\_f(j,k) = \int\_{-\infty}^{+\infty} f(t) \frac{\overline{\psi(\frac{t}{a\_0^j} - kb\_0)}}{\sqrt{a\_0^j}} dt \tag{3}$$

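where *W<sub>f</sub>*(*j*,*k*) is the discrete wavelet coefficient and *f*(*t*) is the original data. Equation (3) follows from Equation (2) by sampling the scale and translation on a discrete grid, *a* = *a*<sub>0</sub><sup>*j*</sup> and *b* = *k* *b*<sub>0</sub> *a*<sub>0</sub><sup>*j*</sup>, with integer indices *j* and *k*.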
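The discretization can be pictured as a grid of scale and shift values indexed by (*j*, *k*). The sketch below assumes the common dyadic choice *a*<sub>0</sub> = 2 and *b*<sub>0</sub> = 1; the function name `dyadic_grid` and the signal length of 16 samples are hypothetical.

```python
def dyadic_grid(max_level, signal_length, a0=2.0, b0=1.0):
    """Map index pairs (j, k) to scale a = a0**j and shift b = k * b0 * a0**j."""
    grid = {}
    for j in range(1, max_level + 1):
        scale = a0 ** j
        step = b0 * scale          # spacing between neighbouring shifts at level j
        shifts = []
        k = 0
        while k * step < signal_length:
            shifts.append(k * step)
            k += 1
        grid[j] = (scale, shifts)
    return grid

grid = dyadic_grid(max_level=3, signal_length=16)
for j, (scale, shifts) in grid.items():
    print(j, scale, shifts)
```

Each increase of *j* doubles the scale and halves the number of shifts, so coarse scales are sampled sparsely in time and fine scales densely.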

The Daubechies (db*N*) family is among the most commonly used wavelets and is mainly applied in the discrete wavelet transform. A wavelet of finite length, when applied in the fast wavelet transform, is described by two sequences of real numbers. One contains the coefficients of the high-pass filter, which is called the wavelet filter, and the other contains the coefficients of the low-pass filter, which is called the scaling filter. First, the wavelet transform decomposes the original data into the low-frequency wavelet coefficient *cA<sub>n</sub>* and the high-frequency wavelet coefficients *cD*<sub>1</sub>, *cD*<sub>2</sub>, ..., *cD<sub>n</sub>* by using the low-pass filter and high-pass filter, respectively. The low-frequency wavelet coefficient can be decomposed further, and this is iterated until the maximum decomposition level is reached. Finally, the decomposed low-frequency and high-frequency signals are added to realize wavelet reconstruction. The formula is:

$$f(t) = cA\_n l(\psi\_{ik}(t)) + \sum\_{i=1}^n cD\_i h(\psi\_{ik}(t)) \tag{4}$$

where *f*(*t*) is the restored signal; *l*(*ψ<sub>ik</sub>*(*t*)) and *h*(*ψ<sub>ik</sub>*(*t*)) are the low-pass filter and high-pass filter, respectively; *cA<sub>n</sub>* is the low-frequency wavelet coefficient at level *n*; and *cD<sub>i</sub>* is the high-frequency wavelet coefficient at level *i*.
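The decomposition into *cA<sub>n</sub>*, *cD<sub>n</sub>*, ..., *cD*<sub>1</sub> and the reconstruction of Equation (4) can be sketched in a few lines. For brevity the sketch uses the Haar wavelet (db1), whose filters have only two taps, instead of the db5 wavelet adopted in this study; the function names and the sample values are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(approx):
    """One level: the low-pass output is the next approximation, the high-pass output is the detail."""
    low = [(approx[i] + approx[i + 1]) / SQRT2 for i in range(0, len(approx), 2)]
    high = [(approx[i] - approx[i + 1]) / SQRT2 for i in range(0, len(approx), 2)]
    return low, high

def haar_decompose(signal, levels):
    """Return [cA_n, cD_n, ..., cD_1]; the signal length must be divisible by 2**levels."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)   # only the approximation is decomposed again
        details.append(detail)
    return [approx] + details[::-1]

def haar_reconstruct(coeffs):
    """Invert haar_decompose by merging approximation and detail, coarsest level first."""
    approx = coeffs[0]
    for detail in coeffs[1:]:
        merged = []
        for low, high in zip(approx, detail):
            merged.append((low + high) / SQRT2)
            merged.append((low - high) / SQRT2)
        approx = merged
    return approx

data = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
coeffs = haar_decompose(data, levels=3)
restored = haar_reconstruct(coeffs)
print([len(c) for c in coeffs])
print(restored)
```

In a denoising setting, the detail coefficients would typically be modified (for example, thresholded) between the two calls; with unmodified coefficients the reconstruction returns the original data up to rounding error.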

The calculation steps of wavelet transform are as follows:

Step 1. Select the wavelet function and align it with the starting point of the signal to be analyzed.

Step 2. Calculate the degree of approximation between the signal to be analyzed and the wavelet function at this position; that is, the wavelet transform coefficient *C*. The larger the coefficient *C*, the more similar the current signal segment is to the waveform of the selected wavelet function.

Step 3. Move the wavelet function one time unit to the right along the time axis, and then repeat Steps 1 and 2 to calculate the transformation coefficient *C* until the whole signal length is covered.

Step 4. Scale the selected wavelet function by one unit, and then repeat Steps 1–3.

Step 5. Repeat Steps 1–4 for all expansion scales.
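The steps above can be sketched as two nested loops, one over scales and one over shifts. This is a simplified illustration only: it assumes a Haar-shaped analysing wavelet, a sampled signal, and hypothetical names (`haar_shape`, `scan_signal`), and it is not the implementation used in this study.

```python
def haar_shape(width):
    """Hypothetical analysing wavelet: +1 on the first half of the window, -1 on the second."""
    half = width // 2
    return [1.0] * half + [-1.0] * half

def scan_signal(signal, widths):
    """For every scale, slide the wavelet along the signal and record the coefficient C."""
    table = {}
    for width in widths:                                  # Steps 4 and 5: next scale
        wavelet = haar_shape(width)
        norm = width ** 0.5                               # 1/sqrt(a) factor of Equation (2)
        row = []
        for start in range(len(signal) - width + 1):      # Step 3: shift right by one sample
            window = signal[start:start + width]          # Step 1: align wavelet and signal
            c = sum(s * w for s, w in zip(window, wavelet)) / norm   # Step 2: coefficient C
            row.append(c)
        table[width] = row
    return table

# Test signal: a step up at sample 6 and a step down at sample 10.
signal = [0.0] * 6 + [1.0] * 4 + [0.0] * 6
table = scan_signal(signal, widths=[2, 4, 8])

row = table[4]
best_start = max(range(len(row)), key=row.__getitem__)
print(best_start, row[best_start])
```

At each scale, the largest coefficient appears where the window best matches the shape of the wavelet, here the downward step of the test signal.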

The selection of the mother wavelet type and decomposition level are the two most important problems in wavelet analysis. In this study, the db5 wavelet was used to decompose the experimental sequence for the following two reasons:


Because the Jinjiang water quality data have obvious smoothness characteristics, the db5 wavelet was the most suitable choice for this study.

The maximum wavelet decomposition level can be calculated by Equation (5):

$$L = \ln(n\_d/(lw - 1))\tag{5}$$

where *lw* is the length of the wavelet decomposition low-pass filter and *n<sub>d</sub>* is the data length. In this study, *lw* = 23 and *n<sub>d</sub>* = 443, which gives *L* = ln(443/22) ≈ 3.00, so the number of wavelet decomposition layers was 3.
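The arithmetic of Equation (5) can be checked directly. In the sketch below, the function name is hypothetical, and truncating the result to a whole number of levels is an assumption made here for illustration.

```python
import math

def max_decomposition_level(data_length, filter_length):
    """Equation (5): L = ln(n_d / (lw - 1)); also returns L truncated to an integer (assumption)."""
    raw = math.log(data_length / (filter_length - 1))
    return raw, int(raw)

raw, level = max_decomposition_level(data_length=443, filter_length=23)
print(raw, level)
```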
