3.1.2. Exception Data Handling

Field monitoring samples inevitably contain some anomalous data, which include outliers and missing values. Outliers are generally identified on statistical grounds. In this study, the Grubbs criterion is applied to test the data and reject outliers one at a time at a significance level of 0.05. The Grubbs criterion gives more rigorous results [23,24] and is particularly suitable for this application, with sample sizes of 25 to 185 [25]. Taking the screened shedding unit data *X<sub>c</sub>* = [*x*(*t<sub>cs</sub>*), *x*(*t<sub>cs</sub>*+*T*), ..., *x*(*t<sub>cs</sub>*+*iT*), ..., *x*(*t<sub>cs</sub>*+*pT*)] as an example, the anomalous data are handled as follows.

(1) Calculate the sample mean and the standard deviation (by the Bessel formula), as given in Formula (11).

$$\begin{cases} \overline{\mathbf{x}} = \frac{1}{p} \sum\_{i=1}^{p} \mathbf{x}(t\_{cs} + iT) \\\ \sigma = \sqrt{\frac{1}{p-1} \sum\_{i=1}^{p} \left( \mathbf{x}(t\_{cs} + iT) - \overline{\mathbf{x}} \right)^{2}} \end{cases} \tag{11}$$

(2) Sort the samples in ascending order, {*x*(*t*)<sup>1</sup>, *x*(*t*)<sup>2</sup>, ..., *x*(*t*)<sup>*p*</sup>}, where *x*(*t*)<sup>1</sup> ≤ *x*(*t*)<sup>2</sup> ≤ ... ≤ *x*(*t*)<sup>*p*</sup>. The residual error *v<sub>i</sub>* of the suspected outlier is calculated by Formula (12).

$$v\_i = \max\left\{ \left|\mathbf{x}(t)^{1} - \overline{\mathbf{x}}\right|, \left|\mathbf{x}(t)^{p} - \overline{\mathbf{x}}\right| \right\} \tag{12}$$

(3) Compare the suspected residual error *v<sub>i</sub>* with the Grubbs critical value *G*<sub>0</sub>(*n*, *κ*) for sample size *n* and significance level *κ*. If Formula (13) is satisfied, the sample is judged to be an outlier and removed from the data sequence. The remaining sequence is then retested until no outliers remain.

$$\frac{v\_i}{\sigma} > G\_0(n, \kappa = 0.05) \tag{13}$$

(4) Fill the rejected outliers and the missing values by the nearest-neighbor method [26], that is, with the mean of the two adjacent samples, as shown in Formula (14).

$$\mathbf{x}\_{i} = \frac{1}{2}\mathbf{x}\_{i-1} + \frac{1}{2}\mathbf{x}\_{i+1} \tag{14}$$
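
Steps (1) to (4) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this study. The function names `grubbs_reject` and `fill_gaps` are hypothetical, missing values are represented by `None`, and the critical value *G*<sub>0</sub>(*n*, 0.05) is supplied by the caller as a function of the current sample size, for example from a published Grubbs table. The handling of gaps at the ends of the series and of consecutive gaps is also an assumption, since Formula (14) covers only the case where both adjacent samples exist.

```python
from statistics import mean, stdev
from typing import Callable, List, Optional

Series = List[Optional[float]]


def grubbs_reject(samples: Series, critical_value: Callable[[int], float]) -> Series:
    """Mark Grubbs outliers as None, one at a time (steps 1 to 3).

    critical_value(n) is a caller-supplied stand-in for G0(n, 0.05).
    """
    cleaned = list(samples)
    while True:
        valid = [i for i, x in enumerate(cleaned) if x is not None]
        if len(valid) < 3:  # too few samples to test (assumed cutoff)
            break
        values = [cleaned[i] for i in valid]
        x_bar = mean(values)
        sigma = stdev(values)  # Bessel-corrected, Formula (11)
        if sigma == 0:
            break
        # The sample farthest from the mean is always the minimum or the
        # maximum, so this matches the residual of Formula (12).
        suspect = max(valid, key=lambda i: abs(cleaned[i] - x_bar))
        v = abs(cleaned[suspect] - x_bar)
        if v / sigma > critical_value(len(values)):  # Formula (13)
            cleaned[suspect] = None
        else:
            break
    return cleaned


def fill_gaps(samples: Series) -> Series:
    """Fill None entries with the mean of the neighboring samples (step 4)."""
    filled = list(samples)
    n = len(samples)
    for i, x in enumerate(samples):
        if x is not None:
            continue
        # Assumption: use the nearest valid sample on each side.
        left = next((samples[j] for j in range(i - 1, -1, -1)
                     if samples[j] is not None), None)
        right = next((samples[j] for j in range(i + 1, n)
                      if samples[j] is not None), None)
        if left is not None and right is not None:
            filled[i] = 0.5 * left + 0.5 * right  # Formula (14)
        elif left is not None or right is not None:
            # Assumption: at a boundary, copy the single available neighbor.
            filled[i] = left if left is not None else right
    return filled
```

A call such as `fill_gaps(grubbs_reject(data, g0))` then returns a series of the original length. Here `g0` is the caller's lookup for the critical value, and the data are retested after each rejection because both the mean and the standard deviation change once a sample is removed.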
