#### 3.1.1. Weight Estimation

The main idea of weight estimation is that a small weight is given to the measurement with a large residual, which in turn will reduce the influence of the outlier. Therefore, it is crucial to establish the relationship between weights and residuals.

However, the residuals of the true model cannot be obtained directly, so the weights are difficult to determine. In this work, the AE source is first located by the LLS method with equal weights assigned to all measurements. The residuals of the fitting model are then obtained and used in place of the residuals of the true model:

$$
\Delta\_i = b\_i - A\_i \theta, \quad i = 1, 2, \cdots, n \tag{6}
$$

where Δ*<sub>i</sub>* is the *i*th element of the residual vector of the fitting model.

After the residuals are obtained, the weights are estimated by relating them to the residuals. The simplest approach is to use the reciprocal of each residual directly as the weight. However, when a residual tends to zero, the corresponding measurement is assigned an unreasonably high weight, which results in overfitting. The weight should therefore have an upper limit, which is realized by imposing a lower limit *λ<sub>m</sub>* on the residual used in the reciprocal. In addition, residuals exceeding the threshold *λ<sub>L</sub>* are considered outliers; the corresponding measurements are also regarded as outliers and are filtered out by setting their weights to zero. The weight estimation can therefore be expressed as follows [52]:

$$w\_i = \begin{cases} 1/\lambda\_m, & \text{if } |\Delta\_i| \le \lambda\_m \\ 1/|\Delta\_i|, & \text{if } \lambda\_m < |\Delta\_i| < \lambda\_L \\ 0, & \text{else} \end{cases} \tag{7}$$
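The piecewise rule in Equation (7) can be illustrated with a short Python sketch. The function name `estimate_weights` and the sample values are hypothetical, and the middle branch uses the absolute residual so that the weights stay non-negative, which is an assumption of this sketch.

```python
def estimate_weights(residuals, lam_m, lam_L):
    """Piecewise weights in the spirit of Equation (7) (illustrative sketch)."""
    weights = []
    for r in residuals:
        a = abs(r)
        if a <= lam_m:
            weights.append(1.0 / lam_m)  # minor residual: capped, equal weight
        elif a < lam_L:
            weights.append(1.0 / a)      # medium residual: weight falls as |residual| grows
        else:
            weights.append(0.0)          # outlier: measurement is filtered out
    return weights


# Hypothetical residuals and thresholds, chosen only to hit all three branches.
w = estimate_weights([0.1, -0.5, 2.0, -4.0], lam_m=0.5, lam_L=3.0)
print(w)  # [2.0, 2.0, 0.5, 0.0]
```

The cap 1/*λ<sub>m</sub>* keeps a near-zero residual from dominating the fit, while the zero branch removes a measurement entirely.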

To determine the specific parameters of the weight estimation, the residuals are assumed to follow a normal distribution. The thresholds *λ<sub>m</sub>* and *λ<sub>L</sub>* in Equation (7) can then be defined by:

$$
\lambda\_m = k\_m \sigma
\tag{8}
$$

$$
\lambda\_L = k\_L \sigma \tag{9}
$$

where *k<sub>m</sub>* and *k<sub>L</sub>* are the correlation factors, and *σ* is the estimated standard deviation of the residuals, given by:

$$\sigma = \sqrt{\frac{1}{n-m} \cdot \frac{\sum\_{i=1}^{n} w\_i (b\_i - A\_i \theta)^2}{\frac{1}{n} \sum\_{i=1}^{n} w\_i}} \tag{10}$$

where *m* is the number of unknown parameters. The denominator (1/*n*) ∑<sub>*i*=1</sub><sup>*n*</sup> *w<sub>i</sub>* is used to standardize the weights.
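As a rough illustration of Equation (10), the sketch below computes the weighted standard deviation and shows that rescaling all weights by a constant leaves *σ* unchanged, which is the effect of the standardizing denominator. The function name `weighted_sigma` and the sample numbers are hypothetical; the case in which all weights are zero is not handled.

```python
import math


def weighted_sigma(residuals, weights, m):
    """Weighted standard deviation of residuals, following Equation (10)."""
    n = len(residuals)
    weighted_ss = sum(w * r * r for w, r in zip(weights, residuals))
    mean_weight = sum(weights) / n  # standardizes the weights
    return math.sqrt(weighted_ss / ((n - m) * mean_weight))


residuals = [1.0, -1.0, 2.0, -2.0]  # hypothetical residuals
sigma_a = weighted_sigma(residuals, [1.0] * 4, m=2)
sigma_b = weighted_sigma(residuals, [3.0] * 4, m=2)  # same weights, rescaled
print(sigma_a, sigma_b)  # both equal sqrt(5), about 2.236
```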

Figure 1 shows the regional division of residuals according to the normal distribution. The parameters *k<sub>m</sub>* and *k<sub>L</sub>* divide the residuals into three regions: minor residuals, medium residuals, and outlier residuals. Accordingly, the weights derived from these residuals fall into three parts: an equal weight part, a variable weight part, and a zero weight part, as shown in Figure 2. The first part has equal and highest weights, on the consideration that at least half of the measurement data are accurate. Therefore, *k<sub>m</sub>* = 0.765 is selected, so that 50% of the normal residuals fall in the equal weight area. To avoid unbalanced weighting when there are many minor residuals, *λ<sub>m</sub>* is given a lower bound of 0.05 max(|Δ*<sub>i</sub>*|). The weights of the second part vary with the corresponding residuals according to Equation (7). The third part is the zero weight region: any of the *n* residuals that lie outside the probability band (|Δ| ≤ *k<sub>L</sub>σ*), i.e., those with |Δ| > *k<sub>L</sub>σ*, are considered outliers. The corresponding measurements are regarded as outlier measurements and are excluded from subsequent calculations by setting their weights to zero.

**Figure 1.** Residual division according to the normal distribution. In this figure, PDF is the normal probability density function and CDF means the cumulative distribution function.

**Figure 2.** Change of weights with increasing absolute residuals, and the probability that the absolute residual is less than *kσ* (|Δ| < *kσ*).

To determine the parameter *k<sub>L</sub>*, half of one residual is expected to lie in the third region (i.e., outside the probability band |Δ| ≤ *k<sub>L</sub>σ*). Therefore, the probability band (|Δ| ≤ *k<sub>L</sub>σ*) should account for *n* − 1/2 of the *n* residuals [53]. The corresponding probability *P*<sub>in</sub>, equal to (*n* − 1/2)/*n*, can be expressed as:

$$P\_{\rm in} = 1 - \frac{1}{2n} \tag{11}$$

where *P*<sub>in</sub> is the probability that a residual falls inside the probability band, and *n* is the number of residuals.

The probability *P*<sub>in</sub> that a residual falls in the interval from −*k<sub>L</sub>σ* to *k<sub>L</sub>σ* is obtained by integrating the probability density function:

$$\begin{split} P\_{\text{in}} &= \int\_{-k\_L \sigma}^{k\_L \sigma} \frac{1}{\sqrt{2\pi} \sigma} e^{-\frac{\left(\Delta - \mu\right)^2}{2\sigma^2}} d\Delta \\ &= 2 \int\_0^{k\_L \sigma} \frac{1}{\sqrt{2\pi} \sigma} e^{-\frac{\left(\Delta - \mu\right)^2}{2\sigma^2}} d\Delta \\ &= 2 \left[ \frac{1}{\sqrt{2\pi} \sigma} \sqrt{\frac{\pi}{2}} \sigma \cdot \text{erf}\left(\frac{\Delta - \mu}{\sqrt{2} \sigma}\right) \right]\_0^{k\_L \sigma} \\ &= \text{erf}\left(\frac{k\_L}{\sqrt{2}}\right) \end{split} \tag{12}$$

where *erf* is the error function, erf(*x*) = (2/√π) ∫<sub>0</sub><sup>*x*</sup> *e*<sup>−*t*²</sup> d*t*, and the mean *μ* of the residuals is taken as zero.

Therefore, the threshold parameter *k<sub>L</sub>*, which depends only on *n*, can be obtained from Equations (11) and (12):

$$k\_L = \sqrt{2} \cdot \text{erfinv}\left(1 - \frac{1}{2n}\right) \tag{13}$$

where *erfinv* is the inverse error function corresponding to the *erf* function.
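Equation (13) can be evaluated without a dedicated inverse error function, because √2 · erfinv(*p*) equals the standard normal quantile at (1 + *p*)/2. The following sketch uses that identity with Python's standard library; the function name `outlier_factor` is hypothetical.

```python
import math
from statistics import NormalDist


def outlier_factor(n):
    """k_L of Equation (13) for n residuals (illustrative sketch)."""
    p_in = 1.0 - 1.0 / (2.0 * n)  # Equation (11)
    # sqrt(2) * erfinv(p) is the standard normal quantile at (1 + p) / 2.
    return NormalDist().inv_cdf((1.0 + p_in) / 2.0)


for n in (6, 10, 20):
    k_L = outlier_factor(n)
    # Second column checks Equation (12): erf(k_L / sqrt(2)) recovers P_in.
    print(n, round(k_L, 3), round(math.erf(k_L / math.sqrt(2.0)), 4))
```

For *n* = 10 the band probability is 0.95, so *k<sub>L</sub>* is about 1.96; the threshold widens as *n* grows, so larger data sets tolerate larger residuals before a measurement is rejected.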

#### 3.1.2. Iteration between Weight Estimation and Source Location

Preliminary weights can be obtained after the initial location with equal weights, but they may not be optimal. First, the location with equal weights is likely to be inaccurate or even incorrect, and the standard deviation *σ* of the residuals tends to be large, especially when outliers exist. In this case, it is difficult to distinguish the outliers and minor residuals among the observations, because both thresholds *λ<sub>m</sub>* and *λ<sub>L</sub>* are high. Second, the residual Δ*<sub>i</sub>* does not reflect the deviation from the true model; it only describes the deviation from the fitting model. Both the residuals and the estimated weights will be incorrect if the location result of the fitting model deviates greatly from the true source. Therefore, further iterations between weight estimation and source location are required [54].

During the iterations, the source location at each step requires solving a weighted least-squares problem:

$$\theta = \arg\min \sum\_{i=1}^{n} w\_i (b\_i - A\_i \theta)^2 = \arg\min \sum\_{i=1}^{n} w\_i \Delta\_i^2 = \left(A^T W A\right)^{-1} A^T W b \tag{14}$$

where *W* is the diagonal matrix of weights, whose diagonal elements *w<sub>i</sub>* are updated according to Equation (7).

After several iterations, the optimal weight estimation is obtained, and all of the outliers are identified and filtered.
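The alternation between weight estimation and weighted least squares can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the names `solve_wls` and `iterate_location` are hypothetical, a fixed number of iterations stands in for a stopping rule that the text does not specify, the 0.05 max(|Δ*<sub>i</sub>*|) lower bound on *λ<sub>m</sub>* is omitted for brevity, and *A* and *b* are a toy straight-line model with one gross outlier rather than the AE location system. The normal equations are assumed to stay non-singular.

```python
import math
from statistics import NormalDist


def solve_wls(A, b, w):
    """Solve (A^T W A) theta = A^T W b by Gauss-Jordan elimination."""
    m = len(A[0])
    M = []  # augmented normal equations [A^T W A | A^T W b]
    for j in range(m):
        row = [sum(wi * r[j] * r[k] for wi, r in zip(w, A)) for k in range(m)]
        row.append(sum(wi * r[j] * bi for wi, r, bi in zip(w, A, b)))
        M.append(row)
    for c in range(m):
        p = max(range(c, m), key=lambda i: abs(M[i][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for i in range(m):
            if i != c:
                f = M[i][c] / M[c][c]
                M[i] = [u - f * v for u, v in zip(M[i], M[c])]
    return [M[j][m] / M[j][j] for j in range(m)]


def iterate_location(A, b, n_iter=3, k_m=0.765):
    """Alternate weight estimation (Eq. 7) and weighted least squares (Eq. 14)."""
    n, m = len(A), len(A[0])
    k_L = NormalDist().inv_cdf(1.0 - 1.0 / (4.0 * n))  # same value as Eq. (13)
    w = [1.0] * n  # equal weights for the first solution
    theta = solve_wls(A, b, w)
    for _ in range(n_iter):
        res = [bi - sum(a * t for a, t in zip(row, theta))
               for row, bi in zip(A, b)]
        # Eq. (10): weighted standard deviation of the residuals.
        sigma = math.sqrt(sum(wi * r * r for wi, r in zip(w, res))
                          / ((n - m) * sum(w) / n))
        if sigma == 0.0:  # perfect fit: nothing left to reweight
            break
        lam_m, lam_L = k_m * sigma, k_L * sigma  # Eqs. (8) and (9)
        w = [1.0 / lam_m if abs(r) <= lam_m
             else 1.0 / abs(r) if abs(r) < lam_L
             else 0.0
             for r in res]
        theta = solve_wls(A, b, w)
    return theta, w


# Toy data: b = 2 + 3x with small noise and one gross outlier at index 3.
x = list(range(8))
noise = [0.05, -0.04, 0.03, 0.0, -0.05, 0.04, -0.03, 0.02]
A = [[1.0, float(xi)] for xi in x]
b = [2.0 + 3.0 * xi + e for xi, e in zip(x, noise)]
b[3] += 10.0
theta, w = iterate_location(A, b)
print(theta)  # close to [2, 3]
print(w)      # the weight at index 3 is zero
```

In this toy case the equal-weight solution is pulled toward the outlier, but its residual already exceeds *λ<sub>L</sub>* after the first pass, so later iterations fit the remaining measurements with much tighter thresholds.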
