*2.4. Artificial Neural Network (ANN)*

During the collection of time-series data, acquisition errors, storage failures, and human error can cause single or multiple attributes, or entire records, to be lost from the final dataset. Such data are called missing data. Missing or incomplete data create many difficulties for data mining, bias the analysis results, and can mislead users' decisions, with adverse consequences. Therefore, filling missing data under appropriate conditions is of great significance for macro data mining in big data scenarios. Several approaches exist for handling missing data, such as the deletion method [49,50], missing-value filling based on a statistical model [51], and methods based on parameter estimation. The latter first identify the missingness mechanism of the missing values and then establish a specific model to estimate them; because these methods are flexible in application and can be applied to datasets with a large number of missing values, they are widely used [52]. Common methods include the expectation maximization method, the multiple imputation method [53–57], the maximum likelihood estimation method, etc. Austin et al. used multiple imputation to estimate missing values in clinical medicine [58]. Chang et al. developed a communication-efficient distributed multiple imputation method to estimate missing data in distributed health data networks (DHDNs) [59].
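As a minimal illustration of the contrast between the deletion method and a simple statistical fill, the following Python sketch drops incomplete records and, alternatively, replaces gaps with the column mean. The toy series is hypothetical, not the study's data:

```python
# Contrast two basic strategies from the text on a toy time series:
# listwise deletion vs. a simple statistical fill (column mean).
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "t":  [1, 2, 3, 4, 5],
    "do": [8.1, np.nan, 7.6, np.nan, 7.9],  # a variable with gaps
})

deleted = df.dropna()                  # deletion method: discard rows with gaps
filled = df.fillna(df["do"].mean())    # statistical fill: replace gaps with the mean

print(len(deleted))          # number of complete records remaining
print(filled["do"].iloc[1])  # the gap is now the mean of the observed values
```

Deletion shrinks the dataset (here from 5 to 3 records), which is why model-based estimation is preferred when many values are missing.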

In summary, research on interpolation methods for missing values in time series has received increasing attention from scholars in various fields. Although some scholars have considered the correlation characteristics of time series, most studies have not quantified the correlation between the observed quantities and remain based on traditional interpolation or regression analysis methods. Moreover, some traditional models, such as piecewise linear interpolation [60], cannot estimate missing values well [61,62]. With the development of machine learning, researchers have gradually applied machine learning algorithms to missing-value filling, which can, to some extent, handle the non-linearity that traditional methods cannot. Machine learning methods for missing-value estimation include the KNN method [63], artificial neural networks, etc.
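As an illustrative sketch of the KNN method [63] mentioned above, the following example uses scikit-learn's `KNNImputer` on toy data; the data and neighbor count are assumptions for illustration only:

```python
# KNN-based missing-value estimation: a gap is filled with the mean of
# the corresponding attribute in the nearest complete records.
import numpy as np
from sklearn.impute import KNNImputer

X = np.array([
    [1.0, 2.0],
    [2.0, np.nan],   # missing attribute to be estimated
    [3.0, 4.0],
    [4.0, 5.0],
])

imputer = KNNImputer(n_neighbors=2)   # use the 2 nearest neighbors
X_filled = imputer.fit_transform(X)

print(X_filled[1, 1])  # mean of the 2 nearest neighbors' second attribute
```

Distances are computed on the observed attributes only, so records with gaps can still be compared with complete ones.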

The artificial neural network (ANN) is a classical, fundamental technique in machine learning. Compared with general multi-factor prediction methods, it offers high fault tolerance, high reliability, and fast prediction speed. In addition, the ANN is a powerful interpolation tool [64–66]. An artificial neural network is usually a multilayer network with at least three layers, comprising a three-layer structure of input, hidden, and output layers, as shown in Figure 4.

**Figure 4.** Topology of neural network structure.

The relationship between the input *xi* and output *yi* of a neuron is *yi* = *f*(*neti*), where *neti* = *XW* is the net activation, *X* = [*x*0, *x*1, ···, *xn*] is the input vector, *W* = [*wi*0, *wi*1, ···, *win*]^T is the weight vector, and *f*(·) is the activation function, which maps the net activation to the output. Commonly used activation functions include the linear function *y* = *kx* + *c*, the sigmoid function *y* = 1/(1 + *e*^(−*ax*)), and the bipolar sigmoid function *y* = 2/(1 + *e*^(−*ax*)) − 1.
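The neuron equations above can be worked through numerically; the input and weight vectors below are arbitrary illustrative values:

```python
# One neuron: net activation net_i = X·W, output y_i = f(net_i),
# here with the sigmoid activation f(x) = 1 / (1 + e^{-ax}).
import numpy as np

def sigmoid(x, a=1.0):
    """Sigmoid activation f(x) = 1 / (1 + e^{-ax})."""
    return 1.0 / (1.0 + np.exp(-a * x))

X = np.array([1.0, 0.5, -0.5])   # input vector [x0, x1, x2]
W = np.array([0.4, -0.2, 0.1])   # weight vector [w_i0, w_i1, w_i2]

net = X @ W        # net activation net_i = XW
y = sigmoid(net)   # neuron output y_i = f(net_i)
print(net, y)
```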

A neural network has two states: a learning state and a working state. In the learning state, the weights of the network are adjusted so that its output approaches the actual values; in the working state, the trained network is used for classification and prediction without changing its weights. The network learns in a supervised (teacher-guided) manner: the weights are adjusted according to the difference between the actual output and the expected output of the network, so that the model fits the data as accurately as possible.
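The weight adjustment described above can be sketched as a single delta-rule (gradient-descent) step for one sigmoid neuron; the learning rate, data, and target value are illustrative assumptions:

```python
# Learning state: adjust W from the difference between the actual and
# expected output; working state: recompute the output with fixed W.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

X = np.array([1.0, 0.5])    # input vector (assumed)
W = np.array([0.2, -0.3])   # initial weights (assumed)
target = 1.0                # expected output
lr = 0.5                    # learning rate (assumed)

y = sigmoid(X @ W)                      # actual output before learning
error = target - y                      # difference driving the adjustment
W = W + lr * error * y * (1 - y) * X    # delta-rule weight update

y_new = sigmoid(X @ W)                  # output after one learning step
print(y, y_new)
```

After the update, the output moves closer to the expected value, which is exactly the adaptation the learning state performs repeatedly.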

In this study, an MLP neural network was used to estimate the missing values in the water quality data of the Jinjiang River. The single-layer perceptron is the simplest neural network, consisting of an input layer and an output layer that are directly connected. The MLP neural network contains an input layer, an output layer, and several hidden layers, and is a multi-layer feed-forward neural network trained with the BP algorithm; the activation function of its output layer is the identity (linear) function. The input signal passes forward from the input layer to the hidden layers, where the neurons process it computationally and pass it on to the output layer. Because this is a purely forward transmission process, the output of the MLP neural network depends only on the current input, not on past or future inputs; for this reason, the MLP is also known as a multi-layer feed-forward neural network. Among the many neural network architectures, the MLP is simple in structure, easy to implement, and has good fault tolerance, robustness, and excellent nonlinear mapping capability (Figure 5).
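A minimal sketch of this kind of MLP-based gap filling, assuming a synthetic covariate–target relationship and illustrative hyperparameters (not the study's actual data or configuration), could look like:

```python
# Train an MLP on complete records to predict a water-quality variable
# from a co-observed covariate, then use it to fill the gaps.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
temp = rng.uniform(10, 30, 200)                      # observed covariate
do = 12.0 - 0.2 * temp + rng.normal(0, 0.1, 200)     # target variable (synthetic)

mask = rng.random(200) < 0.1                         # ~10% of values missing

# Standardize the input, then fit on the complete records only.
X_all = ((temp - temp.mean()) / temp.std()).reshape(-1, 1)
mlp = MLPRegressor(hidden_layer_sizes=(16, 16), max_iter=2000, random_state=0)
mlp.fit(X_all[~mask], do[~mask])                     # BP training

do_filled = do.copy()
do_filled[mask] = mlp.predict(X_all[mask])           # estimate the missing values
```

In the actual study, the input would be the other co-observed water-quality variables at the same time step, exploiting the feed-forward mapping from current inputs to the current output.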

**Figure 5.** Topology of MLP neural network structure.
