*Article* **Study on the Removal of Iron and Manganese from Groundwater Using Modified Manganese Sand Based on Response Surface Methodology**

**Han Kang \*, Yan Liu, Dan Li and Li Xu**

School of Municipal and Environmental Engineering, Shenyang Jianzhu University, Shenyang 110168, China **\*** Correspondence: hj\_kh@sjzu.edu.cn; Tel.: +86-1389-793-1713

**Abstract:** This study used modified manganese sand as an adsorbent to explore its adsorption effect on iron and manganese ions from groundwater. The effects of pH, manganese sand dosage, and the initial concentration of Fe/Mn on the removal rate of iron and manganese ions were studied through single-factor experiments. Based on the above three factors, a quadratic polynomial model between the adsorption rate and the above factors was established to determine the optimal adsorption conditions. The response surface analysis showed that pH had the most significant effect on the adsorption process. The optimum conditions for the adsorption of iron and manganese ions by modified manganese sand were pH = 7.20, the dosage of manganese sand = 3.54 g/L, and the initial concentration ratio of Fe/Mn = 3.80. The analysis of variance showed that the RSM model could accurately reflect the adsorption process of manganese sand. In addition, we confirmed that the relative error between model predictions and experimental values was close to 1%, proving that the response surface model was reliable. The kinetic data of the manganese sand were described well with the pseudo-second-order model. The isothermal adsorption of iron and manganese ions by modified manganese sand was fitted well using the Langmuir equation.

**Keywords:** modified manganese sand; iron and manganese ions; response surface methodology

#### **1. Introduction**

Iron and manganese are natural components in the crust. High levels of iron and manganese in groundwater are common. In China, 20% of groundwater resources have excess iron and manganese [1]. In water supply networks, iron and manganese in tap water are oxidized to high valence during disinfection. The oxide precipitation formed in the pipeline is easily adsorbed into the water supply network, affecting the quality of the drinking water supply [2]. Although iron and manganese are necessary trace elements for the human body, drinking high-iron and -manganese surface water or groundwater for a long time will lead to chronic poisoning and damage to human health [3–5]. Physiologically, a large amount of iron ingested by the human body cannot be removed through metabolism. Excessive iron accumulation will induce diabetes, skin diseases, and other diseases, while excessive manganese can cause pathological changes in human organs and even cause neurotoxicity [6–8].

The coexistence of iron and manganese is common in groundwater. In recent years, how to efficiently and stably remove iron and manganese has become the focus of research. Iron usually exists in a soluble ferrous (Fe(II)) state. The most stable oxidation state of manganese is +2 valence [9,10]. The traditional removal methods of Fe(II) and Mn(II) in groundwater include natural oxidation, biological, and adsorption [11–14]. The natural oxidation method is to oxidize Fe2+ to Fe3+ via aeration and then generate Fe(OH)3 precipitation. The removal of manganese requires adding alkali based on aeration to improve pH. The procedure flow of this method is complex, and the high pH in effluent needs acidification treatment, which increases the treatment cost and management difficulty.

**Citation:** Kang, H.; Liu, Y.; Li, D.; Xu, L. Study on the Removal of Iron and Manganese from Groundwater Using Modified Manganese Sand Based on Response Surface Methodology. *Appl. Sci.* **2022**, *12*, 11798. https://doi.org/ 10.3390/app122211798

Academic Editors: Xin Zhao, Lili Dong and Zhaoyang Wang

Received: 24 October 2022 Accepted: 17 November 2022 Published: 20 November 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

The biological method mainly depends on microorganisms to reduce iron and manganese concentration in groundwater. However, the metabolism of microorganisms is influenced by the oxygen content, and insufficient oxygen slows down the metabolism of microorganisms [15]. Although oxidation and biological methods are widely used, their disadvantages are unstable effluent quality and difficulty to control reaction conditions. In contrast, the adsorption method has the advantages of large capacity, less energy consumption, and less pollution [16,17]. It is widely used in removing iron and manganese from groundwater and is considered one of the most effective methods to remove iron and manganese [18,19]. Manganese sand is the most widely used adsorbent for water adsorption of iron and manganese ions [20].

This study aimed to determine the best conditions for the adsorption effect of modified manganese sand. Response surface methodology can effectively optimize the optimal process parameters and evaluate the interactions between various influencing factors [21,22]. We established a response surface model (RSM) based on single-factor experiments to determine the optimal adsorption factors. The model helped analyze various factors affecting the experiment in a limited number of experiments. Meanwhile, we analyzed the interaction between the factors and proved the rationality of the response surface model through experiments.

#### **2. Materials and Methods**

#### *2.1. Experiment Material and Equipment*

A standard solution of iron and manganese ions, hydrochloric acid (Sinopharm Chemical Reagent Co., Ltd., Shanghai, China), NaOH (Tianjin Bodi Chemical Co., Ltd., Tianjin, China), and manganese sand (Shenyang Keer Automation Instrument Co., Ltd., Shenyang, China) were used in this study. MnO2 content was 40%, iron content was 15%, SiO2 content was 18%, MnC2 content was 27%, the solubility of hydrochloric acid was <3.5%, and the particle size was 0.9–1.7 mm. A p611 pH tester (Shanghai Youke Instrument Co., Ltd., Shanghai, China) and a UV-5500 ultraviolet-visible spectrophotometer (Shanghai Yuan Xi Instrument Co., Ltd., Shanghai, China) were used for the analyses.

#### *2.2. Preparation of High-Efficiency Manganese Sand and Determination of Removal Rate*

Studies have shown that Mn2+ is very stable under acidic conditions and difficult to remove through oxidation. Nevertheless, modified manganese sand can reduce the lower limit of pH in the manganese removal. In order to obtain high-efficiency manganese sand, we adopted the impregnation method to modify manganese sand [23]. Untreated manganese sand (0.9–1.7 mm) was cleaned with deionized water to remove the impurities on the surface and then placed in a drying oven at a constant temperature of 100 ◦C. The dried manganese sand was put into a 500 mL beaker and mixed with 5% potassium permanganate solution, and the mixture of manganese sand and the modifier was then heated at 50 ◦C for 16 h in a constant-temperature water bath. Then, it was dried again for 12 h in an oven at 100 ◦C to obtain the modified manganese sand.

To confirm the effect of adsorption, we studied the removal rate of the adsorption process and configured several water samples with different iron and manganese ion concentrations. Briefly, 2 mg/L iron together with the manganese ion water sample and the manganese sand adsorbent were added to a 250 mL conical flask. The pH of the solution was adjusted with 0.1 mol/L NaOH and HCl. After sealing, it was placed in a constant temperature shaker at 25 ◦C with a rotating speed of 120 r/min. For the sampling after adsorption, a 0.45 μm filter membrane was used. The absorbance was measured at 510 nm via o-phenanthroline spectrophotometry (detection limit: 0.03 mg/L; the relative standard deviation of the laboratory was 0.44%). The absorbance was measured at 525 nm through potassium periodate spectrophotometry (detection limit: 0.05 mg/L; the relative standard deviation was 3.94%). The iron and manganese ion removal rate is calculated according to Formula 1.

$$R = \left(\frac{\text{C}\_{\text{O}} - \text{C}\_{\text{e}}}{\text{C}\_{\text{O}}}\right) \times 100\% \tag{1}$$

where C0 is the initial concentration of adsorbate, mg/L. Ce is the final concentration of adsorbate, mg/L.

#### *2.3. Characterization of Properties*

The sample pore size distribution and other parameters were determined using an ASAP 2020 physical adsorption meter manufactured by an American microphone instrument company. Nitrogen adsorption was determined at 77 K and within the range of 10−3~1.0 relative pressure (p/p0), using nitrogen as the adsorbing medium. The sample was degassed for 2 h at 300 ◦C before the test. The micromorphology of manganese sands before and after modification was observed using an S-4800 scanning electron microscope (Hitachi Limited, Tokyo, Japan).

#### *2.4. Single-Factor Experimental Design*

In order to obtain reasonable experimental factors, we designed single-factor experiments, which provided guidance for the design of the response surface experiments. The single-factor experiments were based on three influencing factors: pH (4, 5, 6, 7, 8, 9); manganese sand dosage (1.5, 2, 2.5, 3, 3.5, 4, 4.5 g/L); and initial concentration of Fe/Mn (2, 4, 6, 8, 10). The concentrations of iron and manganese ions were 1 and 0.5 mg/L, 2 and 0.5 mg/L, 3 and 0.5 mg/L, 4 and 0.5 mg/L, and 5 and 0.5 mg/L, respectively. Three groups of parallel tests were set up to detect the concentration of iron and manganese in the water sample and calculate the removal rate. The appropriate value range of each factor was determined.

#### *2.5. Response Surface Experimental Design*

The interaction of pH, the amount of adsorbent, and the initial concentration ratio of iron and manganese ions on the adsorption of manganese sand was studied. The adsorption conditions were optimized within the experimental range, and the experimental design was optimized using the Box–Behnken response surface. Based on the single-factor experiments described in Section 2.4, the range of three input variables was determined: pH (A), manganese sand dosage (B), and the Fe/Mn initial concentration ratio (C). A central composite design was carried out for the adsorption experiment at three levels: low (−1), medium (0), and high (1). There were 17 groups of experiments; each group of experiments was repeated three times, and the average value was taken as the corresponding response value. Table 1 shows the coding values for each level.


**Table 1.** Variables and experimental design levels for the Box–Behnken design.

#### *2.6. Adsorption Kinetic Experiment*

Several solutions containing 2 mg/L iron and manganese ions were prepared. Briefly, 2 g of manganese sand was weighed for each sample and put into 250 mL conical flasks in an incubator. The temperature of the incubator was set to 25 ◦C, and the oscillation intensity was set to 120 r/min to ensure full contact between the modified manganese sand and the water sample. The samples were taken at 15, 30, 60, 120, 240, 360, 480, 600, and 720 min, respectively, to determine the concentration of iron and manganese ions in the filtrate. An adsorption kinetic model of manganese sand was established to explore the mechanism of iron and manganese removal.

#### *2.7. Adsorption Isotherm Experiment*

Briefly, 2 g of manganese sand was weighed in a 250 mL conical flask to prepare iron and manganese solutions with different initial concentrations. The iron and manganese solutions with different initial mass concentrations were poured into the water samples, and the pH value was adjusted to 7.2. The solutions were placed in a 120 r/min constant temperature oscillator, shaken, and adsorbed for 12 h, and then samples were taken to determine the concentration of iron and manganese ions in the filtrate. The Langmuir and Freundlich models were used to explore the mechanism of iron and manganese removal.

#### **3. Results and Discussion**

#### *3.1. SEM Results of Manganese Sand*

The unmodified (a) and modified (b) manganese sand were scanned with a scanning electron microscope. From Figure 1b, it can be seen that the manganese sand modified by potassium permanganate had a loose cluster distribution, and the roughness of the surface increased. There were many different sizes of pores on the surface of manganese sand, which may be one of the reasons for the improvement in its adsorption performance.

**Figure 1.** SEM of unmodified manganese sand (**a**) and modified manganese sand (**b**).

#### *3.2. Surface Area and Porosity Analysis*

Compared with unmodified manganese sands, the specific surface area, the pore volume, and the average pore size of the modified manganese sand samples in Table 2 increased by 26.9%, 41.7%, and 26.9%, respectively. During the modification process, the influence of the potassium permanganate solution on the internal thin layer and impurities of manganese sand samples changed the micropore structure of the manganese sand samples themselves, and through the connection of small pores, intermediate pores were formed. With the increase in the specific surface area and pore diameter, modified manganese sand was more beneficial to the diffusion and adsorption of iron and manganese ions.



The adsorption volume of manganese sand increased with the increase in relative pressure. When the relative pressure was about 0.5 in Figure 2, the curve was divided into two smooth curves. The hysteresis loop occurred at high pressure, and capillary condensation occurred. According to the figure, the nitrogen adsorption–desorption isotherms of manganese sand and modified manganese sand belonged to type IV isotherms in the BDDT

classification system. The adsorption–desorption curve revealed an obvious hysteresis phenomenon, which was the result of nitrogen capillary condensation in the mesopore. The adsorption effect of the modified manganese sand was improved, which showed that the pore expanded after being corroded by potassium permanganate.

**Figure 2.** Nitrogen adsorption–desorption curve.

*3.3. Single-Factor Experimental Results and Analysis*

3.3.1. Effect of pH on the Iron and Manganese Ion Removal

PH was one of the essential factors for iron and manganese ion removal. We studied the effect of the initial pH value on the adsorption process. As shown in Figure 3, the iron and manganese ions' removal rate showed an upward trend with the increase in pH. When the pH rose from 6.0 to 7.0, the iron ion removal rate rapidly increased from 69.8% to 80.4%, and the manganese ion removal rate rapidly increased from 62.3% to 75.5%. However, when the pH was greater than 7, the iron and manganese ion removal rate increased slowly, and the change was not significant.

The hydrated ion radius of H+ in water was much smaller than Fe2+ and Mn2+. Under acidic conditions, a large number of H<sup>+</sup> competed with iron and manganese ions for adsorption sites [24], resulting in the low removal efficiency of iron and manganese. At the same time, when the pH was too low, an iron filter membrane was easily formed on the surface of manganese sand, and iron infiltrated the filter layer and interfered with the formation of the manganese active filter membrane, thus affecting the manganese removal effect [25,26]. However, if the pH was too high, on the one hand, the iron ions in the solution precipitated in the form of hydroxide, which reduced the catalytic capacity, and on the other hand, it would inhibit the production of OH−. The above results showed that when the pH value was close to neutral, the removal rate of iron and manganese ions was significantly improved. Based on the consideration of economic factors and manganese sand adsorption conditions [27], the optimal pH value of manganese sand adsorption was 7.0.

the manganese ion removal rate.

3.3.2. Effect of Manganese Sand Dosage on the Iron and Manganese Ion Removal

It can be seen from Figure 4 that, with the increase in manganese sand consumption, the system had more adsorption sites, and the removal rate of iron and manganese ions gradually increased. When the amount of manganese sand increased from 1.5 g/L to 3.5 g/L, the iron removal rate increased from 61.6% to 84.5%, and the manganese removal rate increased from 61.1% to 74.7%. When the dosage continued to increase from 3.5 to 4.5 g/L, the iron ion removal rate slowly increased from 84.5% to 85.2%, and the manganese ion removal rate slowly increased from 74.7% to 75.8%. This showed that increasing the dosage of manganese sand has little effect on adsorption in this range.

 **Figure 3.** Effect of pH on the iron and manganese ions' removal. R1 is the iron ion removal rate; R2 is With the increase in manganese sand dosage and adsorption sites, the removal rate of iron and manganese ions increased. However, an excessive amount of manganese sand also aggravated the collision between the particles, which was not conducive to the adsorption of ions, and the maximum amount of the adsorbed substances when the unit mass of the adsorbent reached equilibrium was reduced. Considering process costs and the regeneration cost of manganese sand, we considered the amount of manganese sand used in this test to be controlled at about 3.5 g/L. Therefore, the optimal dosage of manganese sand was determined to be 3.5 g/L.

**Figure 4.** Effect of manganese sand dosage on the iron and manganese ion removal. R1 is the iron ion removal rate; R2 is the manganese ion removal rate.

3.3.3. Effect of Initial Concentration of Fe/Mn on the Iron and Manganese Ion Removal

As can be seen from Figure 5, with the increase in the initial concentration ratio of Fe/Mn from 2 to 4, the iron ion removal rate in the solution increased from 78.4% to a maximum of 82.9%. The manganese ion removal rate showed a decreasing trend, and the degree of decrease became obvious after the Fe/Mn ion concentration ratio was greater than 4.

The above phenomenon indicated that when the concentration of iron in the solution was small, the generated iron hydroxide acted as a coagulant and adsorbed iron, so the removal rate of iron was slightly increased. With the increase in the iron ion concentration, the system had a dynamic adsorption equilibrium. At the same time, there was a competitive adsorption relationship between iron and manganese ions. Excess iron ions dissociated in the solution and affected the removal of manganese ions. Therefore, a higher Fe/Mn concentration ratio was not favorable for the removal of Fe and Mn ions. Based on the above analysis, the reasonable initial Fe/Mn concentration ratio in this experiment was about 4.

**Figure 5.** Effect of the initial concentration ratio of iron and manganese ions on the iron and manganese ions' removal. R1 is the iron ion removal rate; R2 is the manganese ion removal rate.

#### *3.4. Analysis of Response Surface Experimental Results* Experimental Results

Based on the single-factor experimental results, 17 groups of comparative experiments were designed by using the response surface method. The experimental results are shown in Table 3.

Design-Expert 10.0 analysis was performed, and the experimental results are shown in Table 4. The quadratic polynomial regression model for the interaction between pH (A), manganese sand dosage (B), and the Fe/Mn initial concentration ratio (C) was obtained as follows:

$$\text{Y}\_{1} = 83.14 + 1.49\text{A} - 0.057\text{B} - 0.65\text{C} + 0.62\text{AB} - 0.41\text{AC} + 0.16\text{BC} - 1.42\text{A}^2 - 0.97\text{B}^2 - 1.18\text{C}^2 \tag{2}$$

$$\mathbf{Y}\_{2} = 76.04 + 1.36\mathbf{A} + 1.15\mathbf{B} + 0.13\mathbf{C} - 0.75\mathbf{A}\mathbf{B} - 0.70\mathbf{A}\mathbf{C} - 2.39\mathbf{B}\mathbf{C} - 3.16\mathbf{A}^{2} - 3.53\mathbf{B}^{2} - 2.89\mathbf{C}^{2} \tag{3}$$

where Y1 is the predicted Fe removal efficiency (%), Y2 is the predicted Mn removal efficiency (%), A is the pH value, B is the manganese sand dosage, and C is the initial concentration ratio of Fe/Mn.


**Table 3.** Experimental design and results.

The analysis of variance is an essential statistical method to test the significance and suitability of regression models. The results of the analysis of variance for the regression models of Y1 and Y2 obtained from the response surface are shown in Table 4. The F-value was used for statistical saliency detection, and the *p*-value was used for each regression coefficient. If the *p*-value in the model was smaller, the experimental results were more significant. In this experiment, the *p*-values of the two models were <0.01, indicating that the model had high significance and was suitable for the optimization study of this parameter. Similarly, the F-value in Table 4 showed that the quadratic model had certain sufficiency and significance. The *p*-values of the lack of fit for both models were greater than 0.05, showing that the model was consistent in the regression study. As can be seen from the data in Table 4, the correlation coefficients of the predicted value of the model's removal rate and the experimental value were 0.9972 and 0.9747, respectively. This could better reflect the removal effect of iron and manganese ions. The values of C.V. were 0.16 < 10% and 1.20 < 10%, respectively, proving that the experiment had good accuracy and reliability. In the model, the values of Adeq precision referred to the effective signal-to-noise ratio and were considered reasonable if they were greater than 4. The Adeq precision values of the two models, as shown in the table, were 45.006 > 4 and 15.392 > 4, which means the model had high accuracy [28]. In conclusion, the results of this analysis revealed that the model could replace the real point of the test to analyze the results.

**Table 4.** Analysis of variance of regression models.


Figure 6 illustrates the correlation between the predicted value and the actual value of the removal efficiency in the response surface model. The experimental data points in the figure were more distributed on the straight line or on both sides of the straight line, indicating a good fit of the model. The summary of the fitted output report showed that the quadratic response model was suitable for explaining the relationship between the pollutant variables, so it could be used to analyze and optimize the effect of iron and manganese ions in water on the adsorption system.

**Figure 6.** Residual plots of Fe and Mn ions (**a**) and plots of predicted vs. actual values (**b**).

#### *3.5. Analysis of Interaction between Factors*

The response surface method overcomes the problem with orthogonal experiments, namely that they cannot give intuitive graphics. According to the fitted quadratic equation model, the response surfaces between different test factors can be plotted. Two-dimensional and three-dimensional response surface maps can better explain independent variables and interaction effects [29]. The shape of the contour can reflect the size of the interaction. A circle indicates that the interaction between the two factors is not apparent, and an ellipse indicates that the interaction is obvious. In other words, the greater the flattening degree of the ellipse, the more significant the interaction between the two factors [30]. Using this method, we analyzed and evaluated the effect of manganese sand on ion removal according to the interaction between any two factors. An additional factor was controlled at the intermediate level when discussing the influence of the interaction on the removal rate. The 3D surface plots of the independent and dependent variables are illustrated in Figures 7 and 8.

#### 3.5.1. Interaction of Three Factors in Iron Ion Removal

It can be seen from Figure 7a–c that when the pH was 7.1–7.3, the dosage of manganese sand was 3.4–3.6 g/L; that is, the initial concentration ratio of iron and manganese ions was between 3.5 and 4, the 3D surface had the deepest chromaticity, and the iron ion removal effect was better. It can be seen from the figure that the removal rate of iron ions could be significantly improved by increasing pH, which indicated that the change in pH significantly affected the whole adsorption system during the process of the adsorption of iron ions by manganese sand. The figure shows that with the increase in pH, the dosage of manganese sand, or the concentration of iron and manganese ions, the degradation rate of iron ions first increased and then decreased. Therefore, there was an optimal matching value between the different influencing factors, and appropriate experimental conditions can maximize the economic benefits of the model and effectively improve the iron ion removal rate.

Figure 7a shows the interaction between the pH value and the manganese sand dosage and their effect on the iron ion removal rate when the initial concentration ratio of iron to manganese was 4. It can be seen from the figure that the two-dimensional contour line was oval, indicating that the interaction between the pH and the dosage was significant. Figure 7b shows the interaction between the pH and the initial concentration ratio of iron and manganese when the dosage was fixed at 3.5 g/L. The steepness of the response surface in the figure was not as significant as the above combination, indicating that the interaction between the pH value and the initial concentration ratio of iron and manganese was not as significant as the interaction between the above pH value and the dosage of manganese sand. The slope of the pH in the response surface graph was steep, indicating that the pH value had a greater impact on the iron removal rate than these two factors. The interaction between the manganese sand dosage and the initial concentration ratio of iron and manganese is shown in Figure 7c. The steepness of the response surface in the figure was not obvious, indicating that the interaction between the dosage of manganese sand and the initial concentration ratio of iron and manganese was not significant.

**Figure 7.** The 3D response surface diagram of manganese sand dosage and pH (**a**), initial concentration ratio of Fe/Mn and pH (**b**), initial concentration ratio of Fe/Mn and manganese sand dosage (**c**) influence iron ion removal.

**Figure 8.** The 3D response surface diagram of manganese sand dosage and pH (**a**), initial concentration ratio of Fe/Mn and pH (**b**), initial concentration ratio of Fe/Mn and manganese sand dosage (**c**) influence manganese ion removal.

3.5.2. Interaction of the Three Factors in Manganese Removal

The insignificant ellipticity of the contours in Figure 8a indicated that the interaction between the pH and the dosage had no significant effect on the manganese ion removal rate. When the initial concentration ratio of Fe/Mn was fixed at 4, the slope of the pH on the response surface was very steep, indicating that the pH value had a more obvious influence on manganese ion removal in these two factors. Under the condition of a fixed dosage, the removal efficiency of manganese ions changed with the increase in the pH value. It can be seen from the response surface model that, with the increase in the pH value and the dosage of manganese sand, the removal of Mn2+ by manganese sand first increased and then decreased. When the dosage of manganese sand was 3.4–3.6 g/L, the pH value remained between 6.9 and 7.3, and the removal rate of manganese ions was within the maximum range.

As can be seen from the contour lines in Figure 8b, there was an interaction between the pH value and the initial concentration ratio of Fe/Mn. The slope of the pH value in the response surface was large, which means that the influence of the pH value on the experimental adsorption process was more significant. With the increase in the pH value from 6.5 to 7.5, the removal of Mn2+ by manganese sand first increased and then decreased. When the concentration ratio of Fe/Mn was 3.5–4.5, and the pH value was 6.9–7.3, the removal rate of manganese ions by manganese sand was within the maximum range.

Figure 8c shows the interaction model of the manganese sand dosage and the concentration ratio of iron and manganese ions at pH = 7. It can be seen that the response surface had an "arch shape", indicating that the interaction between these two factors in the manganese ion removal process was very significant. When the initial concentration ratio of Fe/Mn was constant, the removal of manganese ions first increased and then decreased with the increase in dosage. The slope reflecting the dosing amount in the response surface plot was larger, indicating that when considering the interaction between these two factors, the dosage significantly affected the removal of manganese. When the amount of manganese sand was controlled at 3.5–3.6 g/L, and the initial concentration ratio of Fe/Mn was controlled at 3.5–4, the removal effect of manganese ions was the best.

#### *3.6. Validation Experiment*

From the response surface analysis, it can be seen that there was an efficient combination of operating parameters for the complex interaction among the three influencing factors. Through the optimization function of the Design-Expert software, the optimal parameters of the reaction system were predicted as follows: pH = 7.20, the dosage of manganese sand was 3.54 g/L, and the initial concentration ratio of iron and manganese was 3.80. Under this optimal condition, the predicted iron ion removal rate was 83.62%, and the manganese ion removal rate was 76.10%. In order to verify the prediction results, three groups of parallel experiments were carried out in the same reaction system. The average removal rate of iron ions was 82.78%, and the average removal rate of manganese ions was 75.89%, both of which were close to the prediction values, and the relative deviation was less than 1%. These results show that the model can truly reflect the influence of the various analyzed factors on iron and manganese ion removal rate.

#### *3.7. Analysis of Adsorption Kinetic Model*

Figure 9 shows the adsorption kinetic curves reflecting the adsorption of iron and manganese ions by manganese sand. The trends of the adsorption rate curves of the two models were basically similar. During the initial 240 min of the adsorption process, manganese sand had many adsorption sites, a large curve slope, and a fast adsorption rate. With the progress in adsorption, the adsorption sites on the surface of manganese sand were gradually occupied by iron and manganese ions, and the adsorption rate slowly decreased until the adsorption equilibrium was reached at 480 min.

**Figure 9.** The adsorption kinetic model of iron (**a**) and manganese (**b**) ions.

The quasi-primary and quasi-secondary kinetic fitting parameters are shown in Tables 5 and 6. In the adsorption of iron and manganese ions by the unmodified manganese sand and modified manganese sand, the fitting regression coefficients of the quasi-firstorder kinetic model were less than those of the quasi-second-order kinetic model, so it can be described by the quasi-second-order kinetic model. Compared with the unmodified manganese sand, the theoretical adsorption capacity of the modified manganese sand was improved. This was because the specific surface area of the modified manganese sand increased, and the adsorption point increased, which was conducive to increasing the reaction residence time of the adsorbent and the adsorbate, and at the same time, increasing the equilibrium adsorption capacity.


**Table 5.** Kinetic fitting parameters of the adsorption of iron ions by manganese sand.


#### *3.8. Analysis of Adsorption Isotherm Model*

We used the Langmuir and Freundlich models to fit the adsorption process. The thermodynamic model parameters are shown in Table 7. The Langmuir and Freundlich isotherm models showed a good linear relationship in Figure 10. It can be seen from Table 7 that R<sup>2</sup> <sup>L</sup> was greater than R2 F. Obviously, the determination coefficient of the Langmuir isotherm equation was closer to 1, and the fitting effect was good. The adsorption of the ions by manganese sand was a single-layer surface chemical adsorption. In the quasisecond-order kinetic model, the adsorption reaction was assumed to be a single-layer adsorption system, and the adsorption mechanism was chemical adsorption, which was consistent with the fitting result of the adsorption isotherm.

**Figure 10.** Adsorption isotherm models of iron (**a**) and manganese (**b**) ions.


**Table 7.** Constants of Langmuir and Freundlich isotherms.

#### **4. Conclusions**

In this study, we investigated the interaction of three variables (pH, the manganese sand dosage, and the initial concentration ratio of Fe/Mn) on the removal of iron and manganese ions using RSM. We first determined the optimal experimental range of the three factors through single-factor experiments. ANOVA explained the significance of the factors, and the results proved the accuracy of the model. To determine the interactions between the input variables, we developed a three-dimensional response surface. The isothermal model of the adsorption process and the adsorption mechanism were also investigated.


**Author Contributions:** Conceptualization, H.K. and Y.L.; methodology, D.L.; software, D.L.; validation, H.K. and L.X.; formal analysis, L.X.; investigation, H.K.; resources, Y.L.; data curation, L.X.; writing—original draft preparation, D.L. and Y.L.; writing—review and editing, H.K.; visualization, Y.L.; supervision, H.K.; project administration, H.K.; funding acquisition, H.K. and L.X. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Department of Science and Technology of Liaoning province (Grant No. 20170520224).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Water Quality Evaluation and Prediction Based on a Combined Model**

**Guimei Jiao 1,\*, Shaokang Chen 1, Fei Wang 1, Zhaoyang Wang 2, Fanjuan Wang 1, Hao Li 1, Fangjie Zhang 1, Jiali Cai <sup>1</sup> and Jing Jin <sup>1</sup>**


**Abstract:** Along with increasingly serious water pollution, water environmental problems have become major factors that hinder the sustainable development of our economy and society. Reliable evaluation of water quality and accurate prediction of water pollution indicators are the key links in water resource management and water pollution control. In this paper, the water quality data of Lanzhou Xincheng Bridge section in the Yellow River Basin and Sichuan Panzhihua Longdong section in the Yangtze River Basin were used to establish a water quality evaluation model and a prediction model. For the water quality evaluation model, we constructed the research samples by means of equal intervals and uniform distribution of interpolated water quality index data according to *Environmental Quality Standards for Surface Water*. The training samples were determined by a stratified sampling method, and the water quality evaluation model was established using a T-S fuzzy neural network. The experimental results show that the highest accuracy achieved by the evaluation model in water quality classification was 94.12%. With respect to the water quality prediction model, we propose ARIMA-WNN, which combines the autoregressive integrated moving average model (ARIMA) and a wavelet neural network (WNN) with the bat algorithm (BA) to determine the optimal weight of each individual model. The experimental results show that the highest prediction accuracy of ARIMA-WNN is 68.06% higher than that of the original model.

**Keywords:** water quality evaluation; water quality forecasting; T-S fuzzy neural network; combined model

#### **1. Introduction**

Water is the foundation of human existence and the driving force for social stability and a nation's prosperity. However, water resource management has been ignored and forgotten for a long time. It was not until the mid-19th century that, due to the rapid development of industry, water pollution became increasingly serious and water resource management became increasingly prominent. [1]. Since then, the declining water quality of rivers, lakes and groundwater has become a global problem. Although an increasing number of countries has begun to attach importance to water resources and implement a series of protection measures for the sustainable development of water resources, the water resources environment is still deteriorating, with increasing pollution and waste caused by economic development, the acceleration of urbanization and population growth.

The river pollution situation is serious. The water quality level has fallen to IV or worse in 31.4% of the more than 208,000 km of managed river sections in China and below class V in 14.9% of managed sections, indicating that water resources have completely lost their potential for daily use [2]. Of the ten major river basins in China, only some in the southwest and northwest have moderate water quality (categories I to III), and the major river systems in the north, such as the Yellow River, Liao River and Huai River, are rated IV or V. The declining self-purification ability of rivers and deteriorating industrial wastewater

**Citation:** Jiao, G.; Chen, S.; Wang, F.; Wang, Z.; Wang, F.; Li, H.; Zhang, F.; Cai, J.; Jin, J. Water Quality Evaluation and Prediction Based on a Combined Model. *Appl. Sci.* **2023**, *13*, 1286. https://doi.org/10.3390/ app13031286

Academic Editor: Chang-Gu Lee

Received: 10 December 2022 Revised: 30 December 2022 Accepted: 14 January 2023 Published: 18 January 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

management have further worsened the water quality of small tributaries flowing into the major rivers of our country.

Lakes are also heavily polluted. The water quality of nearly half of the 62 key lakes in the country is of grades IV and V or inferior. The three major lakes in China, Taihu Lake, Chaohu Lake and Dianchi Lake, are polluted to varying degrees, in states of mild, moderate and severe pollution, respectively, with total phosphorus and chemical oxygen demand representing the main pollutants.

The groundwater quality situation is also worrying. In China's major cities, 27 percent of centralized drinking water sources do not meet official standards. Among the 5118 groundwater monitoring points in various provinces and cities across the country, the proportion of poor and extremely poor water quality is more than half, threatening people's daily water use [3,4].

Water environmental problems have become a major factor hindering the sustainable development of China's economy and society, and the effective treatment of water pollution and the rational management of water resources are urgent problems to be solved. The accurate prediction of water quality indicators and the reliable evaluation of water quality grades are the basis for understanding the current water quality status and taking corresponding protection measures, so water quality prediction and evaluation have great practical significance.

In this paper, we take the water quality of Lanzhou Xincheng Bridge section in Yellow River Basin and Longdong section in Yangtze River Basin as the research object, establish a water quality evaluation model and propose a new water quality prediction model.

A T-S fuzzy neural network was used to establish the evaluation model combined with the relevant water quality information of the two basins. In the process of model training, an innovative method of interpolating water quality index data with equal intervals and uniform distribution was adopted to construct research samples, and the method of stratified sampling was used to construct training samples. The trained model was applied to water quality evaluation of Lanzhou Xincheng Bridge section in the Yellow River Basin and Longdong section in the Yangtze River Basin. A total of 52 groups were randomly selected from the real water quality index data from 2004 to 2015, and the results of water quality status were output and compared with the real water quality grade to prove the effectiveness and generalizability of the evaluation model.

Furthermore, a new model is proposed for water quality prediction, which combines the autoregressive integrated moving average (ARIMA) model and the wavelet neural network (WNN) mode with the bat algorithm to determine the optimal weight of each individual model. The combined model was used to predict the water quality indices of Lanzhou Xincheng Bridge section in the Yellow River Basin and Longdong section in the Yangtze River Basin. First, 624 weekly monitoring data points of each indicator from 2004 to 2015 were used as the training set, and 52 data points from 2016 were used as the validation set. ARIMA and WNN were used for prediction. The empirical mode decomposition (EMD) algorithm was used to denoise the data before WNN prediction [5]. Secondly, the bat algorithm was used to determine the optimal weight; the final prediction result was the weighted sum of the prediction results of ARIMA and WNN. Then, the prediction results of the combined model were compared and analyzed relative to the prediction results of three individual models (backpropagation (BP), neural network and least squares support vector machine (LSSVM)) to prove the ability of the proposed combined model for water quality prediction [6,7]. Finally, the prediction results of each index in 2016 were substituted into the previously established water quality evaluation model, and the output results had a high coincidence rate with the real water quality grade, which further verified the effectiveness of the evaluation model.

#### **2. Literature Review**

#### *2.1. Research Status of Water Quality Evaluation*

At present, multivariate statistical methods are widely used in water quality evaluation and analysis abroad. Singh K P et al. (2005) applied multivariate statistical methods to the water quality evaluation of eight monitoring sites of the Gomti River in India from 1999 to 2001, demonstrating the advantages of multivariate statistical methods in processing and evaluating a large number of complex water quality datasets and obtaining effective water quality evaluation results [8]. Shrestha S et al. (2007) collected a total of 14,976 data points measuring 12 water quality indicators at 13 different monitoring points from 1995 to 2002 using multivariate statistical tools for spatiotemporal variable analysis of a large set of complex water quality data of the Fuji River. The water quality status of the 13 observation points was divided into three categories by stratified cluster analysis: mild, moderate and severe pollution [9]. Zhang X et al. (2011) used the monthly data of 23 indicators from 16 different monitoring points in southwest Kowloon, Hong Kong, from 2000 to 2007, employing hierarchical cluster analysis to divide the 12 months into two periods and the 16 monitoring points into three categories. Discriminant analysis provides analysis results from both spatial and temporal aspects. Among the 23 indicators, 4 are the main factors affecting the temporal distribution, and 8 indicators are the main factors affecting the spatial distribution [10]. Ogwueleka T C (2015) used principal component analysis, cluster analysis and factor analysis to study the water quality of the Kaduna River and analyze the potential pollution factors of the river [11]. The multivariate statistical method is a classical method for water quality assessment and management, but it cannot provide comprehensive information about water quality. On this basis, in this study, the selected neural network constantly updates and iterates the model parameters, adjusts the weights and thresholds to the state that can output the optimal results and applies the trained model to the evaluation task.

#### *2.2. Research Status of Water Quality Prediction*

Previously proposed water quality prediction models are based on qualitative analysis. Water quality prediction was first studied in 1925, when Phelps and Streeter proposed the S-P model to track BOD-DO changes in water quality. Since then, with the worsening global water pollution and drinking water crises, an increasing number of water quality prediction models have been proposed. However, due to the complexity of water environments, it is difficult to obtain accurate prediction results with traditional mathematical models, so scholars began to use neural networks to predict water quality. Singh KP et al. (2009) used the monthly data of 11 water quality indicators from 8 different monitoring points over 10 years to establish two BP neural network models, 11-23-1 and 11-11-1, to calculate the levels of dissolved oxygen and biochemical oxygen demand of Gomti River in India and indirectly judge the water quality [12]. Seo IW et al. (2016) used an artificial neural network model to predict eight water quality indicators downstream of Cheongpyeong Dam [13]. Zhang ran et al. (2013) established a GM(1,1) model to predict the water quality of the Yellow River estuary from 2012 to 2015 [14]. Zhang Ying et al. (2015) took the section of Shanghai Qingpu Urgent Water Port in Taihu Lake Basin as an example and applied the gray model after residual correction of an extreme learning machine regression model for the prediction of water quality indicators. They used the data of the first 100 days of 2013 of six water quality indicators, including dissolved oxygen and chemical oxygen demand, to predict the data of the 101st to 110th days [15]. Xu Hongmin et al. (2007) proposed a weighted support vector regression model to predict the concentration of permanganate in Taihu Lake Basin using the same method; the results showed that the prediction accuracy of this algorithm was higher than that of SVM and RBF neural network alone [16]. According to the idea of neural network weighting introduced in the abovementioned literature, in this study, we designed and implemented a mixed model to predict water quality.

#### **3. Materials and Methods**

*3.1. T-S Fuzzy Neural Network*

T-S fuzzy neural network is a new fuzzy neural network proposed by Takagi and Sugeno in 1985 [17]. Fuzzy reasoning rules are adopted in '*if then* form:

$$R^i: If \ge \mathbf{x}\_1 \text{ is } A^i\_{1'}, \mathbf{x}\_2 \text{ is } A^i\_{2'}, \dots, \mathbf{x}\_k \text{ is } A^i\_k \text{ then } \\ y\_i = p^i\_0 + p^i\_1 \mathbf{x}\_1 + \dots + p^i\_k \mathbf{x}\_k \tag{1}$$

where '*if then* are the front part and back part of fuzzy rules, respectively (the former is the input part of fuzzy rules, and the latter is the determined output); *A<sup>i</sup> <sup>j</sup>* is a fuzzy set of fuzzy models; *p<sup>i</sup> j* (*j* = 1, 2, . . . , *k*) is a fuzzy model parameter; and *yi* is the output of the fuzzy rules. The process shows that the output is a linear combination of inputs, as shown in Figure 1.

**Figure 1.** Structural diagram of T-S fuzzy neural network.

#### *3.2. ARIMA*

The autoregressive integrated moving average (ARIMA) model was proposed by Box and Jenkins in the 1970s [18]. It is based on the autoregressive model (AR) proposed by Yule in 1927 and the combination of the moving average model (MA) and autoregressive moving average model (ARMA) with AR and MA proposed by Walker in 1931 [19–21]. In this model, the future value of a variable is considered a linear combination of past values and past errors:

$$y\_t = \theta\_0 + \varphi\_1 y\_{t-1} + \varphi\_2 y\_{t-2} + \dots + \varphi\_p y\_{t-p} + \varepsilon\_t - \theta\_1 \varepsilon\_{t-1} - \theta\_2 \varepsilon\_{t-2} - \dots - \theta\_q \varepsilon\_{t-q} \tag{2}$$

$$
\varphi(B)\nabla^d y\_t = \theta(B)\varepsilon\_t \tag{3}
$$

where *yt* is the actual value; *ε<sup>t</sup>* is the random error at time *t*; *ϕ<sup>i</sup>* and *θ<sup>j</sup>* are the coefficients; *p* and *q* are the orders of autoregressive and sliding average polynomials, respectively; *B* represents the lag operator, where Δ*<sup>d</sup>* = (1 − *B*) *d* ; *d* is the number of differences; and *ϕ*(*B*) and *θ*(*B*) are defined as:

$$\varphi(B) = 1 - \varphi\_1 B - \varphi\_2 B^2 - \dots - \varphi\_p B^p \tag{4}$$

$$\theta(B) = 1 - \theta\_1 B - \theta\_2 B^2 - \dots - \theta\_q B^q \tag{5}$$

#### *3.3. Wavelet Neural Network*

Wavelet neural network (WNN) is a neural network model that combines wavelet transform and artificial neural network, which replaces the excitation function of the traditional neural network with the wavelet basis function. It was first proposed in 1992 by Zhang Q and Benveniste A of LRISA, a famous French information science institute [22]. The combination of wavelet transform and neural network provides unique advantages. In recent years, it has been widely used in nonlinear function approximation, dynamic modeling and non-stationary time series prediction. *X*1, *X*2, ... , *Xk* are input parameters, *ωij* and *ωjk* are connection weights and *Y*1, *Y*2, ... , *Ym* are predicted outputs. The formula for calculating the output of the hidden layer is:

$$h(j) = h\_j \left[ \frac{\sum\_{i=1}^k \omega\_{ij} x\_i - b\_j}{a\_j} \right], j = 1, 2, \dots, l \tag{6}$$

where *hj* is the wavelet basis function; *aj* and *bj* are the scale factor and translation factor of the wavelet basis function, respectively; *ωij* is the connection weight between the input layer and the hidden layer; and *h*(*j*) is the output value of the node of the seventh hidden layer. The calculation formula of the output layer is:

$$y(k) = \sum\_{j=1}^{l} \omega\_{jk} h(j) \; \; k = 1, 2, \dots, m \tag{7}$$

where *ωjk* is the connection weight between the hidden layer and the output layer, and *y*(*k*) is the output value.

#### *3.4. ARIMA-WNN*

Actual time series data often have both linear and nonlinear characteristics, and ARIMA or WNN alone cannot reflect the dual linear and nonlinear characteristics of time series. In order to simultaneously utilize the good linear fitting ability of the differential autoregressive moving average model and the powerful nonlinear relationship mapping ability of the wavelet neural network model, we combine ARIMA and WNN methods.

Assuming *yt*(*t* = 1, 2, . . . , *L*) is the actual value of the time series, L is the number of sample points, and *y*ˆ*<sup>t</sup>* and *y*ˆ*it*(*i* = 1, 2, . . . , *N*, *t* = 1, 2, . . . , *L*) is the predicted value of the combined model and the first single method, respectively; then:

$$\mathcal{Y}\_t = \sum\_{i=1}^N \lambda\_i \mathcal{Y}\_{it} \tag{8}$$

where *λ<sup>i</sup>* is the weight of the prediction method, and ∑*<sup>N</sup> <sup>i</sup>*=<sup>1</sup> *λ<sup>i</sup>* = 1. The weight coefficients of each of the component models in the combined model are determined by solving the following optimization problems:

$$\underset{t=1}{\text{Min}} \; \sum\_{t=1}^{L} \left( y\_t - \hat{y}\_t \right)^2, \quad \text{s.t.} \; \sum\_{i=1}^{N} \lambda\_i = 1, \; 0 \le \lambda\_i \le 1 \tag{9}$$

On this basis, we propose a new combination model composed of the ARIMA model and WNN [23]. The optimal weight of a single model is obtained by the bat algorithm, and the predicted value of the combination model is expressed as follows:

$$P\_{\Sigma} = \lambda\_1 P\_{\text{WNN}} + \lambda\_2 P\_{ARIMA} \tag{10}$$

where *P*<sup>Σ</sup> is the final predicted value; *λ*1, *λ*2, *PWNN*, *PARIMA* are the weight coefficients and predicted values of WNN and ARIMA models, respectively; and *λ*<sup>1</sup> + *λ*<sup>2</sup> = 1, as shown in Figure 2.

**Figure 2.** ARIMA—WNN structure.

#### *3.5. Bat Algorithm*

The bat algorithm (BA) is a swarm intelligence optimization algorithm that simulates the behavior of bats to hunt down prey, which was proposed by Yang X S in 2010 [24]. Let the bat search for prey in the *n* dimensional space at the moment of *t* − 1; the flight speed and position of bat *i* are *vt*−<sup>1</sup> *<sup>i</sup>* , *<sup>x</sup>t*−<sup>1</sup> *<sup>i</sup>* , respectively; then, the update rules of the bat's flight speed (*v<sup>t</sup> i* ) and location at time *t* are:

$$f\_i = f\_{\min} + (f\_{\max} - f\_{\min}) \ast \beta \tag{11}$$

$$v\_i^t = v\_i^{t-1} + \left(\mathbf{x}\_i^{t-1} - \mathbf{x}^\*\right) \* f\_i \tag{12}$$

$$\mathbf{x}\_{i}^{t} = \mathbf{x}\_{i}^{t-1} + \mathbf{v}\_{i}^{t} \tag{13}$$

Among them, *β* ∈ [0, 1] is a uniformly distributed random number; *fmax* and *fmin* are the maximum and minimum search frequency, respectively; the bat's pulse search frequency is *fi*; and *fi* ∈ [ *fmin*, *fmax*]. The bat algorithm controls the prey hunting range of bats by adjusting *fi* and controls the global search in the whole updating process, which is the optimal solution for the current bat population.

For a local search, the bat algorithm is completed by random disturbance. Each bat randomly selects one solution from the current optimal solution set as the current optimal solution (*xold*); then, we use the following formula to update the position to obtain a new solution:

$$
\mathfrak{x}\_{mw} = \mathfrak{x}\_{old} + \varepsilon A^t \tag{14}
$$

where *<sup>ε</sup>* ∈ [−1, 1] is a random number, and *<sup>A</sup><sup>t</sup>* is the average loudness of all bats at time *<sup>t</sup>*.

*3.6. Materials*

(1) ADF

The ADF test determines whether the series is stationary by checking whether the sum of the autoregressive coefficients is 1 [25]. Compared with the DF test, which can only be used to determine whether the AR(1) model is stationary, the ADF test can be used to determine the stationarity of the AR(P) model. The hypothesis test is established as:

$$H\_0: \rho = 0$$

$$H\_1: \rho < 0$$

The test statistic is:

$$\pi = \frac{\rho}{S(\delta)}\tag{15}$$

where *ρ* = *ϕ*<sup>1</sup> + Λ + *ϕ<sup>p</sup>* − 1, and *S*(*ρ*ˆ) is the sample standard deviation of parameter *ρ*.

#### (2) AF

The autocorrelation function describes the correlation between the random sequence at time *t* and the value at time *t* − *k* [26].

$$\rho\_k = \frac{Cov(y\_t, y\_{t-k})}{\sqrt{Var(y\_t)}\sqrt{Var(y\_{t-k})}} = \frac{E[(y\_t - \mu)(y\_{t-k} - \mu)]}{\sqrt{Var(y\_t)}\sqrt{Var(y\_{t-k})}} = \frac{\gamma\_k}{\sigma^2} \tag{16}$$

where *<sup>E</sup>*(*yt*) = *<sup>μ</sup>*, *<sup>σ</sup>*<sup>2</sup> = *<sup>E</sup>*(*yt* − *<sup>μ</sup>*) 2 , and *γ<sup>k</sup>* denotes the covariance of *yt* and *yt*−*k*. (3) PAF

The partial autocorrelation function shows that for sequence *yt* in determining *yt*−1, *yt*−2,..., *yt*−*k*<sup>+</sup>1, the correlation between *yt* and *yt*−*<sup>k</sup>* is denoted by *ϕkk* [26]:

$$\varphi\_{kk} = \begin{cases} \gamma\_1 & k = 1\\ \frac{\gamma\_k - \sum\_{j=1}^{k-1} \varphi\_{k-1,j} \gamma\_{k-j}}{1 - \sum\_{j=1}^{k-1} \varphi\_{k-1,j} \gamma\_j} & k = 2, 3, \dots \end{cases} \tag{17}$$

#### (4) The AIC criterion

The AIC criterion was proposed by Japanese statistician Hiroji Akike as a measure of model fit excellence [27].

$$AIC = n \log \sigma^2 + 2(p+q) \tag{18}$$

where *n* is the number of samples; *σ*<sup>2</sup> is the sum of squares of the fitted residuals; and *p* and *q* are the orders of the AR and MA models, respectively. Models are established from low-order to high-order according to the ARIMA values, and the AIC value of each model is calculated. According to the criterion, the model with the lowest AIC value is the optimal model.

(5) MAE

Mean absolute error (MAE) is a measure of accuracy for regression [28]. It sums up absolute values of errors and divides them by the total number of values. It gives equal weight to each error value. The formula for calculating MAE is shown in Equation (19).

$$MAE = \frac{\sum \left( |\phi\_i - y\_i| \right)}{n} \tag{19}$$

#### (6) MAPE

The reason why mean absolute percentage error can describe accuracy is that it is often used as a statistical index to measure the accuracy of prediction [29].

$$MAPE = \frac{1}{n} \sum \left( \left| \frac{\wp\_i - y\_i}{y\_i} \right| \right) \* 100\% \tag{20}$$

#### (7) RMSE

Root mean squared error (RMSE) is the square root of MSE and scales the values of MSE to the ranges of observed values [28]. It is estimated according to Equation (21).

$$RMSE = \sqrt{\frac{\sum \left( |\mathcal{G}\_i - y\_i| \right)^2}{n}} \tag{21}$$

#### (8) AI

In this paper, a new index (AI, accuracy improvement) is introduced to show the improvement of prediction ability of the combined model compared with the single model [30]. *Si* and *Sc* are the MAPE of the single model and combined model, respectively.

$$RI = \frac{|S\_i - S\_{i^\*}|}{S\_i} \times 100\% \tag{22}$$

#### **4. Results and Discussion**

*4.1. Water Quality Evaluation Model*

#### 4.1.1. Data Preprocessing

The three water quality monitoring indicators used in this paper are from the data center of the Ministry of Environmental Protection (http://datacenter.mep.gov.cn/index (accessed on 1 September 2021)): dissolved oxygen (DO), chemical oxygen demand permanganate (CODMn) and ammonia nitrogen (NH3-N). The corresponding water quality grades of each index value are shown in Table 1:


**Table 1.** Environmental quality standards for surface water (GB3838-2002).

In order to solve the problem of inadequate sample size due to only taking the water quality evaluation grading standard as the research sample, the water quality index data are interpolated by the method of equal interval and uniform distribution [31]. For the convenience of modeling, the output value is continuous, and its value range is (0.5,6.5). The relationship between output value and water quality grade is shown in Table 2.

**Table 2.** Corresponding water quality grade of output values.


#### 4.1.2. Model Building Process

In this study, T-S fuzzy neural network is used to evaluate water quality. The number of input and output nodes of the model is determined by the input and output dimensions of training samples. According to the index data considered in this paper, the input and output dimensions are determined to be three and one, respectively, so the number of input and output nodes is three and one, respectively. Through trial-and-error method, the number of hidden layer nodes is determined to be six, so the structure of water quality evaluation model is as follows: 3–6–1 [32]. The four coefficients (P0, P1, P2 and P3), the width (b) and center (c) of the membership function were randomly initialized.

After normalizing the input data, training of each parameter in the model was started [33]. After 100 iterations, the network finally converged, and the training error was 3.46 \* 10<sup>−</sup>4.

4.1.3. Water Quality Evaluation of Lanzhou Xincheng Bridge Section

The process of water quality evaluation is realized in MATLAB, and the confusion matrix and water quality evaluation chart are output, as shown in Figure 3 [34]:


**Figure 3.** Water quality grade confusion matrix of Lanzhou Xincheng Bridge section.

In classification models, confusion matrices are often used to observe and judge the number of correct and incorrect classifications. As shown in Figure 3, among the 52 weeks of random sampling, correct in water quality evaluation was achieved for 47 weeks, and the total correct judgment rate reached 90.38%. The correct prediction rate of class II and class III water quality was 91.11% and 85.71%, respectively. Class II water quality was misjudged as class I for one week, class II water quality was misjudged class as III for three weeks, and class III water quality was misjudged class II for one week.

The total number of correct and incorrect judgments and the corresponding water quality grade can be determined according to the above analysis, but it is not clear in which week the incorrect judgment occurred. This information can be obtained from the water quality evaluation chart presented below.

In Figure 4, the *x*-axis indicates the number of weeks; the *y*-axis indicates the water quality evaluation; and the green dot and the red dots represent the real water quality grade and the water quality grade determined by the model, respectively. Overlapping of the red and green dots indicates that the water quality level is correctly judged for that week, whereas divergence indicates an incorrect judgment. As shown in Figure 4, the water quality grade is concentrated in class II and class III, which is not accidental because during the whole time period from 2004 to 2015, the water quality grade is mainly class II and class III. Five of the fifty-two water quality grades were incorrectly judged; details are shown in Table 3.

**Figure 4.** Water quality evaluation of Lanzhou Xincheng Bridge section.



#### 4.1.4. Water Quality Evaluation of Longdong Section

Similarly, the trained T-S fuzzy neural network is applied to the water quality evaluation of Longdong section in Panzhihua, Sichuan, in the Yangtze River Basin, with a confusion matrix and water quality evaluation map as output, as shown in Figures 5 and 6, respectively.


**Figure 5.** Confusion matrix of water quality grades in the Longdong section.

**Figure 6.** Water quality evaluation of Longdong section.

According to the confusion matrix, the water quality grade was correctly judged for 46 weeks and incorrectly judged for 6 weeks; the total correct judgment rate reached 88.46%. The correct judgment rates of class I and II water quality were 94.12% and 82.35%, respectively. Class I water quality was misjudged as class II for two weeks, class II was misjudged as class I for three weeks and class III was misjudged as class II for one week.

As shown in Figure 6, the water quality was mainly classified as class I and class II for the 52-week investigation period. In the 12th and 17th weeks, water quality in class I was incorrectly classified as class II; in the 21st, 42nd and 48th weeks, the water quality in class II was incorrectly classified as class I; and in the 14th week, the water quality in class III was incorrectly classified as class II. Results are shown in Table 4:


**Table 4.** Misjudgment of water quality grade in Longdong section.

The water quality evaluation results presented above indicate that an ideal state was achieved in all investigated river basins. In the Yellow River in Lanzhou Xincheng Bridge section and Panzhihua, Sichuan province, in the Yangtze River Basin Longdong section, the water levels were correctly classified at rates of 90.38% and 88.46%, respectively, indicating that the trained T-S fuzzy neural network achieved satisfactory quality evaluation with good generalization ability.

#### *4.2. Water Quality Prediction Model*

4.2.1. Water Quality Prediction of Lanzhou Xincheng Bridge Section

For prediction with the ARIMA model, we used E-views10 software, which is created by IHS Global Inc in the State of California, United States. An original sequence diagram was generated to intuitively judge whether the data sequence was stable, as shown in Figure 7.

**Figure 7.** Original dissolved oxygen values.

In Figure 7, the *x*-axis shows the time in months, and the *y*-axis shows dissolved oxygen in milligrams per liter. The series in the figure does not indicate an obvious trend or seasonality, so it was preliminarily judged to be stationary series [35]. For further confirmation, an ADF unit root test was performed. The results show that the ADF test statistic is less than the critical value at levels of 1%, 5% and 10%, so the dissolved oxygen series is stationary and does not require differential processing [36]. An autocorrelation diagram and partial autocorrelation diagram were generated for model order determination, as shown in Figure 8.

As shown in Figure 8, the trailing and censoring characteristics of the autocorrelation function and partial autocorrelation function are very obvious in the dissolved oxygen data series. The decay of the autocorrelation function is very slow, which is a typical characteristic of trailing. However, the partial autocorrelation function rapidly decays to within two times the standard deviation after two steps, so the AR(2) model is determined. The estimation results of the model are shown in Table 5.

The R<sup>2</sup> of the model reached 79.34%, and an adaptability test was conducted on the AR(2) model was conducted, i.e., a white noise test of the residuals, as shown in Figure 9 [37]:

As shown in Figure 9, the autocorrelation and partial autocorrelation functions of the residual sequence both fall within two standard deviations, and the P value is significantly greater than 0.05. Therefore, it is a white noise sequence, indicating that the useful information in the original sequence has been extracted and the model has passed the adaptability test. Then, the AR(2) model established in this paper is used to predict the test data. A comparison between the predicted value and the real value is shown in Figure 10.


**Figure 8.** Autocorrelation and partial autocorrelation of dissolved oxygen.

**Table 5.** Parameter estimation of the AR(2) model.



**Figure 9.** White noise test of residual value.

**Figure 10.** ARIMA Prediction Diagram.

In Figure 10, the *x*-axis represents time, the *y*-axis represents the concentration of DO, the blue line represents the predicted value of the ARIMA model and the orange line represents the real value. We can intuitively observe that the coincidence degree of the two lines is relatively high, which indicates that the prediction effect of the ARIMA model is strong. Before using wavelet neural network for prediction, the irrelevant noise in the data is removed. The empirical mode decomposition (EMD) of the dissolved oxygen sequence was carried out by MATLAB, as shown in Figure 11:

**Figure 11.** EMD diagram of dissolved oxygen.

The original data series were decomposed into eight intrinsic mode functions (IMFs) and a residual sequence. The frequency decreases from IMF1 to IMF8, and each IMF has its own unique frequency and cycle [38]. Due to the high-frequency property of IMF1 and its chaotic fluctuation trend, it was removed on the basis of the original data sequence too obtain a new sequence after denoising. The original sequence was denoised as shown in Figure 12:

**Figure 12.** Comparison of dissolved oxygen sequence before and after denoising.

The denoised sequence is smoother than the original sequence, showing its original fluctuation trend more clearly. Therefore, the denoised sequence can be used for subsequent experimental demonstration and analysis.

First, a three-layer forward neural network is created, the parameters of each network are initialized and a training set is constructed to train the wavelet neural network. The data for weeks T-1, T-2, T-3 and T-4 are used to predict the number of neurons in week T. Therefore, the number of neurons in the input layer and output layer is four and one, respectively, and the number of neurons in the hidden layer is nine, as determined by the trial-and-error method; therefore, the structure of the wavelet neural network proposed in this paper is 4–9–1. The Morlet wavelet function is selected as the wavelet basis function, the number of iterations is set as 100, the learning probability is 0.001 and the training target is 10−6. The weight and parameters of the network are modified by gradient correction method so that the predicted output is close to the desired output [39]. The predicted result is output and compared with the real value, as shown in Figure 13.

**Figure 13.** WNN prediction diagram.

The bat algorithm is used to determine the coefficients of each model, and the process of determining the optimal weight is transformed into the process of hunting prey by bats; therefore, the combined prediction model of dissolved oxygen in Lanzhou Xincheng Bridge section of the Yellow River Basin is as follows:

$$P\_{DO} = 0.1495 P\_{WNN} + 0.8505 P\_{ARIMA}$$

The combined prediction model of permanganate and ammonia nitrogen was obtained according to the same steps:

$$P\_{COD\_{Mu}} = 0.7852P\_{WNN} + 0.2148P\_{ARIMA}$$

$$P\_{NH3-N} = 0.3274P\_{WNN} + 0.6726P\_{ARIMA}$$

4.2.2. Forecast Results and Analysis

The MAPE, MAE, RMSE of DO, CODMn and NH3-N in 2016 under the single model and combined model prediction, as well as the degree of improvement of the combined model compared with the single model prediction ability were obtained according to the modeling process described above, as shown in Tables 6 and 7, respectively.



**Table 7.** Improvement in the predictive ability of the combined model relative to the single model.


For the three water quality indicators, the MAPE, MAE and RMSE values of the combined model are lower than those of the single model, and the prediction effect is better. In the prediction of DO, CODMn and NH3-N, the prediction accuracy of the combined model is improved by 5.49% and 53.68%, 39.10% and 7.79%, and 14.40% and 20.40% compared with the ARIMA model and WNN, respectively.

For a particular set of data, a higher weight is assigned to a single model of the combined model, which indicates that the method has a better predictive ability. Taking DO as an example, the prediction accuracy of the ARIMA model is higher than that of WNN, and it has a better prediction ability. In the ARIMA-WNN model, the bat algorithm was used to calculate the weights of the two individual models as 0.8505 and 0.1495, respectively, meaning that the individual model with better prediction ability had more weight and proving the reliability of to the bat algorithm to determine the weight coefficient in the combined model. The combined model of water quality prediction proposed in this paper was compared with the prediction results of each individual model, the BP neural network and LSSVM. The results are shown in Figures 14–17:

As shown in the figures presented above, the MAPE, MAE and RMSE of the combined model are significantly lower than those of the comparison models. The fitting figures of the real and predicted values also show that the water quality prediction model proposed in this paper has a higher fitting degree to the real data.

**Figure 14.** Comparison of each model fitting for dissolved oxygen.

**Figure 15.** Comparison of each model fitting for CODMn.

**Figure 16.** Comparison of each model fitting for NH3-N.

4.2.3. Water Quality Prediction of Longdong Section

In order to verify the predictive ability of the combined model proposed in this paper for different water systems, the water quality index data of the Longdong section of Panzhihua in Sichuan Province of the Yangtze River Basin are selected for prediction and analysis. The modeling results are as follows:

$$P\_{DO} = 0.1044 P\_{WNN} + 0.8956 P\_{ARIMA}$$

$$P\_{COD\_{Mw}} = 0.0432 P\_{WNN} + 0.9568 P\_{ARIMA}$$

$$P\_{NH3-N} = 0.8646 P\_{WNN} + 0.1354 P\_{ARIMA}$$

The improvement in the prediction accuracy and combined model prediction ability of the single and combined models for each indicator are compared in Tables 8 and 9.

**Figure 17.** Comparison of the combined model with other models.

As shown in Table 8, the prediction effect of the combined model is better than that of each single model. For DO, CODMn and NH3-N, the combinatorial model outperformed the ARIMA model by 2.99%, 6.73% and 68.06%, respectively, in terms of prediction ability and outperformed the prediction of the WNN model by, 58.33%, 66.17% and 11.67%, respectively.

In the prediction of various indicators of the Longdong section of Panzhihua in Sichuan in the Yangtze River Basin, the bat algorithm still assigns more weight to the single model with better prediction ability and less weight to the single model with poor prediction ability. With respect to DO, the MAPE value predicted by the wavelet neural network is 4.68%, compared with 2.01% for the ARIMA model, which suggests that the ARIMA model achieves better predictive performed; therefore, in the combinatorial model, the bat algorithm assigns it a weight of 0.8956, whereas the WNN only assigns it a weight of 0.1044. In comparison with a BP neural network and LSSVM, the prediction accuracy of the combined model proposed in this paper is higher; the comparison results are shown in Figures 18–21.

In the prediction of various water quality indices in the Longdong section of Panzhihua, Sichuan, in the Yangtze River Basin, the prediction effect of the combined model is better and the accuracy is higher than that of the single model, the BP neural network and the LSSVM, indicating that the combined model proposed in this paper is suitable for water quality prediction in different river basins, with satisfactory generalization performance [40].


**Table 8.** Comparison of single and combined model prediction accuracy of each indicator.

**Table 9.** Improvement in the predictive power of the combined model compared to a single model.


**Figure 18.** Comparison of model prediction of dissolved oxygen.

**Figure 19.** Comparison of model prediction of CODMn.

**Figure 20.** Comparison of model prediction of NH3-N.

**Figure 21.** Combined model compared to other models.

4.2.4. Water Quality Evaluation in 2016

The results in presented in Section 4.1.3 prove that the trained T-S fuzzy neural network described in this paper has excellent ability in the determination of water quality grade. On this basis, the water quality index data of two sections predicted for 2016 were substituted for water quality evaluation. The results are shown in Tables 10 and 11 and Figures 22 and 23.


 2016.

**Figure 22.** Water quality evaluation of Lanzhou Xincheng Bridge section in 2016.

**Figure 23.** Water quality evaluation of Longdong section in 2016.

Water quality evaluation using the T-S fuzzy neural network trained as described in this paper revealed that among the 52 weeks of 2016, water quality was misjudged in the Lanzhou Xincheng Bridge section of the Yellow River Basin and the Longdong section of Panzhihua, Sichuan, the Yangtze River Basin for 7 and 10 weeks, respectively, with total correct judgment rates of 86.54% and 80.77%, respectively. Because the input data contain errors, this result is acceptable, verifying the reliability of the water quality evaluation model trained as described in this paper.

#### **5. Conclusions**

Water quality evaluation and prediction are not two completely independent procedures. On the contrary, they form a mutually dependent system and complement each other. Accordingly, in this study, we established a water quality evaluation–prediction system, established a water quality evaluation model using a T-S fuzzy neural network and constructed research samples by interpolating water quality index data evenly distributed on the basis of each index grading standard stipulated in the *Environmental Quality Standards for Surface Water*. A stratified sampling method is used to construct training samples. The trained T-S fuzzy neural network was applied to the water quality evaluation of the Lanzhou Xincheng Bridge section in the Yellow River Basin and the Longdong section in the Sichuan Panzhihua section in the Yangtze River Basin, with total positive water quality grade evaluation rates of 90.38% and 88.46%, respectively, indicating the positive water quality evaluation effect and generalizability of the model.

For the prediction of water quality, in this paper, we proposed a new combined model, ARIMA-WNN, which establishes the combined prediction model for each water quality index of the two basins and compares the prediction results with the combined model. The results show that compared with the single model, the combined model (ARIMA-WNN) has a higher prediction accuracy, and the prediction ability can be improved by up to 68.06%. Compared with commonly used water quality prediction models (BP neural network and LSSVM), we found that the MAPE, MAE and RMSE of the combined model are significantly lower, which demonstrates the excellent water quality prediction ability of the combined model.

Determining reasonable weight coefficients for each single model in the combined model is the basis for obtaining accurate prediction results, and in this study, we used the bat algorithm to achieve this process. Swarm intelligence optimization algorithms have developed rapidly in recent years, and a variety of new methods have emerged in succession [41]. Determining the optimal weight is a subject that can be studied in depth, and additional methods should be proposed and tested in subsequent work [42].

**Author Contributions:** Conceptualization, G.J.; Methodology, G.J.; Writing—original draft, F.W. (Fei Wang), F.W. (Fanjuan Wang), H.L., F.Z., J.C. and J.J.; Writing—review & editing, S.C.; Supervision, G.J. and Z.W. All authors have read and agreed to the published version of the manuscript.

**Funding:** The research was supported by National Natural Science Foundation of China Under Grant No.41271038.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in the present study are available on request from the corresponding author.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
