#### **1. Introduction**

Surface water in rivers is a fundamental freshwater source that plays an essential role in socio-economic development and the environment [1]. However, surface water bodies are under severe pressure from intensified human activities, such as industrialization, urbanization, and population growth [2,3]. Additionally, poor management of water quantity and quality, together with climate change, has degraded water quality over the past few decades, leading to surface-water pollution [4,5]. Therefore, the evaluation and estimation of the water quality level in rivers are of great concern today.

**Citation:** Khoi, D.N.; Quan, N.T.; Linh, D.Q.; Nhi, P.T.T.; Thuy, N.T.D. Using Machine Learning Models for Predicting the Water Quality Index in the La Buong River, Vietnam. *Water* **2022**, *14*, 1552. https://doi.org/10.3390/w14101552

Academic Editor: Karl-Erich Lindenschmidt

Received: 29 March 2022 Accepted: 10 May 2022 Published: 12 May 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The water quality index (WQI) has been extensively used to assess and classify surface water and groundwater quality. This index, introduced by Brown et al. [6], is computed from the physicochemical parameters of the water (e.g., temperature, pH, turbidity, dissolved oxygen (DO), biochemical oxygen demand (BOD), and concentrations of other pollutants) to estimate the level of water quality. The WQI provides quantitatively meaningful information to decision makers and planners for water resources management. However, the WQI formulations involve lengthy calculations and thus require considerable time and effort [5]. Additionally, the WQI formulations are inconsistent because they usually use different equations [7]. Accordingly, to address these issues, an alternative approach for computationally efficient and accurate estimation of the WQI is needed.
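To make the idea of an index computed from several physicochemical parameters concrete, the following is a minimal sketch that assumes a weighted arithmetic aggregation of per-parameter sub-index scores on a 0 to 100 scale. The function name `weighted_wqi`, the parameter set, the scores, and the weights are all hypothetical illustrations; they are not the formulation of Brown et al. [6] or of any specific national WQI.

```python
from typing import Mapping


def weighted_wqi(sub_indices: Mapping[str, float],
                 weights: Mapping[str, float]) -> float:
    """Weighted arithmetic mean of sub-index scores (each assumed in [0, 100])."""
    if set(sub_indices) != set(weights):
        raise ValueError("sub_indices and weights must cover the same parameters")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    for name, score in sub_indices.items():
        if not 0.0 <= score <= 100.0:
            raise ValueError(f"sub-index for {name!r} must lie in [0, 100]")
    # Normalise by the total weight so the weights need not sum to exactly 1.
    weighted_sum = sum(weights[name] * sub_indices[name] for name in sub_indices)
    return weighted_sum / total_weight


# Hypothetical sub-index scores and weights, for illustration only.
EXAMPLE_SCORES = {"DO": 80.0, "BOD": 60.0, "pH": 90.0, "turbidity": 70.0}
EXAMPLE_WEIGHTS = {"DO": 0.4, "BOD": 0.3, "pH": 0.2, "turbidity": 0.1}

if __name__ == "__main__":
    print(round(weighted_wqi(EXAMPLE_SCORES, EXAMPLE_WEIGHTS), 1))
```

Even in this simplified form, each parameter first has to be converted to a sub-index score, and different formulations choose different conversions, weights, and aggregation rules, which is where the inconsistency between WQI equations arises.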

In recent years, machine learning (ML) techniques have been extensively used for river water quality assessment, including WQI estimation [8]. These techniques have proved to be powerful tools for modeling complex non-linear behaviors in water-resource research [9]. Our literature review shows that each ML algorithm has its strengths and shortcomings, and that its behavior depends on the water quality input variables available in each study region. For the simulation and prediction of water quality, the capabilities of adaptive boosting (Adaboost) [10], gradient boosting (GBM) [11], extreme gradient boosting (XGBoost) [12], decision tree (DT) [13,14], extra trees (ExT) [4], random forest (RF) [10,15], multilayer perceptron (MLP) [16], radial basis function (RBF) [17], deep feed-forward neural network (DFNN) [18], and convolutional neural network (CNN) [19] have been reported. Although many ML algorithms are available, researchers still face the problem of deciding which ML technique is most appropriate for a specific problem.

In Vietnam, the WQI proposed by the Ministry of Natural Resources and Environment (MONRE) [20] requires lengthy calculations and consequently demands considerable time and effort. Yet, to the best of our knowledge, no study on the use of machine learning techniques for predicting the WQI has been conducted in Vietnam. Therefore, the present study aimed to assess the performance of twelve ML algorithms, consisting of five boosting-based algorithms (Adaboost, GBM, histogram-based gradient boosting (HGBM), light gradient boosting (LightGBM), and XGBoost), three decision tree-based algorithms (DT, ExT, and RF), and four ANN-based algorithms (MLP, RBF, DFNN, and CNN), in predicting the WQI of the La Buong River in Vietnam. The La Buong River is one of the important rivers supplying water for domestic, agricultural, and industrial uses in the southern key economic region of Vietnam.

#### **2. Study Area**

The La Buong River (10°45′–11°00′ N, 106°50′–107°15′ E), a tributary of the Dong Nai River, has a length of approximately 56 km and a basin area of 475.8 km<sup>2</sup> (Figure 1). The La Buong River Basin is located in the western part of the Dong Nai province in the southern key economic region of Vietnam. The topography of the basin ranges from 10 to 385 m above sea level. The basin has a tropical monsoon climate with two distinct seasons: a 6-month rainy season, lasting from May to October, and a 6-month dry season, lasting from November to April. The average annual temperature was 25.4 °C, the average annual rainfall was 1786 mm, and the average annual streamflow was 7.1 m<sup>3</sup>/s in the period 1981–2015 [21]. Rhodic Ferralsols and Ferric Acrisols are the main soils of the basin (accounting for approximately 75% of the basin area). More than 80% of land in the basin is utilized for agricultural development (cashew, coffee, and rubber). The La Buong River Basin is heavily influenced by cropping activities and livestock in the upper basin and industrial activities in the lower basin. Urbanization and industrial development are predicted to rise in the coming years [22].

**Figure 1.** The La Buong River and location of the WQ monitoring stations.
