1. Introduction
The circulating water system is the part of the cooling utility system that uses water as a cooling medium to remove waste heat and then recycles the water. The system is mainly composed of the cooler network, pump network, pipe network and cooling tower, and the operating quality of these elements is directly related to the long-term stable operation and economic benefit of the enterprise [1]. These networks are not independent but affect each other. For example, changes in the operating parameters or the equipment connection mode influence the water consumption, the pump efficiency and the cooling tower power. Usually, the initial network is established based on the design conditions, but as the system operates, some parameters may deviate from the design values and reduce the system’s overall efficiency. Thus, establishing simulation and optimization methods is of great importance for improving the entire circulating water system.
Many studies have been conducted to improve the circulating water system and conserve energy. Among them, optimizing the network connection mode is an effective and popular method because of its low cost and ease of implementation. For example, Wang et al. [2] proposed a two-step method to convert a parallel configuration of the cooler network to a series–parallel structure. The proposed modification could reduce the energy consumption of the system without changing the cooler structures. Ma et al. [3] proposed a novel multi-loop pump network and an updated main-auxiliary pump network in a cooling water system. In this methodology, the cooler and pump networks, cooling tower and pipeline layouts were optimized simultaneously. Zhang et al. [4] proposed a superstructure-based mathematical model, under which the cooler pipe network and pump pipe network costs were reduced by 13% and 32%, respectively. Jose et al. [5] used the pressure drop of each cooler as an optimization variable and established a mathematical model. Three examples were used to verify the feasibility of the method for finding the optimal configuration of the circulating water system. Müller et al. [6] used a mixed-integer nonlinear programming (MINLP) model to address the complexity of pump system design, with the aim of minimizing life cycle costs. The results showed that life cycle cost savings of up to 21% can be achieved while increasing the net efficiency from 47.2% to 57.8%.
With the development of artificial intelligence, increasing attention has been paid to the study of circulating water systems using machine learning methods. Barigozzi et al. [7] developed a MATLAB algorithm to optimize the net electric power as a function of the way in which the condensation is operated. The optimization of the thermal cycle performance was achieved by fixing the flow rate, temperature and pressure of the steam entering the high-pressure turbine. Liang et al. [8] proposed a MINLP model that considers the fouling in the pipeline, dynamic concentration cycle, and variable frequency drive to optimize the synergy among the heat transfer, pressure drop and fouling. By optimizing the concentration cycle of the circulating water system, water saving and scaling control can be achieved with significant energy/water saving effects. Song et al. [9] applied a Back-Propagation (BP) neural network to 638 sets of field test data from 36 different natural draft counterflow wet cooling towers (NDWCTs) in a power plant and developed a three-layer BP neural network model with an 8-14-2 structure. The results showed that the model has good prediction accuracy for the heat and mass transfer performance of NDWCTs at different scales. Liang et al. [10] proposed a Genetic Algorithm (GA) considering a variable frequency drive (VFD) to optimize the industrial circulating cooling water system and obtain accurate operating parameters. The interaction between the pump and the cooler networks was examined. The results showed that the model can determine the accurate operating parameters of the pump system and valves. Zhang et al. [11] coupled an artificial neural network (ANN), namely a BP neural network optimized by a GA (GA–BP), with the heat transfer models of the condenser and the air-cooled heat exchanger to obtain the air mass flow rate into the natural draft dry cooling tower (NDDCT). By adjusting the circulating cooling water mass flow to the optimum value, at least 16,515 kWh of circulating pump power consumption could be saved. Bueso et al. [12] proposed a machine learning method based on the Multilayer Perceptron to estimate the thermal performance of cooling towers used in the desalination process. These studies validated the accuracy and efficiency of machine learning methods in process predictions.
Table 1 summarizes the optimization methods used in these studies.
The above investigation shows that machine learning is an effective method for solving practical process problems, including solving the problem of cooling tower fan power prediction in process operations. In this paper, the optimization of the circulating water system in a Fluid Catalytic Cracking (FCC) unit at a refinery based on the industrial operating data is conducted, and a prediction system is designed and built. The cooler network is reformed to decrease the water consumption. Next, the fan power of the cooling tower is predicted using an optimized Gradient Boosting Regression (GBR) model. Based on the above studies, the saved water amount in the cooler network and the fan power in the cooling tower after transformation are calculated, and the total economic cost and gas emissions before and after optimization are discussed. The following expected goals are achieved:
- The optimized cooler network scheme is implemented to reduce water consumption.
- The industrial data are analyzed, cleaned and normalized to achieve the purpose of data visualization.
- The fan power of the cooling tower is predicted, which could be used to improve operational efficiency.
- The optimization system is constructed to realize the monitoring of the operation status.
2. Optimization of the Cooler Network
The main purpose of this section is to retrofit the coolers of the circulating cooling water system in an FCC unit at a refinery. These coolers are usually arranged in a parallel configuration. The retrofit aims to maximize the cooling water savings and optimize the cooler network structure.
2.1. Optimization Method
In the conventional cooler network, the circulating water delivered from the supply system to the devices generally enters the coolers in parallel, as plotted in Figure 1a. After heat exchange, the circulating water is collected and returned to the cooling tower. This parallel design has the disadvantages of large water consumption and low cooling tower efficiency.
In this study, the cooler network of the circulating water system is reformed from the parallel structure to the series scheme, as shown in Figure 1. In the series network structure, the circulating water flows through more than one cooler, which reduces its consumption.
In most cases, the parallel-to-series modification of the cooler network only requires some additional connecting pipes between coolers, which makes the work relatively simple and inexpensive. The design steps are as follows [2]:
- (1)
Determine the coolers to be modified;
- (2)
Determine the series sequence of the coolers;
- (3)
Calculate the fresh cooling water demand for each cooler;
- (4)
Calculate the cooling water demand for the entire system;
- (5)
Determine the network structure of the transformation.
2.2. Optimization Object
The investigated circulating water system is located in an industrial FCC unit, as shown in Figure 2a. The investigated unit contains a reaction–regeneration system and a fractionation system in which the hot streams, such as naphtha, light diesel and rich gas, enter the coolers in the water-using system, as plotted in Figure 2c.
Figure 2b illustrates the water supply system, which is composed of a 15,000 m³ natural draft circulating water tower V-100 and six circulating water pumps, with a total flow rate of 5300 m³/h. The cooling water in the water supply system enters the coolers in the water-using system through different pumps. In the water-using system in Figure 2c, the cooling water enters seven coolers and then enters a cooling tower T-101 to release the absorbed heat. Next, the water is returned to the water supply system.
The details of the coolers in the water-using system are listed in Table 2.
2.3. Optimization Scheme
The temperature–enthalpy diagram of the heat transfer process for the coolers is shown in Figure 3. The abscissa is the enthalpy value and the ordinate is the stream temperature. The upper solid line with higher temperatures is the process stream line to be cooled, and the lower solid line is a special circulating water line, which is completely parallel to the process stream line and has the same enthalpy. The temperature difference between the two lines is the pinch point temperature difference. This special circulating water line is called the limit cooling water line, which defines the limit value of the cooling water. The actual cooling water line should not be higher than this limit line to avoid the heat transfer temperature difference being smaller than the pinch point temperature difference [13].
In order to achieve the global optimization of the cooling water network, the water consumption of the entire system must be considered as a whole. Figure 4 shows the composite temperature–enthalpy diagram of the FCC circulating water system.
The red curve in the figure represents the process stream line, the green curve refers to the limit cooling water line, and the blue curve is the water supply line. The temperature difference at the pinch point is set to 30 °C. In order to minimize the amount of cooling water, the outlet temperature should be increased as much as possible, which means increasing the slope of the water supply line. When the slope of the water supply line increases to the point where the line touches the limit composite curve, the outlet temperature reaches a maximum, and the water consumption reaches a minimum. The pinch temperature at this point is determined to be 55 °C.
Table 2 shows that the limit inlet temperatures of the coolers E1302AB, E1319AB and E1212 are higher than the pinch point temperature, which means the cooling water here could be from the outlet of other coolers rather than directly from the cooling tower. The limit outlet temperatures of the coolers E1311A-H, E1314, E1203AD and E1218 are lower than the pinch point temperature, indicating that the water here could be utilized before going back to the cooling tower. Therefore, the influence of the pinch point on the modification of the cooler network should be taken into account to obtain the optimized scheme.
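The pinch rule described above can be sketched in a few lines of Python. This is only an illustration of the classification logic: the limit temperatures below are hypothetical placeholders (the real values are those of Table 2), the cooler `EX` is invented to show the third case, and the labels `sink`, `source` and `across` are names introduced here for convenience.

```python
PINCH_T = 55.0  # pinch temperature (degC) reported in the text

# name: (limit inlet T, limit outlet T) in degC.
# HYPOTHETICAL values for illustration only; see Table 2 for the real data.
coolers = {
    "E1302AB": (60.0, 75.0),
    "E1314": (30.0, 45.0),
    "EX": (40.0, 65.0),  # invented cooler that spans the pinch
}

def classify(limit_in, limit_out, pinch=PINCH_T):
    """Return the reuse role of a cooler relative to the pinch temperature."""
    if limit_in > pinch:
        return "sink"    # may take water leaving another cooler
    if limit_out < pinch:
        return "source"  # its outlet water may be reused before the tower
    return "across"      # spans the pinch; needs a case-by-case check

roles = {name: classify(*temps) for name, temps in coolers.items()}
```

With these placeholder numbers, `E1302AB` is labelled a sink and `E1314` a source, which mirrors how the two groups of coolers are described in the paragraph above.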
Based on pinch technology and considering the principles of short pipeline modification and smooth flow, the parallel system in Figure 2c is optimized to the series structure. Three modifications were conducted, as shown in Figure 5. To meet the heat exchange requirements, the minimum heat transfer temperature difference is set to 5 °C.
The three modifications to the existing network are as follows: (1) The circulating water streams from coolers E1212 and E1218 are combined with fresh circulating water to supply E1203A-D. (2) The circulating water from cooler E1314 is mixed with fresh water and enters E1311E-H. (3) The circulating water from cooler E1319AB and fresh water flow into E1311A-D together. The optimization scheme decreases the circulating water volume of coolers E1203A-D, E1311E-H and E1311A-D and reduces the water load of the entire system.
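When reused water is combined with fresh water, the inlet temperature of the downstream cooler follows from a simple energy balance. The sketch below assumes a constant specific heat and no heat loss during mixing; the function name and all numbers are illustrative and are not taken from the plant data.

```python
def mixed_temperature(streams):
    """Flow-weighted mean temperature of mixed water streams.

    streams: sequence of (mass_flow_kg_s, temperature_degC) pairs.
    Assumes constant specific heat and adiabatic mixing.
    """
    streams = list(streams)
    total_flow = sum(m for m, _ in streams)
    if total_flow <= 0:
        raise ValueError("total mass flow must be positive")
    return sum(m * t for m, t in streams) / total_flow

# Illustrative only: 20 kg/s of reused water at 40 degC mixed with
# 60 kg/s of fresh water at 28 degC.
t_mix = mixed_temperature([(20.0, 40.0), (60.0, 28.0)])
```

A check of this kind can be used to confirm that the mixed inlet temperature stays below the limit inlet temperature of the receiving cooler.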
In order to investigate the transformation effect of the optimization system, the process simulation of the modified scheme is carried out using ASPEN HYSYS, as shown in Figure 6.
The water flow rates in the series network obtained by simulation are shown in Table 3 and compared with the values calculated using Formula (1).
$$m = \frac{Q}{c_p \, \Delta T} \tag{1}$$
where $m$ is the mass flow rate of the cooling water, $Q$ is the heat load, $c_p$ stands for the specific heat capacity of the water, and $\Delta T$ represents the temperature difference between the inlet and outlet of the cooling water.
The results show that the error between the simulated and the calculated values is approximately 3%, indicating the reasonability of the simulation. The difference may be due to the fact that the theoretical calculation usually assumes the steady state and unchanged properties of the fluid, while the process simulation can dynamically consider the effect of the operating conditions on the fluid properties.
The simulated total water consumption of the original and optimized systems is 608.4 kg/s and 541.11 kg/s, respectively. The water reduction is 11%, which demonstrates the feasibility of the optimization.
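The arithmetic behind these figures can be checked with a short script. Formula (1) is read here as m = Q / (c_p ΔT), following the symbol definitions in the text; the specific heat of 4.18 kJ/(kg·K) and the example heat load are assumptions, while the two totals are the simulated values quoted above.

```python
CP_WATER = 4.18  # kJ/(kg*K); assumed constant specific heat of water

def water_flow(heat_load_kw, delta_t_k, cp=CP_WATER):
    """Cooling-water mass flow (kg/s) from m = Q / (cp * dT)."""
    if delta_t_k <= 0:
        raise ValueError("temperature rise must be positive")
    return heat_load_kw / (cp * delta_t_k)

def relative_reduction(before, after):
    """Fractional reduction when going from `before` to `after`."""
    return (before - after) / before

# Hypothetical cooler: 4180 kW removed with a 10 K water temperature rise.
flow = water_flow(4180.0, 10.0)  # kg/s

# Simulated totals reported in the text (kg/s).
saving = relative_reduction(608.4, 541.11)
optimized_t_per_h = 541.11 * 3.6  # kg/s -> t/h
```

The reduction evaluates to about 11.1%, and the optimized flow converts to roughly 1948 t/h.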
3. Prediction of the Cooling Tower Fan Power
In Figure 2c, the return water from the cooler network is cooled in the cooling tower after heat exchange. Real-time prediction and control of the cooling tower fan power are important, as the operation of the tower has a great impact on the energy conservation of the entire system. In this section, the fan power of the cooling tower is predicted using a machine learning approach.
3.1. Algorithm Introduction
Six machine learning algorithms are employed to obtain the optimal model, as shown in the following:
- (1)
Bayesian Regression (BR) Model [14]: This model is a probabilistic framework based on Bayes’ theorem and is intended for prediction and classification tasks. It integrates existing prior knowledge and data sets, and it constantly adjusts the probability model by applying Bayes’ theorem to predict future events or unknown variables.
- (2)
Linear Regression (LR) Model [15]: This is a widely used technique in machine learning that aims to model the linear relationship between a continuous target variable (also known as the response variable) and one or more independent variables.
- (3)
Elastic Network Regression (EN) Model [16]: The EN model performs well when dealing with highly correlated problems, especially when the number of independent variables is large and the independent variables are strongly correlated.
- (4)
Support Vector Regression (SVR) Model [17]: Unlike traditional regression methods, this model’s core idea is to find a function that can be as close as possible to the data while ensuring that the deviation between the predicted value and the true value remains within acceptable limits.
- (5)
Gradient Boosting Regression (GBR) Model [18]: The basic idea of the algorithm is to construct the prediction model through an iterative process, fitting the negative gradient of the loss function in each step to reduce the loss function value of the current model.
- (6)
Random Forest Regression (RF) Model [19]: The algorithm performs the regression task by integrating multiple decision trees. Its core idea is to aggregate the prediction results of multiple decision trees to determine the final output rather than relying solely on the prediction of a single decision tree.
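Four evaluation indicators are used to evaluate the accuracy of the models, i.e., the Explained Variance (EV), Mean Absolute Error (MAE), Mean Square Error (MSE) and R-squared (R²). The evaluation indexes are calculated by Formulas (2)–(5):
$$\mathrm{EV} = 1 - \frac{\mathrm{Var}(y - \hat{y})}{\mathrm{Var}(y)} \tag{2}$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \tag{3}$$
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 \tag{4}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2} \tag{5}$$
where $y_i$ is the true value, $\hat{y}_i$ is the predicted value, $\bar{y}$ is the mean of the true values, and $n$ stands for the amount of data.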
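As a minimal sketch, the four indicators can be computed with the Python standard library alone. The definitions used here are the standard ones (EV and R² both normalized by the population variance of the true values); the function name and the sample numbers are illustrative.

```python
from statistics import pvariance

def regression_metrics(y_true, y_pred):
    """Return EV, MAE, MSE and R2 for two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    n = len(y_true)
    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    var_y = pvariance(y_true)  # population variance of the true values
    ev = 1.0 - pvariance(residuals) / var_y
    r2 = 1.0 - mse / var_y
    return {"EV": ev, "MAE": mae, "MSE": mse, "R2": r2}

# Illustrative values, not plant data.
scores = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.5, 5.5, 7.0, 9.5])
```

EV and R² differ only when the residuals have a non-zero mean, which is why a biased model can show a higher EV than R².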
3.2. Data Processing
Data processing is a crucial pre-processing procedure used to obtain high-quality data sets. It includes two processes, i.e., data cleaning and data normalization.
The main purpose of data cleaning is to eliminate the outliers in the data and avoid the influence of wrong data on the accuracy of the prediction. The operating data of six variables are collected from the factory, including the flow rate of the circulating water, the temperatures of the water entering and leaving the cooling tower, the daily average ambient temperature, the daily relative humidity of the air and the fan power of the cooling tower. Each variable has 284 data points, as shown in Table 4.
A box plot analysis is used to visualize the data [20], providing the median, quartile and outlier information of the data. The definition of an outlier is shown in Formula (6). The box plots of the selected parameters are plotted in Figure 7. It is clear that there is an outlier in the data for the temperature of the water leaving the cooling tower. Thus, the corresponding record is deleted, leaving 283 records.
$$x < Q_1 - 1.5\,\mathrm{IQR} \quad \text{or} \quad x > Q_3 + 1.5\,\mathrm{IQR} \tag{6}$$
where $x$ represents the outlier, $Q_3$ is the value at the 75% position after the data are sorted from small to large, $Q_1$ is the value at the 25% position after the data are sorted from small to large, and $\mathrm{IQR}$ stands for the difference between $Q_3$ and $Q_1$.
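The box-plot rule can be sketched as follows. The multiplier of 1.5 is the conventional box-plot whisker factor and is an assumption here, as is the linear interpolation used for the quartiles; the sample temperatures are made up for illustration.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an ascending list (0 <= q <= 1)."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    ordered = sorted(values)
    q1, q3 = quantile(ordered, 0.25), quantile(ordered, 0.75)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Illustrative outlet-water temperatures (degC), not plant data.
sample = [27.1, 27.4, 27.8, 28.0, 28.3, 28.5, 28.9, 41.0]
flagged = iqr_outliers(sample)
```

In this toy sample only the 41.0 °C reading falls outside the whiskers, so a single record would be removed.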
As the collected data have different sizes and data distribution ranges, direct regression training may exaggerate some variables and affect the accuracy of the output target. In order to reduce the subsequent training error, Equation (7) is usually used to normalize the data.
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{7}$$
where $x$ and $x'$ represent the values before and after data normalization. $x_{\max}$ and $x_{\min}$ are the maximum and minimum values in the sample data, respectively.
After the data normalization, 70% of the retained 283 sets of data are used as the training data, and the other 30% are reserved as the test data. The fan power of the cooling tower is set as the target function. The other five parameters are determined as the input variables to carry out the subsequent prediction.
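A minimal sketch of the normalization and the 70/30 split is given below. It applies min-max scaling before splitting, in the order described in the text; the helper names, the fixed random seed and the integer stand-ins for the 283 records are assumptions made for illustration.

```python
import random

def min_max_scale(column):
    """Scale a list of numbers to [0, 1] using its own minimum and maximum."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant column: nothing to scale
    return [(x - lo) / (hi - lo) for x in column]

def split_rows(rows, train_fraction=0.7, seed=0):
    """Shuffle rows reproducibly and split them into train and test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

scaled = min_max_scale([20.0, 25.0, 30.0])
train, test = split_rows(range(283))  # integers stand in for the 283 records
```

With 283 records, this split yields 198 training rows and 85 test rows.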
3.3. Algorithm Selection
The six machine learning models introduced in Section 3.1 are employed to predict the cooling tower fan power. Figure 8 plots the comparisons between the predicted values and the industrial values. It is shown that the fitting effect of the EN model is the worst. The coincidence degree between the industrial values and the predicted values of the GBR model and the RF model is higher than that of the other algorithms.
The evaluation indexes of the six prediction algorithms are shown in Table 5. Compared to the other models, the GBR model demonstrates the highest EV and R², the lowest MSE and a relatively low MAE. Thus, it can be concluded that the GBR model is superior to the other models, and it is selected as the optimal model for the subsequent prediction.
3.4. Model Prediction
3.4.1. Prediction Method
The GBR model is an ensemble learning method that continuously improves the performance of the model by iteratively training a series of weak learners [21]. Its basic idea is to construct the next model by fitting the difference between the actual value and the predicted value to gradually reduce the prediction error of the model on the training data. The prediction steps are shown in Figure 9.
- (1)
Select the data set, the appropriate target variables and the characteristics;
- (2)
Initialize the model. The model function is shown in Formula (8), $F_0(x) = \arg\min_{\hat{y}} \sum_{i=1}^{n} L(y_i, \hat{y})$, where $y_i$ is the actual value of the target variable corresponding to each sample in the training set, $\hat{y}$ is the predicted value of the model, and $L$ represents the loss function, which is a standard for measuring the difference between the predicted value of the model and the true value;
- (3)
Calculate the residual error, as shown in Formula (9), $r_i = -\left[ \partial L(y_i, F(x_i)) / \partial F(x_i) \right]_{F = F_{m-1}}$, where $F_{m-1}(x_i)$ is the current predicted value; for the loss $L = \tfrac{1}{2}(y_i - F(x_i))^2$, this reduces to $r_i = y_i - F_{m-1}(x_i)$;
- (4)
Take the residual error calculated in the previous step as the target variable, and fit a new base learner;
- (5)
Multiply the output of the new base learner by the learning rate and then add it to the current model;
- (6)
Stop the training if the preset maximum number of iterations is reached or the model error is converged. Otherwise, return to (3) to continue the iteration;
- (7)
Output the final trained model when the stop condition is met.
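The steps above can be sketched from scratch in plain Python. This is a teaching sketch under stated assumptions, not the model used in the study: the loss is taken to be the squared error (so the fitted targets are the plain residuals), the base learner is a one-split regression tree (a "stump"), and all function names and the toy data are invented for illustration.

```python
def fit_stump(X, residuals):
    """Fit a one-split regression tree to the residuals by least squares."""
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            lmean, rmean = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, threshold, lmean, rmean)
    if best is None:  # no valid split: predict the mean residual everywhere
        mean = sum(residuals) / len(residuals)
        return (0, float("inf"), mean, mean)
    return best[1:]

def predict_stump(stump, row):
    j, threshold, lmean, rmean = stump
    return lmean if row[j] <= threshold else rmean

def fit_gbr(X, y, n_estimators=50, learning_rate=0.1, tol=1e-8):
    """Gradient boosting with squared-error loss and stump base learners."""
    f0 = sum(y) / len(y)                  # step (2): constant initial model
    preds = [f0] * len(y)
    stumps = []
    for _ in range(n_estimators):         # step (6): iteration limit
        residuals = [yi - pi for yi, pi in zip(y, preds)]   # step (3)
        if sum(r * r for r in residuals) / len(y) < tol:    # step (6): converged
            break
        stump = fit_stump(X, residuals)                     # step (4)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)  # step (5)
                 for p, row in zip(preds, X)]
    return f0, learning_rate, stumps      # step (7): trained model

def predict_gbr(model, row):
    f0, learning_rate, stumps = model
    return f0 + learning_rate * sum(predict_stump(s, row) for s in stumps)

# Toy data (not plant data): y = 2 * x.
X = [[float(i)] for i in range(10)]
y = [2.0 * i for i in range(10)]
model = fit_gbr(X, y, n_estimators=200, learning_rate=0.1)
mse = sum((predict_gbr(model, r) - t) ** 2 for r, t in zip(X, y)) / len(y)
```

The learning rate trades convergence speed against overfitting: a smaller value needs more estimators to reach the same training error, which is why the two are tuned together.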
3.4.2. Hyperparameter Optimization
A grid search is a systematic hyperparameter tuning method that exhaustively searches a predefined hyperparameter space to find the best hyperparameter combination. The main parameters affecting the GBR model are the number of estimators, max depth, min samples split, min samples leaf and learning rate. Since the optimal values of these parameters are not known in advance, we set an approximate range for each parameter, as shown in Table 6.
Based on these settings, the grid search is applied to the GBR model, and the number of cross-validation folds is set to 10 to evaluate the different parameter combinations and obtain multiple groups of test scores. Table 7 shows part of the cross-validation results. At the same time, the MSE loss function curve is visualized to observe the changes in the scores of the hyperparameter combinations, as shown in Figure 10.
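The mechanics of an exhaustive grid search with 10-fold cross-validation can be sketched without any external library. To keep the example short, a one-dimensional ridge regression stands in for the GBR model; the stand-in model, its hyperparameters (`alpha`, `fit_intercept`), the grid values and the toy data are all assumptions and do not correspond to Table 6.

```python
from itertools import product

def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs; sample i is tested in fold i % k."""
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        yield train, test

def grid_search(fit, predict, grid, X, y, k=10):
    """Score every parameter combination by its mean k-fold MSE."""
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        fold_mse = []
        for train, test in k_fold_indices(len(X), k):
            model = fit([X[i] for i in train], [y[i] for i in train], **params)
            errors = [(predict(model, X[i]) - y[i]) ** 2 for i in test]
            fold_mse.append(sum(errors) / len(errors))
        results.append((sum(fold_mse) / k, params))
    return min(results, key=lambda item: item[0]), results

# Stand-in model (NOT the GBR model): 1-D ridge regression.
def fit_ridge(xs, ys, alpha, fit_intercept):
    x0 = sum(xs) / len(xs) if fit_intercept else 0.0
    y0 = sum(ys) / len(ys) if fit_intercept else 0.0
    w = sum((x - x0) * (t - y0) for x, t in zip(xs, ys)) / (
        sum((x - x0) ** 2 for x in xs) + alpha)
    return w, y0 - w * x0

def predict_ridge(model, x):
    w, b = model
    return w * x + b

xs = [float(i) for i in range(30)]
ys = [3.0 * x + 2.0 for x in xs]  # toy data, not plant data
grid = {"alpha": [0.0, 1.0, 10.0], "fit_intercept": [True, False]}
(best_score, best_params), all_results = grid_search(
    fit_ridge, predict_ridge, grid, xs, ys)
```

The cost grows with the product of the grid sizes times the number of folds (here 3 × 2 × 10 model fits), which is the main reason for restricting each parameter to an approximate range first.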
During model training, the optimal values of the relevant model parameters are found using a single variable method, as shown in Table 8. The results show that the MSE of the GBR model is reduced from 0.005814 to 0.004900, and R² is increased from 0.835860 to 0.897068, indicating that the accuracy of the model is further improved.
3.4.3. Prediction Results
The fan powers at different conditions are predicted by the selected model after the hyperparameter optimization and then compared with the test data, as shown in Figure 11. It can be seen that the predicted values closely match the actual data, indicating the accuracy of the prediction.
The predicted total fan power is 65,138.54 kW before the optimization and 59,760.50 kW after the optimization. The fan power reduction is approximately 8%.
The system maintenance interface is developed using this model. In this interface, to evaluate the influence of the parameters on the stable operation of the cooling tower, the input variables affecting the fan power are ranked, as shown in Figure 12. It can be seen that the daily average ambient temperature has the greatest impact on the fan power, while the circulating water temperature entering the cooling tower has the smallest impact. By ranking the input variables, we can identify the factors with a greater impact on the fan power and prioritize these factors to maximize the power output, which makes it possible to develop a more effective preventive maintenance program.
3.5. Comparison with Literature
In this section, we evaluate the GBR model by comparing its performance with findings in the literature on cooling tower prediction. Our goal is to place the results in the context of existing work.
Many researchers use the Root Mean Squared Error (RMSE) [23,24] to evaluate the prediction accuracy, and the prediction results using various methods in different studies are shown in Table 9.
It can be seen from the table that the GBR model has good prediction accuracy in the prediction of different performances of cooling towers, which is of great significance in solving practical problems and promoting technological progress.
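For reference, the RMSE is the square root of the MSE used earlier in this paper. The short derivation below uses the tuned MSE of 0.004900 reported in Section 3.4.2; it assumes that this MSE was computed on the normalized fan power, so the resulting RMSE is in the same normalized units.

```latex
\mathrm{RMSE} = \sqrt{\mathrm{MSE}}
             = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2},
\qquad
\sqrt{0.004900} = 0.0700
```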
5. Conclusions
In this paper, the optimization and analysis of the circulating water system of an industrial FCC unit are carried out. The parallel-to-series modification of the cooler network is conducted and evaluated by process simulation. The fan power of the cooling tower is predicted based on machine learning. Next, the economic and environmental benefits are discussed. It is revealed that the process of machine learning can effectively predict and optimize the industrial process, and the parallel-to-series modification can not only decrease the water consumption but also reduce the electricity usage. The details are as follows:
- (1)
Three series modifications of the cooler network are made: E1212 and E1218 are connected to E1203A-D in series, E1314 is directly connected to E1311E-H, and E1319AB is connected to E1311A-D. The new network consumes 1948 t/h of water, a reduction of 11% compared to the original structure.
- (2)
The fan power of the cooling tower is chosen as the target function, and the water flowrate, the temperature of the water going in and out of the cooling tower, the daily average ambient temperature and the daily relative humidity of the air are used as the input variables. A total of 284 industrial data sets with the above parameters are sampled to train six machine learning algorithms. The GBR model, which has the best fitting effect, is determined and optimized as the prediction model.
- (3)
The optimized GBR model can accurately predict the fan power of the cooling tower, and the calculated power reduction from the parallel-to-series retrofit is approximately 8%. Meanwhile, the machine learning method indicates that the daily average ambient temperature is the input variable that has the greatest influence on the fan power.
- (4)
Considering the economic and environmental benefits, the economic cost is reduced by 8.65% due to the decrease in water and electricity consumption, and the gas emissions decrease by 2142.06 kg/h.
In most cases, the cooling tower does not need to operate at full load, so through the precise control of the fan power, that is, through variable frequency regulation, unnecessary energy consumption can be greatly reduced. The prediction system also enhances the system’s reliability and stability. By knowing the possible load of the fan power in advance, it can avoid the impact of sudden load changes on the system and reduce the risk of failure and unexpected shutdown, which is particularly important for industrial cooling systems that need continuous and stable operation.