3.1. Selection of Model Parameters
The proposed MS-1D-CNN model consists of three parallel sub-networks, each designed as a 1D-CNN with a distinct convolution kernel size. This multi-scale structure enhances feature extraction by capturing patterns at different scales. To improve generalization and robustness, an ensemble learning approach is employed: the ensemble framework sets the parameters of the MS-1D-CNN using the optimal settings derived from the individual 1D-CNN models. Model parameters are initialized with PyTorch's default initialization scheme to ensure stable training. Convolutions use a stride of 1, with padding adjusted to the kernel size so that the feature length is preserved. Pooling is performed with a kernel size of 2 and a stride of 2, reducing feature dimensionality while preserving essential information. Initially, the fully connected part consists of two layers with 128 and 32 neurons, respectively, allowing effective feature transformation before the output. Adam is selected as the optimizer, with a learning rate of 0.0001 to ensure smooth convergence, and training initially uses a batch size of 20 over 200 epochs.
Several parameters significantly influence model performance: the convolutional kernel size, the number of convolutional kernels, the number of convolutional layers, the number of neurons in the fully connected layers, and the number of fully connected layers. In this study, these parameters are systematically optimized through experimental analysis to achieve the best performance.
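For concreteness, the baseline configuration described above can be sketched in PyTorch (the framework named for initialization). The single-branch wrapper, the ReLU activations, and the single-output regression head are illustrative assumptions; only the stride, padding, pooling, fully connected sizes, optimizer, and learning rate come from this section.
```python
import torch
import torch.nn as nn

class Baseline1DCNN(nn.Module):
    """Illustrative single-branch 1D-CNN with the baseline settings above:
    stride 1, shape-preserving padding, 2x2 pooling, and fully connected
    layers of 128 and 32 neurons."""

    def __init__(self, kernel_size: int = 3, n_kernels: int = 32, input_len: int = 2048):
        super().__init__()
        pad = (kernel_size - 1) // 2  # keeps the feature length unchanged at stride 1
        self.features = nn.Sequential(
            nn.Conv1d(1, n_kernels, kernel_size, stride=1, padding=pad),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=2, stride=2),  # halves the feature length
        )
        flat = n_kernels * (input_len // 2)
        self.regressor = nn.Sequential(
            nn.Linear(flat, 128),
            nn.ReLU(),
            nn.Linear(128, 32),
            nn.ReLU(),
            nn.Linear(32, 1),  # single predicted COD value
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, 2048) UV-Vis spectrum
        z = self.features(x)
        return self.regressor(z.flatten(1))

model = Baseline1DCNN()  # weights follow PyTorch's default initialization
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```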
(1) Convolutional kernel size
The convolutional kernel size plays a crucial role in feature extraction within a 1D-CNN. Small kernels capture fine-grained local features, which are essential for detecting subtle variations in spectral data, whereas large kernels cover a broader spectral region, allowing the model to capture long-range dependencies. Selecting an appropriate kernel size is therefore critical for model performance. In this study, the dataset consists of UV-Vis spectra with a fixed length of 2048. To analyze the impact of kernel size comprehensively, kernel sizes of 1, 3, 5, 7, 9, 11, 21, 51, and 101 were tested. Throughout these experiments, the other structural parameters were held constant to ensure a fair comparison: the number of convolutional layers was set to 3, and each layer contained 32 convolutional kernels. The objective was to investigate how the convolutional kernel size influences COD prediction performance in the 1D-CNN. A detailed summary of the findings is presented in Table 2.
As shown in Table 2, the convolution kernel size has a significant impact on COD prediction performance in the 1D-CNN model, and an inappropriate kernel size leads to suboptimal results, so selecting the most effective size is essential. To evaluate the influence of different kernel sizes, the RMSEC was used as the performance metric. Ranked from lowest to highest RMSEC, the kernel sizes follow the order 5, 3, 1, 7, 9, 101, 11, 51, and 21. The smallest three kernels (5, 3, and 1) yielded the lowest RMSEC values, indicating superior predictive performance. Based on these findings, the MS-1D-CNN model assigns the top three kernel sizes, 5, 3, and 1, to its three parallel 1D-CNN sub-networks, ensuring that the model captures both fine local patterns and broader spectral features.
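Since the selected kernel sizes are odd, symmetric padding of (k − 1)/2 preserves the original length of 2048 in every branch; a minimal check, assuming PyTorch:
```python
import torch
import torch.nn as nn

spectrum = torch.randn(1, 1, 2048)  # one UV-Vis spectrum of fixed length 2048
for k in (1, 3, 5):                 # kernel sizes assigned to the three sub-networks
    conv = nn.Conv1d(1, 32, kernel_size=k, stride=1, padding=(k - 1) // 2)
    assert conv(spectrum).shape[-1] == 2048  # feature length preserved in every branch
```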
(2) Number of convolutional kernels
The number of convolution kernels plays a crucial role in feature extraction and representation in a 1D-CNN model. Each convolution kernel acts as a filter, detecting specific patterns in the input data. Increasing the number of kernels increases the number of output channels in the feature map, allowing the model to capture a more diverse set of spectral features and to better distinguish subtle variations in COD-related spectral patterns. However, there is a trade-off: more kernels also mean more parameters and higher computational complexity, and an excessive number may lead to overfitting, where the model becomes too specialized to the training data and loses generalization capability. Finding the optimal balance is therefore essential for improving predictive performance while maintaining computational efficiency. After determining the optimal convolution kernel size, this study investigated the influence of the number of convolution kernels on COD prediction accuracy. To ensure a controlled evaluation, the kernel size was fixed at 3 for all experiments in this analysis. The experimental findings are summarized in Table 3.
As shown in Table 3, the number of convolution kernels significantly influences COD prediction performance in the 1D-CNN model, and an appropriate choice balances feature richness against computational cost. Using the RMSEC as the performance metric, the kernel counts rank, from lowest to highest RMSEC, in the order 32, 16, 64, 8, 128, 512, and 256. The results indicate that 32 convolution kernels achieved the lowest RMSEC and thus the best predictive performance. The final kernel counts were then chosen jointly with the number of convolutional layers in the MS-1D-CNN model, maintaining an optimal balance between accuracy and computational efficiency.
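The same one-factor-at-a-time sweep is reused below for the convolutional layers, the fully connected neurons, and the fully connected depth, so a single sketch suffices. Everything here is hypothetical scaffolding: train_and_eval stands in for the full training and RMSEC evaluation pipeline, which the text does not spell out.
```python
import random

def train_and_eval(kernel_size: int, n_kernels: int) -> float:
    """Hypothetical stand-in for the full train-then-evaluate pipeline; a real
    implementation would fit a 1D-CNN and return its calibration RMSEC."""
    random.seed(n_kernels)              # deterministic placeholder score only
    return random.uniform(3.0, 8.0)

# One-factor-at-a-time sweep: kernel size fixed at 3, one model per kernel count.
scores = {n: train_and_eval(kernel_size=3, n_kernels=n)
          for n in (8, 16, 32, 64, 128, 256, 512)}
ranking = sorted(scores, key=scores.get)  # Table 3 ranks 32 kernels first
```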
(3) Number of convolutional layers
The number of convolutional layers in a 1D-CNN plays a pivotal role in the model's ability to extract hierarchical features. More layers increase the network's capacity to capture complex patterns from the input data, improving feature extraction, but they also increase the parameter count and computational complexity. Excessively deep networks may overfit, memorizing the training data rather than generalizing to unseen samples, so a careful balance between depth and model complexity is crucial for optimal performance. After determining the optimal kernel size and number of kernels, this study investigated the influence of the number of convolutional layers on COD prediction accuracy. To ensure a controlled evaluation, the kernel size was fixed at 3 and the number of kernels at 32 for all experiments in this analysis. The results are presented in Table 4.
As shown in Table 4, the number of convolutional layers significantly influences COD prediction performance in the 1D-CNN model, and the trade-off between depth and computational complexity must be considered, since too many layers can lead to overfitting and hinder generalization. Using the RMSEC as the performance metric, the layer counts rank, from lowest to highest RMSEC, in the order 3, 2, 9, 4, 10, 7, 8, 5, 6, and 1. The model with 3 convolutional layers achieved the best performance, followed by the networks with 2 and 9 layers. Based on this ranking, the MS-1D-CNN model uses 3 convolutional layers in each of its 1D-CNN sub-networks, striking an effective balance between feature extraction capacity and computational efficiency.
(4) Number of neurons in the fully connected layers
The fully connected layers in a 1D-CNN integrate the features learned by the convolutional layers and produce the final output. The number of neurons in these layers directly influences the model's ability to capture complex relationships between the extracted features: more neurons increase the capacity to represent intricate patterns, but also the parameter count and, with it, the risk of overfitting. The number of neurons must therefore be selected carefully to achieve optimal performance without compromising generalization. After determining the optimal kernel size, number of kernels, and number of convolutional layers, this study investigated the influence of the number of neurons in the fully connected layers on COD prediction accuracy. To ensure a controlled evaluation, the kernel size was fixed at 3, the number of kernels at 32, and the number of convolutional layers at 3 for all experiments in this analysis. The results are presented in Table 5.
As shown in Table 5, the number of neurons in the fully connected layers plays a pivotal role in COD prediction performance, and the optimal count must balance sufficient capacity against the risk of overfitting. Using the RMSEC as the performance metric, the neuron counts rank, from lowest to highest RMSEC, in the order 256, 32, 64, 512, 16, 4096, 128, and 1024. A layer of 256 neurons yields the best performance, followed by 32 and 64 neurons. The final neuron counts were selected in conjunction with the number of fully connected layers.
(5) Number of fully connected layers
The number of fully connected layers in a 1D-CNN significantly affects the model's ability to capture complex, nonlinear relationships between the extracted features, and must be balanced against the risk of overfitting. After determining the optimal kernel size, number of kernels, number of convolutional layers, and number of neurons in the fully connected layers, this study investigated the influence of the number of fully connected layers on COD prediction accuracy. To ensure a controlled evaluation, the kernel size was fixed at 3, the number of kernels at 32, the number of convolutional layers at 3, and the number of neurons in the fully connected layers at 256 for all experiments in this analysis. The results are presented in Table 6.
As shown in Table 6, the number of fully connected layers plays a crucial role in COD prediction performance, so an optimal choice is essential for balancing predictive accuracy and computational efficiency. Using the RMSEC as the performance metric, the layer counts rank, from lowest to highest RMSEC, in the order 2, 1, 3, 5, and 4. Based on this ranking, the number of fully connected layers in the MS-1D-CNN model is set to 2. This configuration balances model complexity and performance, minimizing the risk of overfitting while maintaining high predictive accuracy.
The final parameters of the MS-1D-CNN model, determined through the experimental analysis above, are summarized in Table 7. The MS-1D-CNN consists of three parallel 1D-CNN sub-networks with convolution kernel sizes of 1, 3, and 5; this multi-scale design extracts both fine-grained and broader spectral features, improving overall predictive accuracy. Each sub-network contains 3 convolutional layers with 16, 32, and 64 convolution kernels, respectively, allowing the network to progressively capture more complex patterns while balancing model capacity and computational cost. At the final stage, two fully connected layers integrate the extracted features and generate the output: the first contains 256 neurons for robust feature fusion, and the second contains 32 neurons, refining the learned representation before the final prediction. This architecture delivers both high accuracy and computational efficiency, making the MS-1D-CNN well suited to COD prediction from UV-Vis spectroscopy.
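A minimal PyTorch sketch of this final architecture is given below. The ReLU activations, the pooling after every convolutional layer, the concatenation of branch features, and the single regression output are assumptions where the text is silent; the kernel sizes (1, 3, 5), kernel counts (16, 32, 64), and fully connected sizes (256, 32) are the ones selected above.
```python
import torch
import torch.nn as nn

def branch(kernel_size: int, input_channels: int = 1) -> nn.Sequential:
    """One 1D-CNN sub-network: three conv layers with 16, 32, and 64 kernels,
    stride 1, shape-preserving padding, each followed by 2x2 max pooling."""
    layers, channels = [], input_channels
    for n_kernels in (16, 32, 64):
        layers += [
            nn.Conv1d(channels, n_kernels, kernel_size,
                      stride=1, padding=(kernel_size - 1) // 2),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=2, stride=2),
        ]
        channels = n_kernels
    return nn.Sequential(*layers)

class MS1DCNN(nn.Module):
    def __init__(self, input_len: int = 2048):
        super().__init__()
        # Three parallel sub-networks with kernel sizes 1, 3, and 5.
        self.branches = nn.ModuleList([branch(k) for k in (1, 3, 5)])
        pooled_len = input_len // 2 ** 3          # three pooling stages: 2048 -> 256
        flat = 3 * 64 * pooled_len                # concatenated branch features
        self.head = nn.Sequential(
            nn.Linear(flat, 256), nn.ReLU(),      # first fully connected layer
            nn.Linear(256, 32), nn.ReLU(),        # second fully connected layer
            nn.Linear(32, 1),                     # predicted COD value
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, 2048) UV-Vis spectrum
        feats = [b(x).flatten(1) for b in self.branches]
        return self.head(torch.cat(feats, dim=1))

out = MS1DCNN()(torch.randn(4, 1, 2048))  # -> shape (4, 1)
```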
3.2. Training Procedure
Adjusting hyperparameters is a critical step in optimizing deep learning models, as their selection directly affects performance, training stability, and convergence speed. In this study, two key hyperparameters were analyzed: the learning rate and the batch size. The learning rate controls the step size for updating model parameters during backpropagation: a rate that is too high can cause unstable training and divergence, while one that is too low may result in slow convergence or trapping in local minima. The batch size defines the number of samples used to compute gradients in a single training iteration: a small batch allows more frequent updates, which can improve generalization but increases gradient noise, whereas a large batch stabilizes updates and can speed up training on powerful hardware, though it may require careful tuning to avoid convergence issues. The influence of the learning rate and batch size on training effectiveness and speed was therefore analyzed systematically by testing different combinations of the two. The experimental results are presented in Table 8.
The selection of learning rate and batch size has a significant influence on COD prediction performance in deep learning models. A well-chosen learning rate ensures stable and efficient training, while an appropriate batch size balances computational efficiency and model generalization. To systematically evaluate the influence of these hyperparameters, the RMSEC was used as the primary performance metric. The results reveal that the optimal combination is a learning rate of 0.0001 and a batch size of 32. This configuration achieves the best balance between training stability, convergence speed, and prediction accuracy.
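The joint evaluation can be pictured as a small grid search over the two hyperparameters. The candidate values below are assumptions for illustration (the actual grid is the one reported in Table 8), and train_with is a hypothetical placeholder for a full training run:
```python
import random
from itertools import product

def train_with(lr: float, batch_size: int) -> float:
    """Hypothetical stand-in for a full training run; returns a placeholder RMSEC."""
    random.seed(hash((lr, batch_size)))    # deterministic placeholder score only
    return random.uniform(3.0, 8.0)

# Assumed candidate grids for illustration; the tested values appear in Table 8.
learning_rates = (0.01, 0.001, 0.0001, 0.00001)
batch_sizes = (8, 16, 32, 64)

scores = {(lr, bs): train_with(lr, bs)
          for lr, bs in product(learning_rates, batch_sizes)}
best_lr, best_bs = min(scores, key=scores.get)  # reported optimum: lr=0.0001, batch=32
```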
The Adam optimization algorithm was employed to optimize the network parameters during training; it is widely used for its efficiency in adjusting network weights to minimize the loss. To ensure that the best-performing model was available for the testing stage, the model was saved at the end of each epoch. As shown in Figure 6, the loss decreased sharply over the first 6 epochs and stabilized around the 75th epoch, after which it remained largely unchanged, indicating that the model had converged. The relatively quick convergence can be attributed to the small sample size and the comparatively simple mapping between the UV-Vis spectra and the COD values, which facilitated learning. Although the model converged, the final loss settled around 15 rather than approaching 0, which can be attributed to noise and turbidity in the spectra that introduced interference during training.
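Putting the tuned settings together, a sketch of the training loop with per-epoch checkpointing might look as follows; the MSE loss and the random placeholder data are assumptions, since the section does not name the loss function or expose the dataset:
```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder tensors standing in for the UV-Vis spectra and COD labels.
X = torch.randn(200, 1, 2048)
y = torch.randn(200, 1)
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

model = MS1DCNN()                                  # from the architecture sketch above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = torch.nn.MSELoss()                     # assumed loss; not stated in text

best_loss = float("inf")
for epoch in range(200):
    epoch_loss = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item() * xb.size(0)
    epoch_loss /= len(loader.dataset)
    # Save at the end of every epoch so the best checkpoint is available for testing.
    torch.save(model.state_dict(), f"checkpoint_epoch_{epoch:03d}.pt")
    if epoch_loss < best_loss:
        best_loss = epoch_loss
        torch.save(model.state_dict(), "best_model.pt")
```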
3.5. Comparison with Other Methods
To further evaluate the effectiveness of the proposed method, several established methods were used for comparison. The performance of the proposed model was evaluated by comparing its prediction and fitting accuracy with those of three widely used traditional methods and three deep learning methods for COD prediction. The traditional methods included PLS [34], SVM [35], and an artificial neural network (ANN) [36]; the deep learning methods included three 1D-CNNs [19,22,26]. These methods were selected for their frequent application in similar predictive modeling tasks. All models were compared using the same performance indices, which are crucial for assessing their ability to accurately predict COD values. The detailed results are presented in Table 10, and a graphical representation is shown in Figure 9, providing a visual comparison of the relative strengths and weaknesses of each model. This comparison demonstrates the effectiveness of the proposed method and its advantages over existing methods in COD prediction tasks.
The comparison, as shown in Table 10 and Figure 9, reveals clear performance distinctions between the proposed method, the traditional methods, and the deep learning models. Among the traditional methods, the ANN performs best, with R² values of 0.9286/0.9176 and the lowest RMSEC/RMSEP values (5.7691/6.1994), indicating the smallest prediction errors of the three; nevertheless, it still falls short of fully fitting the nonlinear relationship between the spectra and COD, and its fitting performance is inferior to that of the proposed method. PLS and SVM exhibit even poorer fits and higher RMSEC/RMSEP values, indicating lower accuracy in COD prediction. Among the deep learning methods, the 1D-CNN model of [22] delivers the best results, with R² values of 0.9412/0.9309 and RMSEC/RMSEP values of 5.2387/5.6779; the other two deep learning models perform worse, demonstrating their relative inefficiency in capturing the spectral–COD relationship. The proposed method outperforms all of these models, including PLS, SVM, ANN, and the three 1D-CNNs. It achieves a superior goodness of fit, with R² values of 0.9683/0.9599, markedly higher than those of the other methods, indicating a stronger capability to capture the complex nonlinear relationship between the spectral data and the COD values. It also delivers the lowest RMSEC/RMSEP values, 3.8464/4.3259, confirming its precision in COD prediction. Beyond accuracy, the method is robust: it provides better fitting results while maintaining high prediction accuracy, highlighting its ability to extract meaningful features from the spectral data and deliver reliable COD predictions.
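For reference, the two comparison metrics can be computed as below; this is a generic sketch of RMSE and R² rather than the authors' evaluation code:
```python
import numpy as np

def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Root-mean-square error: RMSEC on the calibration set, RMSEP on the prediction set."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Coefficient of determination used for the goodness-of-fit comparison."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)
```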