1. Introduction
Rare-earth elements (REEs) have become national strategic resources due to their irreplaceability in critical fields such as electronics, aerospace, and defense. China has the world’s largest reserves of rare earths and is a global leader in solvent extraction and separation technology [1]. However, the current production process still relies heavily on offline sampling, empirical adjustments, and manual intervention, resulting in low automation levels, inefficient operations, high energy consumption, and significant product quality fluctuations, which pose major obstacles to industrial upgrading. Real-time monitoring of the concentration of individual rare-earth elements within extraction mixer-settlers is a prerequisite for optimizing the ratio of extractants and scrubbing agents and for achieving closed-loop control. Currently, industrial practice mainly depends on offline testing methods such as ICP-AES (Inductively Coupled Plasma–Atomic Emission Spectrometry) [2], ICP-MS (Inductively Coupled Plasma–Mass Spectrometry) [3], XRF (X-Ray Fluorescence Spectrometry) [4], and UV-Vis spectrophotometry [5]. These techniques are characterized by long detection cycles, potential radiation hazards, complex instrumentation, and high maintenance costs, making them unsuitable for real-time control. Consequently, research into rapid detection methods for rare-earth element content in extraction processes is of significant academic value and engineering importance.
In recent years, with the spread of intelligent modeling methods such as support vector machines and neural networks in process control, data-driven soft sensor modeling approaches have been increasingly adopted in rare-earth extraction and separation processes. References [6,7] developed soft sensors for predicting rare-earth component content by using key influencing factors, such as feed flow rate, extractant flow rate, scrubbing solution flow rate, and feed composition, as auxiliary variables, and the corresponding element content as the primary variable. These models, integrated with conventional machine learning algorithms, achieved predictions that meet actual production standards. Reference [
8] investigated a model relating first-order moments of image color features in
color space to component content, enabling the detection of individual element concentrations in mixed solutions. To address real-time production requirements, reference [
9] proposed an improved GRA-JITL-LSSVM model for online monitoring of component content in the rare-earth extraction process. This approach uses gray relational analysis (GRA) to assess trends and correlations between input and output variables and introduces a database update criterion to enhance anti-interference capability. A genetic algorithm with a stagnation backtracking strategy (SBS-GA) was also proposed to ensure global optimization of model parameters. With advances in artificial intelligence, deep learning has become a major focus in machine learning. Reference [
10] employed a convolutional neural network (CNN) to extract abstract representations from raw images of Pr/Nd (praseodymium/neodymium) mixed solutions and constructed a regression model using a deep neural network to predict the content of each element. Reference [
1] proposed a soft measurement method integrating a transfer-learning-based residual attention convolutional network, which takes both explicit and implicit features of rare-earth solutions as model inputs. The method uses a one-dimensional CNN integrated with multi-residual attention blocks to alleviate gradient vanishing or explosion issues and incorporates transfer learning to significantly improve the training effectiveness of the target network. However, as network depth and parameter count increase, training time may become prolonged, and overfitting can become more severe. Moreover, in actual rare-earth extraction processes, changes in operating conditions significantly affect component content, and the presence of non-stationary data leads to poor convergence in the prediction models previously mentioned.
A Stochastic Configuration Network (SCN) is a novel feedforward neural network and has been widely applied in data-driven modeling in various engineering cases [
11]. For instance, reference [
12] implemented a soft sensor for ammonia nitrogen concentration detection using a genetic algorithm-optimized SCN. In reference [
13], an improved zebra optimization algorithm was employed to determine the optimal hyperparameter combination of the SCN, enhancing network performance for short-term photovoltaic power forecasting. Reference [
14] introduced a bidirectional SCN (BSCN) with a semi-random learning mechanism and applied it. Reference [
15] utilized an
-regularized SCN to identify the operating conditions of ball mill loads, thereby improving ore grinding efficiency and reducing operational costs. Reference [
16] proposed a two-dimensional SCN (2DSCN) for modeling image datasets. Furthermore, reference [
17] extended the original SCN by developing a block-incremental SCN to improve learning efficiency and enable rapid modeling in industrial processes. In the context of predicting rare-earth component content, several researchers have also adopted SCN-based data-driven modeling approaches. For example, reference [
18] applied an improved differential evolution algorithm to optimize the weights and biases of nodes in a block-incremental SCN, resulting in a more compact model for predicting rare-earth element concentrations. However, the hyperparameters of SCN are often set empirically, and different combinations can significantly impact network performance. There is a critical need for systematic methods to adaptively optimize these hyperparameters to enhance the applicability and reliability of SCN in rare-earth extraction processes.
Evolutionary computation employs optimization methods inspired by biological evolution. Its core idea originates from natural selection and mutation, achieving optimization through the evolution of population fitness [
19]. Furthermore, this approach has been successfully applied to the training process of neural networks. With the continuous development of meta-heuristic algorithms, swarm intelligence optimization algorithms—similar to evolutionary computation—draw inspiration from the social behaviors of biological organisms to simulate various collective behaviors. Examples include the Gray Wolf Optimizer (GWO), Whale Optimization Algorithm (WOA), and Dung Beetle Optimizer (DBO). Such algorithms have been widely used in various fields, such as energy-efficient train operation, fault detection, and network optimization. The Black-winged Kite Algorithm (BKA) is a novel swarm intelligence optimization algorithm proposed in 2024 [
20]. Inspired by the attacking and migratory behaviors of black-winged kite populations, the algorithm iteratively moves individuals toward optimal positions by simulating these behaviors. It offers advantages such as high convergence accuracy, few adjustable parameters, and low complexity. Therefore, this paper adopts BKA to optimize the key hyperparameters of the Stochastic Configuration Network (SCN). However, the standard BKA suffers from insufficient population diversity and a tendency to fall into local optima. To address these limitations, an improved BKA is proposed, aiming to enhance its population diversity and global search capability. The main innovations and contributions of this paper are summarized as follows:
The global search capability of the BKA is enhanced by incorporating good point set initialization and Lévy flight random-walk strategies, and its convergence is rigorously proven and analyzed.
The IBKA is employed to optimize the constraint parameters and weight-scaling factors in the hyperparameter tuning of the SCN, resulting in a novel method named IBKA-SCN.
The proposed IBKA-SCN is applied to predict rare-earth element component content in a practical engineering case. Experimental results demonstrate that the prediction accuracy of SCN is significantly improved, and the performance of IBKA-SCN meets the requirements of industrial applications.
The structure of this paper is as follows, with an overview provided in Figure 1. Section 1 is the introduction, which outlines the research background and its significance. Section 2 reviews relevant foundational work, including the SCN model and the BKA algorithm. Section 3 elaborates on the improvement strategy for IBKA and presents experimental analysis. Section 4 details the construction of the IBKA-SCN model. Section 5 presents the experimental validation and analysis. Finally, Section 6 summarizes the findings and outlines future work.
4. The Establishment of the IBKA-SCN Model
The hyperparameter settings in the Stochastic Configuration Network (SCN) directly influence both the performance and modeling efficiency of the network. Among these, the two most critical hyperparameters are $r$ and $\lambda$. A relatively small value of $r$ implies a looser inequality constraint. This value gradually increases during the construction process of the network until it approaches 1. $\lambda$ serves as a scaling factor. The parameters of the candidate nodes in the hidden node pool are randomly generated within the range $[-\lambda, \lambda]$. Those nodes that satisfy the inequality constraint, as shown in Formula (2), are selected as new nodes and added to the hidden layer of the network.
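The node-selection mechanism described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a single-output regression problem, sigmoid hidden nodes, fixed values of `r` and the scaling factor (called `lam` here), and the commonly used form of the SCN supervisory inequality, since Formula (2) is not restated in this section. The function names `build_scn` and `predict_scn` are hypothetical.

```python
import numpy as np

def build_scn(X, y, r=0.99, lam=1.0, max_nodes=50, n_candidates=100,
              tol=1e-3, seed=0):
    """Greedy SCN-style construction (single output, fixed r and lam)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    y = np.asarray(y, float).reshape(-1)
    W, B = [], []                  # accepted hidden weights and biases
    H = np.empty((n, 0))           # hidden-layer output matrix
    beta = np.zeros(0)             # output weights
    e = y.copy()                   # current residual
    while H.shape[1] < max_nodes and np.sqrt(np.mean(e ** 2)) > tol:
        L = H.shape[1] + 1
        mu = (1.0 - r) / (L + 1)   # assumed standard choice of mu_L
        # Candidate pool: weights and biases drawn uniformly from [-lam, lam].
        w = rng.uniform(-lam, lam, size=(n_candidates, d))
        b = rng.uniform(-lam, lam, size=n_candidates)
        G = 1.0 / (1.0 + np.exp(-(X @ w.T + b)))   # shape (n, n_candidates)
        # Assumed inequality: a candidate is admissible when xi >= 0.
        xi = (e @ G) ** 2 / np.sum(G ** 2, axis=0) - (1.0 - r - mu) * (e @ e)
        best = int(np.argmax(xi))
        if xi[best] < 0:
            break                  # no admissible node for this (r, lam)
        W.append(w[best])
        B.append(b[best])
        H = np.column_stack([H, G[:, best]])
        # Output weights by least squares over all accepted nodes.
        beta, *_ = np.linalg.lstsq(H, y, rcond=None)
        e = y - H @ beta
    return np.array(W), np.array(B), beta

def predict_scn(model, X):
    """Evaluate a model returned by build_scn on new inputs."""
    W, B, beta = model
    if len(beta) == 0:
        return np.zeros(X.shape[0])
    return (1.0 / (1.0 + np.exp(-(X @ W.T + B)))) @ beta
```

In this simplified form the pair (`r`, `lam`) fully determines which random candidates are admissible, which is why the choice of these two values affects both accuracy and construction time.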
In the conventional SCN, the hyperparameters $r$ and $\lambda$ are predefined as fixed, non-negative increasing sequences. This approach not only increases the complexity of the model but also reduces its construction efficiency and makes it more prone to overfitting. The selection of $r$ and $\lambda$ should be data-dependent rather than relying on a fixed, manually specified sequence. Therefore, to improve the accuracy of SCN in predicting rare-earth element component content, we propose the use of the Improved Black-winged Kite Algorithm (IBKA) to optimize the hyperparameter combinations of SCN. This method is named IBKA-SCN. The detailed algorithm workflow is as follows:
Step 1: Set the initial black-winged kite population size to N, the solution dimension to D, and the maximum number of iterations to T. Define the upper and lower bounds for the hyperparameters $r$ and $\lambda$. Specify the maximum number of hidden nodes, the maximum number of candidate nodes, and the tolerance error.
Step 2: Employ the Root Mean Square Error (RMSE) as the fitness function of the Improved Black-winged Kite Algorithm (IBKA) to explore the optimal combination of hyperparameters for the SCN.
Step 3: Determine whether the IBKA-SCN model has reached the preset tolerance error or the maximum number of iterations. If either condition is met, proceed to Step 4; otherwise, update the population with IBKA and repeat the fitness evaluation.
Step 4: Assign the optimal hyperparameters, corresponding to the best solution (position) found by the Improved Black-winged Kite Algorithm (IBKA), to the SCN. Then, establish a predictive model for rare-earth element component content using the SCN with these optimized parameters.
The pseudocode of IBKA-SCN is presented in Algorithm 1.
Algorithm 1: IBKA-SCN [pseudocode image]
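Steps 1 to 4 can be sketched as a generic population-based search loop. The sketch below is illustrative only: the population update is a simple placeholder (attraction toward the best individual plus decaying Gaussian noise) and is not the IBKA update rule, which is defined in Section 3. The fitness function `toy_fitness` is a hypothetical stand-in for "build an SCN with the given hyperparameters and return its RMSE", and all function and parameter names are assumptions.

```python
import numpy as np

def toy_fitness(position):
    """Hypothetical stand-in for 'train an SCN with (r, lam), return RMSE'."""
    r, lam = position
    return abs(r - 0.95) + abs(lam - 10.0) / 250.0

def search_hyperparameters(fitness, lower, upper, pop_size=20, max_iter=50,
                           tol=0.005, seed=0):
    """Steps 1-4 with a placeholder update (NOT the IBKA equations)."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, float)
    upper = np.asarray(upper, float)
    # Step 1: initialise N individuals of dimension D inside the bounds.
    pop = rng.uniform(lower, upper, size=(pop_size, lower.size))
    # Step 2: the fitness of an individual is the RMSE it produces.
    fit = np.array([fitness(p) for p in pop])
    best_idx = int(np.argmin(fit))
    best, best_fit = pop[best_idx].copy(), float(fit[best_idx])
    for t in range(max_iter):
        # Step 3: stop once the tolerance error is reached.
        if best_fit <= tol:
            break
        # Placeholder move: drift toward the best, plus decaying noise.
        scale = (upper - lower) * 0.1 * (1 - t / max_iter)
        trial = (pop + rng.uniform(0, 1, size=pop.shape) * (best - pop)
                 + rng.normal(0, 1, size=pop.shape) * scale)
        trial = np.clip(trial, lower, upper)
        trial_fit = np.array([fitness(p) for p in trial])
        improved = trial_fit < fit            # greedy replacement
        pop[improved] = trial[improved]
        fit[improved] = trial_fit[improved]
        best_idx = int(np.argmin(fit))
        if fit[best_idx] < best_fit:
            best, best_fit = pop[best_idx].copy(), float(fit[best_idx])
    # Step 4: the best position is handed to the SCN as (r, lam).
    return best, best_fit
```

A call such as `search_hyperparameters(toy_fitness, [0.9, 0.5], [0.9999, 250.0])` uses the bounds reported in this paper for the two hyperparameters; in a real run the fitness would train and score an SCN rather than evaluate a closed-form surrogate.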
5. Experiments and Result Analysis
The experiments were conducted on the MATLAB 2022b (MathWorks, Natick, MA, USA) (version 9.13.0) platform using the following hardware specifications: an Intel® Core™ i5-14650HX (Intel Corporation, Santa Clara, CA, USA) 2.20 GHz processor, 16 GB of RAM (Samsung Electronics, Seoul, South Korea), a 64-bit Windows (Microsoft Corporation, Redmond, WA, USA) (version 22H2, build 22621) operating system, and an NVIDIA GeForce RTX 2060 (NVIDIA Corporation, Santa Clara, CA, USA) graphics card.
5.1. Performance of SCN Based on the IBKA Algorithm
This section evaluates the generalization capability of IBKA-SCN using four regression datasets from the KEEL (Knowledge Extraction based on Evolutionary Learning) repository, as detailed in Table 3; these datasets were also used in reference [25], enabling a direct comparison under identical experimental settings. A comparative analysis is conducted for IBKA-SCN, IBKA1-SCN (IBKA-SCN with only good point set initialization), IBKA2-SCN (IBKA-SCN with only the Lévy flight random-walk strategy), BKA-SCN (SCN with the standard Black-winged Kite Algorithm), IRVFL [25], SCN, BSCN [17] (SCN with block increments), WOA-SCN (SCN with Whale Optimization Algorithm), and GWO-SCN (SCN with Gray Wolf Optimizer) to validate the effectiveness of the proposed method. All comparison models underwent the same data pre-processing procedures and dataset splits, and their hyperparameters were tuned systematically. In IRVFL, the weights and biases are randomly assigned from a uniform distribution within the range
. For SCN and its variant, the hyperparameters
r and
are set to
and
, respectively. In SSA-SCN, WOA-SCN, GWO-SCN, IBKA-SCN, IBKA1-SCN, and IBKA2-SCN, the population size is set to 100 and the maximum number of iterations to 1000; the lower and upper bounds are 0.9 and 0.9999 for $r$ and 0.5 and 250 for $\lambda$, and the tolerance error is set to 0.005. To ensure a fair evaluation, 80% of each dataset is randomly selected as the training set, with the remaining 20% used for testing. All input features are normalized to mitigate the impact of varying data scales across datasets.
The test results of IRVFL, SCN, SSA-SCN, and IBKA-SCN on the four datasets are presented in Table 4, Table 5, Table 6 and Table 7. Each model is evaluated over 50 repeated random samplings of the data. Performance is reported as the mean and standard deviation of the Root Mean Square Error (RMSE, Equation (19)), together with the average model construction (training) time (T).
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2} \qquad (19)$$
Here, $\hat{y}_i$ denotes the predicted output of the model, $y_i$ represents the actual value of the sample, and $N$ is the number of samples.
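The evaluation protocol (random 80/20 split, input normalization, repetition over 50 trials, mean and standard deviation of the test RMSE) can be sketched as follows. This is a minimal illustration under stated assumptions: min-max scaling is fitted on the training inputs only, and an ordinary least-squares model stands in for the networks compared in this paper. The names `repeated_holdout` and `linear_fit_predict` are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between two 1-D arrays."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def repeated_holdout(X, y, fit_predict, n_repeats=50, train_frac=0.8, seed=0):
    """Mean and standard deviation of test RMSE over repeated random splits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_train = int(round(train_frac * n))
    scores = []
    for _ in range(n_repeats):
        idx = rng.permutation(n)
        tr, te = idx[:n_train], idx[n_train:]
        # Min-max scaling fitted on the training inputs only (assumption).
        lo, hi = X[tr].min(axis=0), X[tr].max(axis=0)
        span = np.where(hi > lo, hi - lo, 1.0)
        X_tr, X_te = (X[tr] - lo) / span, (X[te] - lo) / span
        scores.append(rmse(y[te], fit_predict(X_tr, y[tr], X_te)))
    scores = np.array(scores)
    return float(scores.mean()), float(scores.std())

def linear_fit_predict(X_tr, y_tr, X_te):
    """Stand-in model: ordinary least squares with an intercept."""
    A = np.column_stack([X_tr, np.ones(len(X_tr))])
    coef, *_ = np.linalg.lstsq(A, y_tr, rcond=None)
    return np.column_stack([X_te, np.ones(len(X_te))]) @ coef
```

Any model can be plugged in through `fit_predict`, provided it takes training inputs, training targets, and test inputs and returns test predictions.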
A comprehensive analysis of the results presented in Table 4, Table 5, Table 6 and Table 7 reveals that IRVFL exhibits significantly lower accuracy in both training and testing across all four datasets compared to the other algorithms, making it inadequate for practical applications. The remaining findings concern model training efficiency and prediction accuracy. First, in terms of training efficiency, the BSCN model, owing to its block-wise node addition mechanism, significantly shortened the training time compared to the standard SCN. Similarly, employing meta-heuristic algorithms to optimize the key hyperparameters ($\lambda$ and $r$) of SCN also effectively reduced its training time. However, hyperparameter optimization demonstrated greater value in terms of prediction accuracy: the SCN models with optimized $\lambda$ and $r$ achieved higher prediction accuracy than both the original SCN and BSCN. Among all models, IBKA-SCN, which integrates both improvement strategies, attained the highest prediction accuracy across all tested datasets. The ablation variants explain this result at a mechanistic level. Compared to the baseline BKA-SCN, the good point set initialization in IBKA1-SCN established a better starting point for the optimization process by enhancing population diversity. Separately, the Lévy flight random-walk strategy in IBKA2-SCN strengthened the global search capability and the ability to escape local optima. IBKA-SCN, which combines the two strategies, achieved the best performance, confirming the effectiveness of each improvement and their complementary effect.
In conclusion, the proposed IBKA-SCN method not only enhances the regression accuracy of SCN but also effectively reduces training time, demonstrating a combination of high precision and computational efficiency.
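The two strategies isolated in the ablation variants can be illustrated with a short sketch. The paper gives their exact form in Section 3; the code below follows the commonly used constructions and should be read as an assumption about the details: a good point set with generators $2\cos(2\pi j/p)$, where $p$ is the smallest prime with $p \ge 2D + 3$, and Lévy-distributed steps drawn with Mantegna's algorithm with stability index `beta = 1.5`. The function names are hypothetical.

```python
import math
import numpy as np

def _smallest_prime_at_least(n):
    """Return the smallest prime p with p >= n (trial division)."""
    def is_prime(k):
        if k < 2:
            return False
        return all(k % q for q in range(2, math.isqrt(k) + 1))
    while not is_prime(n):
        n += 1
    return n

def good_point_set(pop_size, lower, upper):
    """Deterministic, evenly spread initial population inside the bounds."""
    lower = np.asarray(lower, float)
    upper = np.asarray(upper, float)
    dim = lower.size
    p = _smallest_prime_at_least(2 * dim + 3)
    j = np.arange(1, dim + 1)
    gen = 2.0 * np.cos(2.0 * np.pi * j / p)      # one generator per dimension
    i = np.arange(1, pop_size + 1).reshape(-1, 1)
    unit = np.mod(i * gen, 1.0)                  # fractional parts in [0, 1)
    return lower + unit * (upper - lower)

def levy_step(shape, beta=1.5, rng=None):
    """Heavy-tailed random steps via Mantegna's algorithm."""
    rng = np.random.default_rng() if rng is None else rng
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
             ) ** (1 / beta)
    u = rng.normal(0.0, sigma, size=shape)
    v = rng.normal(0.0, 1.0, size=shape)
    return u / np.abs(v) ** (1 / beta)
```

The good point set replaces uniform random initialization with a low-discrepancy layout, which is the population-diversity effect attributed to IBKA1-SCN, while the occasional long jumps of the Lévy steps correspond to the escape from local optima attributed to IBKA2-SCN.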
5.2. Comparison Experiment of Rare-Earth Element Component Content Prediction Based on IBKA-SCN
Rare-earth extraction and separation is a process that involves separating and purifying mixed rare-earth solutions to obtain target purity products. We consider the extraction separation process for two components
A and
B, where
A is the easily extractable component and
B is the difficult-to-extract component. The production separation flow is shown in Figure 5. From left to right, the diagram includes an extraction section comprising $n$ stages of mixer-settlers, followed by a scrubbing section comprising $m$ stages of mixer-settlers. Here,
denotes the feed flow rate of the rare-earth solution,
represents the extractant flow rate, and
indicates the scrubbing agent flow rate, while
and
, respectively, denote the component flow rates of the rare-earth feed solution.
In practical production scenarios, numerous variables can influence component content; however, only a limited number of auxiliary variables can be accurately measured in real industrial processes. These primarily include feed concentration, scrubbing solution flow rate, and extractant flow rate, among a few other parameters [6]. Therefore, a nonlinear mapping relationship can be assumed between the rare-earth element component content
and the extractant flow rate
, scrubbing solution flow rate
, feed flow rate
, and feed concentration
, which is expressed by Equation (
20) as follows:
The dataset used in this study was collected from the
extraction section of a rare-earth separation enterprise. It consists of 200 data samples gathered from multiple monitoring points within the same production line during the same time period. According to the process control requirements of the extraction production, a specific stage in the extraction section was selected as the monitoring point to measure the aqueous-phase content of
components. To validate the effectiveness of the proposed IBKA-SCN model, its performance was compared with several commonly used modeling methods for component content prediction, including SVM, GA-BP, SCN, and SGDE-SCN. Among the 200 collected samples, 80% were randomly selected as the training set
, and the remaining 20% were used as the test set
. All input and output variables were normalized to the range
. The performance of each model was evaluated using the following metrics:
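RMSE (as shown in Equation (19)), the Mean Absolute Error (MAE) and the Sum of Squared Errors (SSE), as shown in Equations (21) and (22), and the training time (T). These indicators collectively assess the accuracy and efficiency of the predictive models.
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|\hat{y}_i - y_i\right| \qquad (21)$$
$$\mathrm{SSE} = \sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2 \qquad (22)$$
Here, $\hat{y}_i$ denotes the predicted output of the model, $y_i$ represents the actual value of the sample, and $N$ is the number of samples.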
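For reference, the three accuracy metrics can be computed together as in the following minimal sketch, which uses the standard definitions of RMSE, MAE, and SSE. The function name `regression_metrics` is hypothetical.

```python
import math
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and SSE for predictions against actual values."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    err = y_pred - y_true
    sse = float(np.sum(err ** 2))                 # sum of squared errors
    return {
        "RMSE": math.sqrt(sse / err.size),        # RMSE = sqrt(SSE / N)
        "MAE": float(np.mean(np.abs(err))),
        "SSE": sse,
    }
```

Because RMSE equals the square root of SSE divided by N, the two metrics rank models identically on a fixed test set; MAE adds information because it is less sensitive to a few large errors.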
For the SVM model, the penalty coefficient was set to
and the kernel parameter to
. For GA-BP (a BP neural network optimized by a genetic algorithm), the maximum number of hidden nodes was set to 8, the learning rate to 0.01, and the maximum number of iterations to 1000. For SCN and its variants, the hyperparameter control sets for $\lambda$ and $r$ were defined as
and
, respectively. For IBKA-SCN, the lower and upper bounds were set to 0.5 and 250 for $\lambda$ and to 0.9 and 0.9999 for $r$. The performance of each predictive model on the component content dataset is summarized in
Table 8. The model prediction output, alongside the actual values, is displayed in
Figure 6.
According to the experimental results presented in
Figure 6 and
Table 8, IBKA-SCN achieves the lowest values in RMSE, MAE, and SSE, indicating that its prediction accuracy surpasses that of SVM, GA-BP, SCN, and SGDE-SCN. Meanwhile, the training time of IBKA-SCN remains around 2.4 s, which is significantly lower than that of SGDE-SCN and SVM and is comparable to SCN while delivering a substantial improvement in accuracy. In summary, while retaining the advantages of SCN—such as structural simplicity and fast convergence—IBKA-SCN further optimizes the hyperparameter combination through IBKA to better adapt to the specific dataset. This leads to simultaneous enhancement in both prediction precision and training efficiency, demonstrating the feasibility and effectiveness of the proposed method in practical applications.
6. Conclusions
The accurate and efficient prediction of component content in the rare-earth extraction and separation process plays a decisive role in the design of control systems, product quality management, and energy optimization. In response to this key issue, this paper proposes the IBKA-SCN soft sensor model for predicting component content. The main contributions are summarized as follows:
To overcome the limitations of the Black-winged Kite Algorithm (BKA) in terms of population diversity and global optimization capability, an Improved Black-winged Kite Algorithm (IBKA) is developed by introducing good point set initialization and Lévy flight strategies. IBKA achieves the best convergence accuracy across a series of five function regression tasks, and a convergence analysis is provided to support its theoretical validity.
To address the reliance on manually configured hyperparameters in the Stochastic Configuration Network (SCN), which often limits model accuracy, IBKA is employed to adaptively optimize the hyperparameter combinations for SCN. The proposed IBKA-SCN method constructs the network via stochastic configuration after identifying the optimal hyperparameters. Extensive experiments on four real-world regression datasets demonstrate that IBKA-SCN outperforms other benchmark models, confirming its superior generalization capability and effectiveness.
This study constitutes the first attempt to integrate the improved BKA with the SCN framework for hyperparameter optimization in the practical application of soft sensing for rare-earth component content. The test results show that IBKA-SCN achieves the smallest RMSE, MAE, and SSE values, that is, the highest predictive accuracy among the compared models. Furthermore, it requires significantly less training time than SGDE-SCN and SVM, supporting its feasibility and efficiency in real industrial scenarios.
In conclusion, the IBKA-SCN model proposed in this study not only provides a novel methodological framework for hyperparameter optimization in randomized neural networks but also offers significant engineering value in the accurate prediction of component content in rare-earth extraction processes. This work establishes a solid foundation for future applications such as online soft sensing, real-time control, and the development of digital twins for full-process optimization.
The proposed method has limitations that suggest directions for future research. In its current form, it cannot adaptively adjust the parameters of hidden-layer nodes, or add and remove nodes, so the model cannot update itself online.