Article

Research on Pattern Classification Based on Double Pseudo-Inverse Extreme Learning Machine

Yumin Yin, Bolin Liao, Shuai Li and Jieyang Zhou
1 College of Computer Science and Engineering, Jishou University, Jishou 416000, China
2 Faculty of Information Technology and Electrical Engineering, University of Oulu, 90570 Oulu, Finland
* Author to whom correspondence should be addressed.
Electronics 2024, 13(19), 3951; https://doi.org/10.3390/electronics13193951
Submission received: 11 September 2024 / Revised: 2 October 2024 / Accepted: 7 October 2024 / Published: 7 October 2024

Abstract

This research aims to address the limitations inherent in the traditional Extreme Learning Machine (ELM) algorithm, particularly the stochastic determination of input-layer weights and hidden-layer biases, which frequently leads to an excessive number of hidden-layer neurons and inconsistent performance. To augment the neural network’s efficacy in pattern classification, Principal Component Analysis (PCA) is employed to reduce the dimensionality of the input matrix and alleviate multicollinearity issues during the computation of the input weight matrix. This paper introduces an enhanced ELM methodology, designated the PCA-DP-ELM algorithm, which integrates PCA with Double Pseudo-Inverse Weight Determination (DP). The PCA-DP-ELM algorithm proposed in this study consistently achieves superior average classification accuracy across various datasets, irrespective of whether assessed through longitudinal or cross-sectional experiments. The results from both experimental paradigms indicate that the optimized algorithm not only enhances accuracy but also improves stability. These findings substantiate that the proposed methodology exerts a positive influence on pattern classification.

1. Introduction

Pattern classification, a pivotal domain within artificial intelligence, has experienced profound transformations due to the advent of artificial neural networks, which have emerged as highly effective instruments in recent years. These networks are distinguished by their capacity for nonlinear system modeling, self-learning, and self-adaptation, facilitating their extensive integration into pattern classification methodologies [1,2]. Such techniques have exhibited efficacy across a broad spectrum of applications, including the resolution of complex equations [3,4,5,6], object detection [7,8,9], matrix computations [10,11,12,13,14], medical image analysis [15,16], analysis of public opinion [17,18], and the enhancement of intelligent control systems [19,20,21,22,23,24]. However, the traditional neural networks, such as backpropagation (BP) neural networks, while extensively utilized in practical applications [25], manifest inherent limitations, including sluggish convergence of iterative learning algorithms and susceptibility to local minima [26,27,28].
Additionally, swarm intelligence algorithms, including harmony search [29,30], cuckoo search [31,32], and sparrow search [33], have been widely employed to optimize detection performance. Nevertheless, the implementation of AI methodologies in industrial robotics is hindered by challenges such as the limited computational power of embedded platforms, stability issues, and stringent latency requirements. To address the limitations of traditional neural networks, Huang et al. introduced the Extreme Learning Machine (ELM), a representative Single Hidden Layer Feedforward Neural Network (SLFN), in 2004 [26,34,35]. The ELM has demonstrated superior generalization performance across diverse applications, including disease diagnosis, traffic sign recognition, and image quality evaluation [36,37,38,39]. The rapid proliferation of the ELM and the extensive scholarly interest it has garnered can be attributed to its distinctive characteristics, including rapid training speed and enhanced generalization capability.
Typically, the ELM and analogous neural networks involve a process wherein input weights are generated randomly, followed by an analysis of the output weights. This study proposes an enhancement to the neural network architecture by determining the weights of both the input and output layers through pseudo-inverse calculations, culminating in the development of a novel algorithm termed the Double Pseudo-Inverse Weight Determination Extreme Learning Machine (DP-ELM) [35].
In practical applications, the escalation in the data sample dimensions often necessitates a corresponding increase in the number of hidden-layer nodes, which can adversely affect the efficiency and classification accuracy of the neural network [40]. To mitigate this challenge and sustain high classification performance, Principal Component Analysis (PCA) is employed to reduce the dimensionality of the input data [41]. This reduction aims to extract critical feature attributes from the samples, ultimately resulting in the creation of a neural network model referred to as the PCA-DP-ELM.
The subsequent sections of this paper are meticulously structured to explore specific facets of the proposed methodology in detail. Section 2 elucidates the foundational principles of PCA and the ELM. Section 3 introduces an enhanced variant of the ELM, designated as the PCA-DP-ELM algorithm. In Section 4, the application of the algorithm to pattern classification is demonstrated, with simulation results presented to underscore its efficacy. The paper concludes in Section 5 with a summary of the study’s principal contributions.
The primary contributions of this paper are delineated as follows:
  • The determination of weights for both the input and output layers in the ELM is accomplished through the application of pseudo-inverse calculations, with the efficacy of this approach having been rigorously validated.
  • A novel methodology has been proposed that utilizes Principal Component Analysis to optimize the input data of the DP-ELM, thereby augmenting its performance.

2. Preliminaries

2.1. Principal Component Analysis

PCA is widely recognized as a pre-eminent algorithm for dimensionality reduction [42]. Central to this methodology is the linear combination of the original variables, which facilitates the extraction of principal components and the subsequent formulation of independent, novel, and comprehensive indicators. The resultant set of indicators encapsulates the majority of the original information, with each component exhibiting linear characteristics. Principal components not only effectively represent overarching phenomena but also play a pivotal role in reducing dimensionality, computational load, and temporal complexity. Consequently, they are routinely employed in the dimensionality reduction of high-dimensional datasets.
Let $S = (s_{ij})_{m \times n}$ denote the sample data, with $m$ samples and $n$ variables; the primary steps of Principal Component Analysis are as follows:
(1) The raw data are standardized.
(2) The correlation coefficient matrix $C = [c_{pq}]_{n \times n}$ $(p, q = 1, 2, \ldots, n)$ is obtained by calculation:
$$c_{pq} = \frac{\sum_{i=1}^{m}(s_{ip} - \bar{s}_p)(s_{iq} - \bar{s}_q)}{\sqrt{\sum_{i=1}^{m}(s_{ip} - \bar{s}_p)^2}\,\sqrt{\sum_{i=1}^{m}(s_{iq} - \bar{s}_q)^2}}, \qquad (1)$$
(3) The eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ and their corresponding eigenvectors $p_1, p_2, \ldots, p_n$ are calculated.
(4) The contribution rates of the eigenvalues are calculated:
$$\gamma_h = \lambda_h \Big/ \sum_{h=1}^{n} \lambda_h, \qquad (2)$$
$$\gamma_r = \sum_{h=1}^{k} \lambda_h \Big/ \sum_{h=1}^{n} \lambda_h, \qquad (3)$$
where, in Equations (2) and (3), $\gamma_h$ is the variance contribution rate of the $h$th principal component and $\gamma_r$ is the cumulative contribution rate of the first $k$ principal components.
(5) The principal component expressions are obtained. Generally, the leading components whose cumulative contribution rate exceeds 85% are retained as the final dataset.
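To make these steps concrete, the following Python/NumPy sketch standardizes the data, eigendecomposes the correlation matrix, and retains the leading components whose cumulative contribution rate reaches 85%. The function name pca_reduce and the threshold argument are illustrative; this is not the paper's MATLAB implementation.

```python
import numpy as np

def pca_reduce(S, threshold=0.85):
    """PCA dimensionality reduction following steps (1)-(5) above.

    S: (m, n) sample matrix with m samples and n variables.
    Returns the projected data and the retained eigenvectors.
    """
    # (1) Standardize the raw data column-wise.
    Z = (S - S.mean(axis=0)) / S.std(axis=0, ddof=1)
    # (2) Correlation coefficient matrix of the variables.
    C = np.corrcoef(Z, rowvar=False)
    # (3) Eigenvalues and eigenvectors, sorted in descending order.
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # (4) Cumulative contribution rate; keep the first k components
    #     whose cumulative rate reaches the threshold (85% by default).
    cumulative = np.cumsum(eigvals / eigvals.sum())
    k = int(np.searchsorted(cumulative, threshold) + 1)
    # (5) Principal component expressions: project onto the first k eigenvectors.
    return Z @ eigvecs[:, :k], eigvecs[:, :k]

# Example usage on a random 100 x 10 sample matrix.
S = np.random.rand(100, 10)
S_reduced, components = pca_reduce(S)
print(S_reduced.shape)  # (100, k) with k <= 10
```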

2.2. Extreme Learning Machine

The ELM is a rapid SLFN that exhibits robust generalization performance across a myriad of applications [36,37,38,39,43,44]. Owing to its accelerated training speed and remarkable generalization capabilities, the ELM has found extensive utility in domains such as disease diagnosis, traffic sign recognition, and image quality assessment [45,46]. The architecture of the neural network consists of three layers: the input layer, the hidden layer, and the output layer. A distinguishing characteristic of the ELM is its stochastic selection of input-layer parameters and hidden-layer biases during the network parameter determination process, while the output-layer weights are derived through direct computation.
The absence of iterative steps in the network parameter determination process significantly diminishes the time required for parameter optimization [26]. In contrast to conventional training methodologies, this approach affords the benefits of accelerated learning speed and enhanced generalization performance [47,48]. Nevertheless, despite these advantages, the ELM presents certain limitations that necessitate further enhancement. For example, when dealing with high-dimensional data samples, the ELM requires a substantial number of hidden-layer nodes, thereby considerably increasing the complexity of the network and diminishing its efficiency. Moreover, a degree of randomness persists in parameter selection. Furthermore, the prevalent multicollinearity issue intrinsic to the traditional ELM algorithm engenders instability in the output weights, which significantly undermines its generalization performance.
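For reference, the conventional ELM training scheme described above can be sketched as follows. This is a minimal illustration, assuming a one-hot target matrix T and a tanh hidden activation; the function names are illustrative and this is not the paper's MATLAB implementation.

```python
import numpy as np

def elm_fit(X, T, n_hidden, rng=None):
    """Conventional ELM: random input weights and biases, analytic output weights.

    X: (J, N) inputs, T: (K, N) one-hot targets, n_hidden: number of hidden neurons.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    A = rng.uniform(-1.0, 1.0, size=(n_hidden, X.shape[0]))  # random input weights
    h = rng.uniform(-1.0, 1.0, size=(n_hidden, 1))           # random hidden biases
    hidden = np.tanh(A @ X - h)                               # hidden-layer outputs
    B = T @ np.linalg.pinv(hidden)                            # output weights via pseudo-inverse
    return A, h, B

def elm_predict(X, A, h, B):
    """Forward pass followed by argmax over the class dimension."""
    return np.argmax(B @ np.tanh(A @ X - h), axis=0)
```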

3. The Proposed PCA-DP-ELM

3.1. DP-ELM

To address the shortcomings of the conventional ELM algorithm—particularly the stochastic determination of input-layer weights and hidden-layer biases, which frequently results in an excessive quantity of hidden-layer neurons and erratic performance—this study introduces an enhanced ELM featuring Double Pseudo-Inverse Weight Determination. In this methodology, input weights are computed utilizing the pseudo-inverse technique, while output weights of the ELM are also ascertained through the pseudo-inverse, thereby ensuring optimal weight configuration for both the input and output layers. Additionally, a growth strategy is employed to identify the optimal number of neurons within the hidden layer. The assessment of hidden-layer neurons commences with a progressive increase, one neuron at a time. When the incorporation of a hidden-layer neuron no longer yields an enhancement in classification accuracy, the existing number of hidden-layer neurons is regarded as optimal, concluding the training process. Upon establishing the neuron count in the hidden layer, the complete neural network model is finalized.
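The growth strategy can be expressed as a simple loop. In the sketch below, train_and_score is a hypothetical callable that trains a DP-ELM with the given number of hidden neurons and returns its classification accuracy; the names and the cap on the number of neurons are illustrative assumptions.

```python
def grow_hidden_layer(train_and_score, max_neurons=200):
    """Add hidden neurons one at a time; stop once accuracy no longer improves."""
    best_m = 1
    best_acc = train_and_score(best_m)
    for m in range(2, max_neurons + 1):
        acc = train_and_score(m)
        if acc <= best_acc:      # no improvement: previous size is regarded as optimal
            break
        best_m, best_acc = m, acc
    return best_m, best_acc
```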
A performance comparison between the proposed DP-ELM algorithm and the traditional ELM engenders the following conclusions:
(1) The accuracy of the optimized algorithm is enhanced.
(2) The number of hidden-layer neurons required to achieve optimal classification accuracy is reduced.
(3) The improved algorithm demonstrates greater stability.
As illustrated in Figure 1, this neural network is composed of three layers, which appears more streamlined than conventional neural networks. The input and output layers contain $J$ and $K$ neurons, respectively, which employ straightforward linear activation functions. The hidden layer consists of $M$ neurons that are activated by a monotonic nonlinear activation function $f(\cdot)$.
The weight connecting the $m$th hidden-layer neuron to the $k$th output-layer neuron is denoted by $b_{km}$, where $m = 1, 2, \ldots, M$ and $k = 1, 2, \ldots, K$. These output weights $b_{km}$ are randomly generated within the interval $[c_1, c_2]$. Additionally, the weight linking the $j$th input-layer neuron to the $m$th hidden-layer neuron is represented by $a_{mj}$, where $j = 1, 2, \ldots, J$ and $m = 1, 2, \ldots, M$. These input weights $a_{mj}$ are determined using the pseudo-inverse method, as elaborated below. Furthermore, the bias $h_m$ of the $m$th hidden-layer neuron is randomly generated within the interval $[c_3, c_4]$, while the offset values in the input and output layers are set to zero. Consequently, the output of the $k$th output-layer neuron can be expressed as follows [49]:
$$y_k = \sum_{m=1}^{M} b_{km}\, f\!\left(\sum_{j=1}^{J} a_{mj} x_j - h_m\right), \qquad (4)$$
where $x_j$ represents the input of the $j$th input-layer neuron. Equation (4) can be reformulated in the following concise form:
$$y = B f(Ax - h), \qquad (5)$$
where $y = [y_1, y_2, \ldots, y_K]^T \in \mathbb{R}^{K \times 1}$, $x = [x_1, x_2, \ldots, x_J]^T \in \mathbb{R}^{J \times 1}$, $h = [h_1, h_2, \ldots, h_M]^T \in \mathbb{R}^{M \times 1}$, and the output weight matrix $B$ and the input weight matrix $A$ are defined as
$$B = \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1M} \\ b_{21} & b_{22} & \cdots & b_{2M} \\ \vdots & \vdots & \ddots & \vdots \\ b_{K1} & b_{K2} & \cdots & b_{KM} \end{bmatrix} \in \mathbb{R}^{K \times M}, \qquad (6)$$
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1J} \\ a_{21} & a_{22} & \cdots & a_{2J} \\ \vdots & \vdots & \ddots & \vdots \\ a_{M1} & a_{M2} & \cdots & a_{MJ} \end{bmatrix} \in \mathbb{R}^{M \times J}. \qquad (7)$$
To convert the above formula into the required matrix form, the number of experimental samples is set to $N$, and the matrix-form output is obtained as
$$Y = B f(AX - H), \qquad (8)$$
where $Y = [y_1, y_2, \ldots, y_N] \in \mathbb{R}^{K \times N}$, $X = [x_1, x_2, \ldots, x_N] \in \mathbb{R}^{J \times N}$, and $H$ denotes the matrix-form offsets of the hidden-layer neurons, i.e., $H = [h, h, \ldots, h] \in \mathbb{R}^{M \times N}$.
Assume that the activation function $f(\cdot)$ is strictly monotonic. When the output weights $B$ and biases $H$ are chosen from $[c_1, c_2]$ and $[c_3, c_4]$, respectively, the optimal input weights can be written as
$$A = \left(f^{-1}(B^+ Y) + H\right) X^+, \qquad (9)$$
where $f^{-1}(\cdot)$ denotes the unique inverse function of $f(\cdot)$, and $B^+$ and $X^+$ denote the pseudo-inverses of the matrices $B$ and $X$, respectively. After obtaining the input-layer weights $A$, the pseudo-inverse calculation is applied once more to obtain the output-layer weights $\tilde{B}$:
$$\tilde{B} = Y \left(f(AX - H)\right)^+. \qquad (10)$$
The following briefly presents the derivation of Equations (9) and (10).
Left-multiplying both sides of Equation (8) by $B^+$ (noting that $B^+ B = I$ when $B$ has full column rank), we obtain
$$B^+ Y = B^+ B f(AX - H) = f(AX - H).$$
Next, applying the inverse function $f^{-1}(\cdot)$ to both sides of the above equation, one can obtain
$$f^{-1}(B^+ Y) = AX - H.$$
Then, the above equation can be rewritten as
$$AX = f^{-1}(B^+ Y) + H.$$
Finally, right-multiplying both sides of the above equation by $X^+$ (so that $X X^+ = I$ when $X$ has full row rank), we have
$$A X X^+ = \left(f^{-1}(B^+ Y) + H\right) X^+,$$
that is,
$$A = \left(f^{-1}(B^+ Y) + H\right) X^+.$$
This completes the proof.
After the input-layer weights $A$ are obtained by the pseudo-inverse calculation, the output-layer weights $\tilde{B}$ are determined by a second pseudo-inverse calculation, namely Equation (10): $\tilde{B} = Y\left(f(AX - H)\right)^+$. Thus, the weights of both the input and output layers of the ELM have been obtained analytically.
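In code, the two analytic steps of Equations (9) and (10) reduce to two pseudo-inverse computations. The sketch below is a minimal NumPy rendering under the paper's assumptions (a strictly monotonic activation with a known inverse; the tangent/arctangent pair follows the choice stated in Section 4.2). The function names and the one-hot target matrix Y are illustrative assumptions, not the paper's MATLAB code.

```python
import numpy as np

def dp_elm_fit(X, Y, n_hidden, rng=None):
    """Double pseudo-inverse weight determination, Equations (9) and (10).

    X: (J, N) inputs, Y: (K, N) one-hot targets, n_hidden: M hidden neurons.
    Returns input weights A, bias column h, and output weights B_tilde.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    K, N = Y.shape
    B = rng.uniform(-1.0, 1.0, size=(K, n_hidden))     # random output weights in [c1, c2]
    h = rng.uniform(-1.0, 1.0, size=(n_hidden, 1))     # random hidden biases in [c3, c4]
    H = np.tile(h, (1, N))                              # matrix-form offsets, (M, N)
    # Equation (9): A = (f^{-1}(B^+ Y) + H) X^+, with f = tan and f^{-1} = arctan.
    A = (np.arctan(np.linalg.pinv(B) @ Y) + H) @ np.linalg.pinv(X)
    # Equation (10): B~ = Y (f(A X - H))^+.
    B_tilde = Y @ np.linalg.pinv(np.tan(A @ X - H))
    return A, h, B_tilde

def dp_elm_predict(X, A, h, B_tilde):
    """Forward pass y = B~ f(Ax - h); the class is the argmax over output neurons."""
    return np.argmax(B_tilde @ np.tan(A @ X - h), axis=0)
```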

3.2. The Model of PCA-DP-ELM

After determining the weights of the input and output layers using the pseudo-inverse method, the resulting DP-ELM achieves higher classification accuracy compared to the traditional ELM in data classification tasks. However, if the input data contain an excessive number of features, the program’s running time during the pattern classification process will significantly exceed that of the traditional ELM. To enhance the model’s classification accuracy without increasing the running time, the previously mentioned PCA method is employed to further optimize the DP-ELM network model. PCA is utilized to eliminate correlations between input data and reduce data dimensionality, resulting in the final neural network model referred to as PCA-DP-ELM.
The steps of the PCA-DP-ELM modeling process are summarized as follows (a minimal end-to-end code sketch is given after this list):
STEP 1: Use the PCA method to reduce the dimensionality of the input data.
(1) Obtain the original high-dimensional input dataset.
(2) Calculate the eigenvalues and eigenvectors of the covariance matrix.
(3) Determine the cumulative contribution rate and estimate the number of principal components to be selected, ensuring that the cumulative contribution rate exceeds 85%.
(4) Based on the contribution rate, identify the reduced-dimensional data to be analyzed in the next step.
STEP 2: Build the improved ELM network model.
(1) Use the data after PCA-based dimensionality reduction as the input data for the ELM.
(2) Randomly generate the output-layer weights $B$ and hidden-layer biases $H$ of the ELM, and calculate the optimal input-layer weights $A$ through the pseudo-inverse calculation according to Equation (9), $A = \left(f^{-1}(B^+ Y) + H\right) X^+$.
(3) After obtaining the optimal input weights $A$, use Equation (10), $\tilde{B} = Y\left(f(AX - H)\right)^+$, to obtain the optimal output weights $\tilde{B}$.
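The following self-contained sketch strings STEP 1 and STEP 2 together on a generic labelled dataset. The function name, the integer class labels, the 85% threshold, and the tan/arctan pair are illustrative assumptions consistent with the descriptions above, not the authors' implementation.

```python
import numpy as np

def pca_dp_elm(X_train, y_train, X_test, n_hidden, threshold=0.85, rng=None):
    """End-to-end PCA-DP-ELM sketch: PCA reduction (STEP 1) + DP-ELM weights (STEP 2).

    X_train: (N_train, n_features), y_train: integer labels, X_test: (N_test, n_features).
    Returns predicted labels for X_test.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    # STEP 1: standardize, eigendecompose the correlation matrix, keep ~85% contribution.
    mu, sd = X_train.mean(axis=0), X_train.std(axis=0, ddof=1)
    Z = (X_train - mu) / sd
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    k = int(np.searchsorted(np.cumsum(eigvals / eigvals.sum()), threshold) + 1)
    P = eigvecs[:, :k]
    X = (Z @ P).T                                        # reduced inputs, (k, N_train)
    # One-hot target matrix Y, shape (K, N_train).
    classes = np.unique(y_train)
    Y = (np.asarray(y_train)[None, :] == classes[:, None]).astype(float)
    # STEP 2: DP-ELM, Equations (9) and (10) with f = tan, f^{-1} = arctan.
    B = rng.uniform(-1.0, 1.0, size=(len(classes), n_hidden))
    H = np.tile(rng.uniform(-1.0, 1.0, size=(n_hidden, 1)), (1, X.shape[1]))
    A = (np.arctan(np.linalg.pinv(B) @ Y) + H) @ np.linalg.pinv(X)
    B_tilde = Y @ np.linalg.pinv(np.tan(A @ X - H))
    # Prediction: apply the same standardization and projection to the test data.
    Xt = (((X_test - mu) / sd) @ P).T
    y_scores = B_tilde @ np.tan(A @ Xt - H[:, :1])
    return classes[np.argmax(y_scores, axis=0)]
```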

4. Comparative Experiments

This section delineates the numerical experiments conducted on pattern classification to assess the efficacy and superiority of the proposed PCA-DP-ELM algorithm. For the purpose of comparative analysis, the ELM, PCA-ELM, and DP-ELM algorithms are also incorporated into the experiments.
To ensure that the findings are both generalizable and robust, the classification accuracy of the PCA-DP-ELM is juxtaposed with that of other neural network models, including SOCPNN-W, MOCPNN-W, MLP-ELM, MLP-LM, as well as Regularized RBFNN and SVM [43,50,51,52,53].
The performance of the proposed algorithm is validated utilizing the MATLAB platform. All the numerical experiments are executed in MATLAB version 8.3.0, operating on a personal computer equipped with an Intel i7-8700 CPU at 3.20 GHz. A comprehensive experimental comparison is undertaken, concentrating on various metrics, such as the predictive accuracy, requisite number of hidden-layer neurons, overall algorithm performance, and program execution time.

4.1. Experimental Data

In this experiment, eight datasets were selected from the UCI database to test and evaluate the classification performance of the improved Extreme Learning Machine, including four binary classification datasets and four multi-classification datasets. The detailed description of each dataset is shown in Table 1.
In the execution of the pattern classification experiments, 70% of the dataset is randomly allocated as the training dataset, while the remaining 30% is designated as the testing dataset. Employing metrics such as range, variance, number of hidden-layer neurons, model execution time, and average classification accuracy, a comprehensive comparison of various neural network models is performed through both horizontal and vertical experimental methodologies.
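A helper of the following kind can be used to collect the mean accuracy, range, and variance reported below. The names are illustrative; the 70/30 split and 100 trials follow the protocol described above, and run_once is a hypothetical callable that trains one model and returns its test accuracy in percent.

```python
import numpy as np

def evaluate_over_trials(run_once, X, y, n_trials=100, train_frac=0.7, seed=0):
    """Repeat a random 70/30 train/test experiment and summarize accuracy statistics."""
    rng = np.random.default_rng(seed)
    accuracies = []
    for _ in range(n_trials):
        idx = rng.permutation(len(y))
        cut = int(train_frac * len(y))
        train_idx, test_idx = idx[:cut], idx[cut:]
        accuracies.append(run_once(X[train_idx], y[train_idx], X[test_idx], y[test_idx]))
    accuracies = np.asarray(accuracies)
    return {"mean": accuracies.mean(),
            "range": accuracies.max() - accuracies.min(),
            "variance": accuracies.var()}
```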

4.2. Longitudinal Experimental Comparison

In this subsection, we conduct a comparative analysis of the ELM, PCA-ELM, DP-ELM, and PCA-DP-ELM algorithms. This procedure, characterized by a progressive optimization from the ELM to the PCA-DP-ELM neural network model, is referred to as a longitudinal experimental comparison.
When configuring the neural network architecture, the number of neurons in the input layer corresponds to the number of features in the dataset, while the number of neurons in the output layer represents the distinct classes within the dataset. Thus, the configuration of the input and output-layer neurons is governed by the specific characteristics of the dataset.
Initially, the number of hidden-layer neurons is set to one, with the classification accuracy recorded as this number is incrementally increased. The optimal number of hidden-layer neurons is identified as the value that maximizes the classification accuracy. In this experiment, the tangent function is utilized as the activation function, with the arctangent function serving as its inverse.
The classification accuracy values reported in Table 2 reflect the mean accuracy of the test set, derived from the algorithm’s execution over 100 iterations utilizing the optimal network architecture. The findings presented in Table 2 indicate that the PCA-DP-ELM neural network model demonstrates superior efficacy in pattern classification compared to the other neural networks evaluated, across both the binary and multi-class classification datasets. Among the eight datasets employed in this experiment, the PCA-DP-ELM achieves the highest classification accuracy in five datasets, while its performance in the remaining three datasets ranks second.
Table 3 delineates the number of hidden-layer neurons requisite for the four algorithms to achieve the predetermined classification accuracy. The results reveal that the proposed algorithm requires fewer hidden-layer neurons to attain the highest classification accuracy, culminating in a more streamlined network architecture. This finding implies that the analytical approach adopted in this study is more efficacious than the random method employed for determining input weights.
The classification stability of various neural networks may fluctuate across distinct datasets. Table 4 details the range and variance of the neural network outputs when the number of hidden-layer nodes is configured to 10, 50, and 100, respectively.
The range and variance metrics indicate the extent of the variability in the algorithm’s output results. Reduced values signify enhanced algorithmic stability. As evidenced by the data in Table 4, among the 24 sets of range and variance values for the hidden-layer nodes configured at 10, 50, and 100, 16 sets corresponding to the PCA-DP-ELM neural network exhibit the smallest ranges or variances, with the minimum values emphasized in bold in Table 4. This suggests that the PCA-optimized DP-ELM neural network provides superior stability in pattern classification.
In this subsection, a thorough comparative analysis of the algorithm performance is conducted utilizing the SL dataset for a multi-class classification challenge. This investigation aims to further elucidate the impact of algorithm parameters on classification efficacy and stability. Figure 2 presents the findings of this comparative experiment on the SL dataset.
Figure 2 delineates the classification outcomes of various neural networks applied to the multi-class SL dataset. The analysis demonstrates that the PCA-DP-ELM neural network attains superior classification accuracy relative to the other ELM variants. As the number of hidden-layer neurons increases, the classification accuracy of the neural networks exhibits a tendency to stabilize.
The data illustrated in Figure 2 indicate that the prediction accuracy of both the traditional ELM and the proposed PCA-DP-ELM algorithm initially rises rapidly but subsequently either plateaus or declines as the number of hidden-layer neurons continues to escalate. This observation suggests that, while the enhanced ELM algorithm exhibits characteristics analogous to those of the traditional ELM—specifically, improved fitting performance with an increase in the hidden-layer neurons—further increments beyond an optimal threshold may precipitate overfitting. However, this study employs the growth method to ascertain the appropriate number of hidden-layer neurons, which has effectively contributed to mitigating overfitting.
The range and variance values for the PCA-DP-ELM neural network model regarding classification remain consistently low and stabilize, indicating that the neural network model, which employs PCA for dimensionality reduction followed by classification with the DP-ELM, demonstrates superior performance in terms of classification accuracy and stability.
Following the application of PCA to preprocess the input data in conjunction with the ELM, it is observed that, despite the incorporation of a dimensionality reduction step in the pattern classification process, the enhanced neural network model utilizing PCA demonstrates a reduction in experimental execution time, as evidenced by the runtime statistics. Table 5 presents the time required for the model to execute 100 iterations across various datasets.
Upon analysis, it is discerned that the introduction of DP to the original ELM model contributes to an increase in program runtime, likely attributable to the inclusion of two pseudo-inverse calculations. However, the integration of PCA with the DP-ELM framework reveals a notable decrease in the runtime of the PCA-DP-ELM model compared to that of the DP-ELM, indicating an enhancement in computational efficiency alongside an improvement in classification accuracy.

4.3. Transverse Experimental Comparison

In this subsection, we conduct a comparative analysis of the performance of the proposed PCA-DP-ELM algorithm against several established neural networks, including SOCPNN-W, MOCPNN-W, MLP-ELM, MLP-LM, RBFNN, and SVM, utilizing all the real-world classification datasets delineated in Table 1. This analysis, which evaluates the efficacy of optimized neural network models, is referred to as a horizontal experimental comparison. To mitigate the impact of random parameter initialization, we perform 100 trials for each algorithm, with the average results presented in Table 6.
Six datasets are employed for testing purposes. As indicated in Table 6, the PCA-DP-ELM consistently achieves the highest or second-highest classification accuracy across the majority of the datasets. In instances where the PCA-DP-ELM does not attain optimal performance, its accuracy remains in close proximity to the best results. To comprehensively assess the generalization performance of these seven neural networks, we employ an average ranking method [54], wherein lower rankings correspond to superior performance. According to Table 6, the PCA-DP-ELM achieves an average ranking of 1.33 (the lowest value), signifying that it exhibits the most favorable overall performance in pattern classification among the seven neural networks evaluated. Consequently, these results substantiate the efficacy and superiority of the PCA-DP-ELM in the domain of pattern classification. In summary, the experimental findings validate both the enhanced generalization performance and the effectiveness of the simplified architecture of the PCA-DP-ELM.
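The average ranking used here can be computed as in the following sketch. The function name is illustrative; rank 1 corresponds to the highest accuracy on a dataset, and tied accuracies share a rank, as in the tables above.

```python
import numpy as np

def average_ranks(accuracy_table):
    """accuracy_table: (n_datasets, n_models) array of accuracies in percent.

    Per dataset, the most accurate model receives rank 1 (ties share a rank);
    the function returns each model's rank averaged over all datasets.
    """
    acc = np.asarray(accuracy_table, dtype=float)
    ranks = np.empty_like(acc)
    for i, row in enumerate(acc):
        # Rank = 1 + number of models with strictly higher accuracy on this dataset.
        ranks[i] = 1 + (row[None, :] < row[:, None]).sum(axis=0)
    return ranks.mean(axis=0)
```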

5. Conclusions

In this study, we present and scrutinize the enhanced PCA-DP-ELM algorithm, underscoring its effective implementation in pattern classification. During the parameter determination process of the DP-ELM, the weights of the input and output layers are computed sequentially using the pseudo-inverse method. This approach not only refines the theoretical foundations underlying the mathematical derivation but also enhances the practical performance. Furthermore, we integrate the PCA algorithm with the DP-ELM to further diminish the dimensionality of the input data.
The outcomes from the pattern classification applications illustrate that the PCA-DP-ELM effectively classifies data. Extensive experimental results on real-world classification datasets reveal that the PCA-DP-ELM offers superior generalization performance while maintaining a streamlined structure. The PCA-DP-ELM achieves higher accuracy in pattern classification; in the longitudinal experiments, the average classification rank of the network model proposed in this paper is 1.5, which is 0.625 lower (better) than the rank of 2.125 obtained by the second-ranked DP-ELM, optimized solely with double pseudo-inverse weight determination. In the horizontal experiments, the PCA-DP-ELM also attains the highest average classification accuracy among the seven network models, with an average classification rank of 1.33. Compared to other neural network models, the PCA-DP-ELM consistently demonstrates the highest average classification accuracy in both longitudinal and cross-sectional experiments while requiring fewer hidden-layer neurons.
The methodology proposed in this article exhibits commendable performance on high-dimensional datasets; however, it has not yet been extensively evaluated or applied to more complex datasets. Additionally, while the Double Pseudo-Inverse Weight Determination method enhances the classification accuracy, it also increases the computation time on large-scale datasets, resulting in diminished computational efficiency. This challenge represents not only an issue to be addressed but also a focal point for future research.
Moreover, this refined ELM demonstrates stable and efficient performance across various real-world datasets. By preprocessing data through PCA to eliminate noise and redundant information, and subsequently employing the ELM for classification, this methodology promises favorable outcomes in fault recognition within complex industrial processes, maintaining rapid training speed and significantly enhancing the classification accuracy. In smart home control systems, by learning residents’ behavioral patterns and preferences, home devices can be automatically regulated to enhance quality of life while minimizing energy waste. In summary, the application of the PCA-DP-ELM in pattern recognition and intelligent control is both diverse and effective. It provides robust technical support for various application scenarios by improving the speed and accuracy of data processing, positioning it as a pivotal tool for future applications in intelligent optimization, intelligent control, pattern recognition, and related domains.

Author Contributions

Conceptualization, Y.Y. and B.L.; methodology, Y.Y. and S.L.; software, Y.Y. and S.L.; validation, Y.Y. and J.Z.; formal analysis, Y.Y.; investigation, Y.Y.; resources, B.L.; data curation, Y.Y.; writing—original draft preparation, Y.Y. and J.Z.; writing—review and editing, Y.Y. and J.Z.; visualization, Y.Y. and J.Z.; supervision, B.L.; project administration, B.L.; funding acquisition, Y.Y., J.Z. and B.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 62066015, the Research Foundation of Education Bureau of Hunan Province under Grant 22C0277, and the Students Innovation and Entrepreneurship Training Program of Hunan Province under Grant S202410531098S.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available upon request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhang, H.; Liu, J.; Ma, D.; Wang, Z. Data-Core-Based Fuzzy Min–Max Neural Network for Pattern Classification. IEEE Trans. Neural Netw. 2011, 22, 2339–2352. [Google Scholar] [CrossRef] [PubMed]
  2. Ou, G.; Murphey, Y.L. Multi-class pattern classification using neural networks. Pattern Recognit. 2007, 40, 4–18. [Google Scholar] [CrossRef]
  3. Xiao, L.; Liao, B.; Li, S.; Chen, K. Nonlinear recurrent neural networks for finite-time solution of general time-varying linear matrix equations. Neural Netw. 2018, 98, 102–113. [Google Scholar] [CrossRef]
  4. Zhang, Z.; Zheng, L.; Weng, J.; Mao, Y.; Lu, W.; Xiao, L. A new varying-parameter recurrent neural-network for online solution of time-varying Sylvester equation. IEEE Trans. Cybern. 2018, 48, 3135–3148. [Google Scholar] [CrossRef]
  5. Long, C.; Zhang, G.; Zeng, Z.; Hu, J. Finite-time stabilization of complex-valued neural networks with proportional delays and inertial terms: A non-separation approach. Neural Netw. 2022, 148, 86–95. [Google Scholar] [CrossRef]
  6. Li, W.; Xiao, L.; Liao, B. A Finite-Time Convergent and Noise-Rejection Recurrent Neural Network and Its Discretization for Dynamic Nonlinear Equations Solving. IEEE Trans. Cybern. 2020, 50, 3195–3207. [Google Scholar] [CrossRef]
  7. Zhou, J.; Ning, J.; Xiang, Z.; Yin, P. ICDW-YOLO: An Efficient Timber Construction Crack Detection Algorithm. Sensors 2024, 24, 4333. [Google Scholar] [CrossRef]
  8. Wu, Z.; Guo, K.; Wang, L.; Hu, M.; Ren, S. A Collaborative Learning-based Urban Low-light Small-target Face Image Enhancement Method. ACM Trans. Sen. Netw. 2023. Just Accepted. [Google Scholar] [CrossRef]
  9. Zhu, X.; Guo, K.; Ren, S.; Hu, B.; Hu, M.; Fang, H. Lightweight Image Super-Resolution with Expectation-Maximization Attention Mechanism. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 1273–1284. [Google Scholar] [CrossRef]
  10. Liao, B.; Zhang, Y. Different complex ZFs leading to different complex ZNN models for time-varying complex generalized inverse matrices. IEEE Trans. Neural Netw. Learn. Syst. 2013, 25, 1621–1631. [Google Scholar] [CrossRef]
  11. Liao, B.; Zhang, Y. From different ZFs to different ZNN models accelerated via Li activation functions to finite-time convergence for time-varying matrix pseudoinversion. Neurocomputing 2014, 133, 512–522. [Google Scholar] [CrossRef]
  12. Xiao, L.; He, Y.; Dai, J.; Liu, X.; Liao, B.; Tan, H. A Variable-Parameter Noise-Tolerant Zeroing Neural Network for Time-Variant Matrix Inversion with Guaranteed Robustness. IEEE Trans. Neural Netw. Learn. Syst. 2022, 33, 1535–1545. [Google Scholar] [CrossRef]
  13. Xiao, L.; Tan, H.; Jia, L.; Dai, J.; Zhang, Y. New error function designs for finite-time ZNN models with application to dynamic matrix inversion. Neurocomputing 2020, 402, 395–408. [Google Scholar] [CrossRef]
  14. Liao, B.; Han, L.; Cao, X.; Li, S.; Li, J. Double integral-enhanced Zeroing neural network with linear noise rejection for time-varying matrix inverse. CAAI Trans. Intell. Technol. 2023, 9, 197–210. [Google Scholar] [CrossRef]
  15. Sun, L.; Mo, Z.; Yan, F.; Xia, L.; Shan, F.; Ding, Z.; Song, B.; Gao, W.; Shao, W.; Shi, F.; et al. Adaptive Feature Selection Guided Deep Forest for COVID-19 Classification with Chest CT. IEEE J. Biomed. Health Inform. 2020, 24, 2798–2805. [Google Scholar] [CrossRef]
  16. Ren, S.; Guo, K.; Zhou, X.; Hu, B.; Zhu, F.; Luo, E. Medical Image Super-Resolution Based on Semantic Perception Transfer Learning. IEEE/ACM Trans. Comput. Biol. Bioinform. 2023, 20, 2598–2609. [Google Scholar] [CrossRef]
  17. Liu, Z.; Wu, X. Structural analysis of the evolution mechanism of online public opinion and its development stages based on machine learning and social network analysis. Int. J. Comput. Intell. Syst. 2023, 16, 99. [Google Scholar] [CrossRef]
  18. Wang, S.; Zhao, X.; Chen, Y.; Li, Z.; Zhang, K.; Xia, J. Negative influence minimizing by blocking nodes in social networks. In Proceedings of the 17th AAAI Conference on Late-Breaking Developments in the Field of Artificial Intelligence, Bellevue, WA, USA, 14–18 July 2013; AAAIWS’13-17. AAAI Press: Washington, DC, USA, 2013; pp. 134–136. [Google Scholar]
  19. Jin, L.; Zhang, Y.; Li, S.; Zhang, Y. Modified ZNN for Time-Varying Quadratic Programming with Inherent Tolerance to Noises and Its Application to Kinematic Redundancy Resolution of Robot Manipulators. IEEE Trans. Ind. Electron. 2016, 63, 6978–6988. [Google Scholar] [CrossRef]
  20. Zhang, Y.; Li, S.; Kadry, S.; Liao, B. Recurrent neural network for kinematic control of redundant manipulators with periodic input disturbance and physical constraints. IEEE Trans. Cybern. 2018, 49, 4194–4205. [Google Scholar] [CrossRef]
  21. Chen, J.; Teo, T.H.; Kok, C.L.; Koh, Y.Y. A Novel Single-Word Speech Recognition on Embedded Systems Using a Convolution Neuron Network with Improved Out-of-Distribution Detection. Electronics 2024, 13, 530. [Google Scholar] [CrossRef]
  22. Chen, B.; Lu, S.; Zhong, P.; Cui, Y.; Liang, Y.; Wang, J. SemNav-HRO: A target-driven semantic navigation strategy with human–robot–object ternary fusion. Eng. Appl. Artif. Intell. 2024, 127, 107370. [Google Scholar] [CrossRef]
  23. Liu, M.; Li, Y.; Chen, Y.; Qi, Y.; Jin, L. A Distributed Competitive and Collaborative Coordination for Multirobot Systems. IEEE Trans. Mob. Comput. 2024, 1–13. [Google Scholar] [CrossRef]
  24. Liao, B.; Hua, C.; Xu, Q.; Cao, X.; Li, S. Inter-robot management via neighboring robot sensing and measurement using a zeroing neural dynamics approach. Expert Syst. Appl. 2024, 244, 122938. [Google Scholar] [CrossRef]
  25. Ye, Y.; Wang, P.; Wu, N.; Ge, F.; Zhou, F. Optimisation for accuracy improving of MSB signal detection. Electron. Lett. 2017, 53, 1578–1580. [Google Scholar] [CrossRef]
  26. Huang, G.B.; Zhu, Q.Y.; Siew, C.K. Extreme learning machine: Theory and applications. Neurocomputing 2006, 70, 489–501. [Google Scholar] [CrossRef]
  27. Wilson, D.; Martinez, T. The need for small learning rates on large problems. In Proceedings of the IJCNN’01. International Joint Conference on Neural Networks. Proceedings (Cat. No.01CH37222), Washington, DC, USA, 15–19 July 2001; Volume 1, pp. 115–119. [Google Scholar]
  28. Yu, X.H.; Chen, G.A. On the local minima free condition of backpropagation learning. IEEE Trans. Neural Netw. 1995, 6, 1300–1303. [Google Scholar] [CrossRef]
  29. Ye, S.; Zhou, K.; Zain, A.M.; Wang, F.; Yusoff, Y. A modified harmony search algorithm and its applications in weighted fuzzy production rule extraction. Front. Inf. Technol. Electron. Eng. 2023, 24, 1574–1590. [Google Scholar] [CrossRef]
  30. Qin, F.; Zain, A.M.; Zhou, K.Q. Harmony search algorithm and related variants: A systematic review. Swarm Evol. Comput. 2022, 74, 101126. [Google Scholar] [CrossRef]
  31. Ye, S.Q.; Zhou, K.Q.; Zhang, C.X.; Mohd Zain, A.; Ou, Y. An improved multi-objective cuckoo search approach by exploring the balance between development and exploration. Electronics 2022, 11, 704. [Google Scholar] [CrossRef]
  32. Zhang, C.X.; Zhou, K.Q.; Ye, S.Q.; Zain, A.M. An Improved Cuckoo Search Algorithm Utilizing Nonlinear Inertia Weight and Differential Evolution for Function Optimization Problem. IEEE Access 2021, 9, 161352–161373. [Google Scholar] [CrossRef]
  33. Zhang, X.Y.; Zhou, K.Q.; Li, P.C.; Xiang, Y.H.; Zain, A.M.; Sarkheyli-Hägele, A. An improved chaos sparrow search optimization algorithm using adaptive weight modification and hybrid strategies. IEEE Access 2022, 10, 96159–96179. [Google Scholar] [CrossRef]
  34. Feng, G.; Huang, G.B.; Lin, Q.; Gay, R. Error Minimized Extreme Learning Machine with Growth of Hidden Nodes and Incremental Learning. IEEE Trans. Neural Netw. 2009, 20, 1352–1357. [Google Scholar] [CrossRef] [PubMed]
  35. Lu, R.; Luo, L.; Liao, B. Voting based double-weighted deterministic extreme learning machine model and its application. Front. Neurorobot. 2023, 17, 1322645. [Google Scholar] [CrossRef] [PubMed]
  36. Saraswathi, S.; Sundaram, S.; Sundararajan, N.; Zimmermann, M.; Nilsen-Hamilton, M. ICGA-PSO-ELM Approach for Accurate Multiclass Cancer Classification Resulting in Reduced Gene Sets in Which Genes Encoding Secreted Proteins Are Highly Represented. IEEE/ACM Trans. Comput. Biol. Bioinform. 2011, 8, 452–463. [Google Scholar] [CrossRef]
  37. Yang, H.; Yi, J.; Zhao, J.; Dong, Z. Extreme learning machine based genetic algorithm and its application in power system economic dispatch. Neurocomputing 2013, 102, 154–162. [Google Scholar] [CrossRef]
  38. Jin, B.; Jing, Z.; Zhao, H. Incremental and Decremental Extreme Learning Machine Based on Generalized Inverse. IEEE Access 2017, 5, 20852–20865. [Google Scholar] [CrossRef]
  39. Huang, F.; Lu, J.; Tao, J.; Li, L.; Tan, X.; Liu, P. Research on Optimization Methods of ELM Classification Algorithm for Hyperspectral Remote Sensing Images. IEEE Access 2019, 7, 108070–108089. [Google Scholar] [CrossRef]
  40. Guang-Yong, G.; Guo-Ping, J. Prediction of multivariable chaotic time series using optimized extreme learning machine. Acta Phys. Sin. 2012, 61. [Google Scholar] [CrossRef]
  41. Greenacre, M.; Groenen, P.J.; Hastie, T.; d’Enza, A.I.; Markos, A.; Tuzhilina, E. Principal component analysis. Nat. Rev. Methods Prim. 2022, 2, 100. [Google Scholar] [CrossRef]
  42. Pearson, K. LIII. On lines and planes of closest fit to systems of points in space. Lond. Edinb. Dublin Philos. Mag. J. Sci. 1901, 2, 559–572. [Google Scholar] [CrossRef]
  43. Zhang, Y.; Yin, Y.; Guo, D.; Yu, X.; Xiao, L. Cross-validation based weights and structure determination of Chebyshev-polynomial neural networks for pattern classification. Pattern Recognit. 2014, 47, 3414–3428. [Google Scholar] [CrossRef]
  44. Huang, Z.; Yu, Y.; Gu, J.; Liu, H. An Efficient Method for Traffic Sign Recognition Based on Extreme Learning Machine. IEEE Trans. Cybern. 2017, 47, 920–933. [Google Scholar] [CrossRef] [PubMed]
  45. Lyu, S.; Cheung, R.C.C. Efficient Multiple Channels EEG Signal Classification Based on Hierarchical Extreme Learning Machine. Sensors 2023, 23, 8976. [Google Scholar] [CrossRef] [PubMed]
  46. Lyu, S.; Cheung, R.C.C. Efficient and Automatic Breast Cancer Early Diagnosis System Based on the Hierarchical Extreme Learning Machine. Sensors 2023, 23, 7772. [Google Scholar] [CrossRef]
  47. Zhang, S.; Liu, Z.; Huang, X.; Xiao, W. A Modified Residual Extreme Learning Machine Algorithm and Its Application. IEEE Access 2018, 6, 62215–62223. [Google Scholar] [CrossRef]
  48. Chen, L.; Huang, Z.; Li, Y.; Zeng, N.; Liu, M.; Peng, A.; Jin, L. Weight and Structure Determination Neural Network Aided with Double Pseudoinversion for Diagnosis of Flat Foot. IEEE Access 2019, 7, 33001–33008. [Google Scholar] [CrossRef]
  49. Part I: The General Theory and Computational Methods. In Regression and the Moore-Penrose Pseudoinverse; Albert, A. (Ed.) Mathematics in Science and Engineering; Elsevier: Amsterdam, The Netherlands, 1972; Volume 94, p. 1. [Google Scholar] [CrossRef]
  50. Phurattanaprapin, K.; Horata, P. Extended hierarchical extreme learning machine with multilayer perceptron. In Proceedings of the 2016 13th International Joint Conference on Computer Science and Software Engineering (JCSSE), Khon Kaen, Thailand, 13–15 July 2016; pp. 1–5. [Google Scholar]
  51. Wang, D.; Lu, W.Z. Forecasting of ozone level in time series using MLP model with a novel hybrid training algorithm. Atmos. Environ. 2006, 40, 913–924. [Google Scholar] [CrossRef]
  52. Lowe, D.; Broomhead, D. Multivariable functional interpolation and adaptive networks. Complex Syst. 1988, 2, 321–355. [Google Scholar]
  53. Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297. [Google Scholar] [CrossRef]
  54. Brazdil, P.; Soares, C. A Comparison of Ranking Methods for Classification Algorithm Selection. In Proceedings of the 11th European Conference on Machine Learning, Barcelona, Spain, 31 May–2 June 2000; Springer: Berlin/Heidelberg, Germany, 2000. ECML ’00. pp. 63–74. [Google Scholar]
Figure 1. Structure of DP-ELM.
Figure 2. Comparison experiment on the SL dataset.
Table 1. Features of different real-world classification datasets.
Dataset | No. of Attributes | No. of Classes | No. of Instances
(Pima Indians Diabetes) PID | 8 | 2 | 768
(Planning Relax) PR | 12 | 2 | 182
Parkinsons | 22 | 2 | 195
Ionosphere | 34 | 2 | 351
Iris | 4 | 3 | 150
Glass | 9 | 7 | 214
Zoo | 16 | 7 | 100
(Soybean Large) SL | 35 | 19 | 186
Table 2. Comparison of classification accuracy. Values are average classification accuracy (%) / rank of testing classification accuracy.
Dataset | ELM | PCA-ELM | DP-ELM | PCA-DP-ELM
PID | 67.3/4 | 72.3/3 | 76.8/1 | 75.1/2
PR | 74.54/1 | 74.54/1 | 74.54/1 | 74.54/1
Parkinsons | 77.67/4 | 79.98/3 | 85.93/1 | 84.48/2
Ionosphere | 89.78/3 | 89.61/4 | 90.01/2 | 91.63/1
Iris | 97.11/4 | 98.82/2 | 97.66/3 | 99.89/1
Glass | 69.54/1 | 66.45/3 | 66.03/4 | 68.14/2
Zoo | 99.94/1 | 98.80/4 | 99.85/3 | 99.93/2
SL | 81.7/3 | 80.8/4 | 83.5/2 | 85.2/1
Avg. Rank | 2.625 | 2.875 | 2.125 | 1.5
Table 3. Comparison of number of hidden-layer neurons.
DatasetELMPCA-ELMDP-ELMPCA-DP-ELM
PID964533
PR1111
Parkinsons362032
Ionosphere152066
Iris218169
Glass90501952
Zoo9797912
SL68476254
Table 4. Comparison of range values and variance values.
DatasetRange Value/Variance Value
No. of Hidden NeuronsELMPCA-ELMDP-ELMPCA-DP-ELM
PID101.56/6.496.51/10.820.09/1.300.13/0.87
504.01/9.962.69/8.230.19/2.170.02/0.43
1004.56/9.523.66/10.390.01/0.430/0
PR105.68/34.385.24/20.312.04/12.500.80/18.75
5015.15/23.4412.89/17.1911.06/15.659.73/18.75
10013.69/15.6327.89/23.4612.22/17.1923.82/23.44
Parkinsons107.99/18.9711.35/15.520.90/5.170.060/1.72
5021.09/22.4127.89/25.860/00/0
10024.32/22.4141.27/32.760/00/0
Ionosphere101.21/6.250.69/4.550.34/2.270.11/1.14
503.79/9.093.57/10.230.01/1.710.20/1.14
1006.192/11.364.729/12.50.255/2.2730/0
Iris105.39/13.332.06/4.441.41/4.441.31/4.44
5010.98/15.569.78/17.784.50/8.890.56/4.44
10058.87/33.3350.13/33.337.0/13.330.25/4.44
Glass1076.42/42.1932.69/23.4413.51/18.7510.67/12.50
5021.37/21.8816.79/21.884.31/10.9416.32/21.88
10011.94/18.7544.31/34.3818.57/20.3116.36/21.88
Zoo1031.12/29.0315.53/16.130.92/6.450.21/3.23
5033.92/25.81215.30/74.197.65/16.13181.10/80.65
1000.61/6.4517.05/35.487.20/9.6861.58/25.81
SL1019.77/20.9223.76/22.887.49/13.0712.27/21.57
504.52/9.155.14/12.423.27/8.501.77/7.19
1007.34/13.078.10/13.734.44/11.113.22/8.50
Table 5. Running time.
Dataset | ELM | PCA-ELM | DP-ELM | PCA-DP-ELM
PID | 0.086387 | 0.092243 | 0.12912 | 0.12025
PR | 0.040045 | 0.039521 | 0.054082 | 0.048847
Parkinsons | 0.053179 | 0.053207 | 0.08619 | 0.062625
Ionosphere | 0.059135 | 0.058959 | 0.11851 | 0.10583
Iris | 0.037682 | 0.037408 | 0.046905 | 0.045851
Glass | 0.049932 | 0.04977 | 0.068844 | 0.064767
Zoo | 0.036489 | 0.036112 | 0.053208 | 0.050403
SL | 0.059824 | 0.059587 | 0.12384 | 0.094165
Table 6. Comparison of testing classification accuracy. Values are average classification accuracy (%) / rank of testing classification accuracy.
Dataset | PCA-DP-ELM | SOCPNN-W | MOCPNN-W | MLP-ELM | MLP-LM | RBFNN | SVM
PID | 75.10/3 | 76.29/1 | 76.29/1 | 64.56/6 | 69.02/4 | 60.66/7 | 65.10/5
Ionosphere | 91.63/1 | 86.30/3 | 86.30/3 | 82.90/7 | 85.24/6 | 86.19/5 | 91.41/2
Iris | 99.89/1 | 97.08/2 | 95.35/4 | 94.38/6 | 95.33/5 | 94.16/7 | 96.56/3
Glass | 68.14/1 | 47.21/6 | 62.09/4 | 45.14/7 | 62.89/2 | 57.87/5 | 62.76/3
Zoo | 99.93/1 | 93.37/3 | 92.48/4 | 86.31/7 | 89.66/5 | 95.00/2 | 87.54/6
SL | 85.20/1 | 29.97/7 | 78.80/3 | 63.86/6 | 75.12/4 | 84.47/2 | 74.91/5
Avg. Rank | 1.33 | 3.67 | 3.17 | 6.50 | 4.33 | 4.67 | 4.00
