Article

An Enhanced FCM Clustering Method Based on Multi-Strategy Tuna Swarm Optimization

Changkang Sun, Qinglong Shao, Ziqi Zhou and Junxiao Zhang
1 QiLu Aerospace Information Research Institute, Jinan 250101, China
2 Department of Electrical Information, Shandong University of Science and Technology, Jinan 250031, China
3 Faculty of Geosciences and Environmental Engineering, Southwest Jiaotong University, Chengdu 611756, China
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(3), 453; https://doi.org/10.3390/math12030453
Submission received: 4 January 2024 / Revised: 21 January 2024 / Accepted: 22 January 2024 / Published: 31 January 2024

Abstract
To overcome the shortcoming of the Fuzzy C-means (FCM) algorithm—that its dependence on initialization, inherited from sub-spatial clustering, makes it easy to fall into local optima—a Multi-Strategy Tuna Swarm Optimization-Fuzzy C-means (MSTSO-FCM) algorithm is proposed. Firstly, a chaotic local search strategy and an offset distribution estimation strategy are proposed to improve the performance of the Tuna Swarm Optimization (TSO) algorithm, enhance its population diversity, and avoid falling into local optima. Secondly, the search and development characteristics of the MSTSO algorithm are introduced into the fuzzy matrix of FCM, which overcomes the defects of poor global search ability and sensitive initialization. Not only is the search ability of MSTSO employed, but the fuzzy mathematical ideas of FCM are retained, improving the clustering accuracy and stability of the FCM algorithm. Finally, two artificial datasets and multiple University of California Irvine (UCI) datasets are used for testing, and four indicators are introduced for evaluation. The results show that the MSTSO-FCM algorithm converges faster than the Tuna Swarm Optimization Fuzzy C-means (TSO-FCM) algorithm, and its accuracies on the heart, liver, and iris datasets are 89.46%, 63.58%, and 98.67%, respectively, an outstanding improvement.

1. Introduction

As technology improves by leaps and bounds, the explosion of data has had a definite impact on all walks of life, and countries around the world have gradually paid attention to analyzing data and the knowledge inside it. Data mining, a commonly used data analysis method, transforms raw data into useful data or information through specific identification methods [1]. Existing supervised learning depends strongly on data labels: if the labels are accurate, a supervised algorithm can learn well, but when the labels are wrong, it struggles, and the knowledge it mines is biased. Labeling massive data and mining its internal connections are the unique functions of clustering algorithms [2].
Current clustering algorithms fall into five categories: density-based, grid-based, hierarchy-based, model-based, and division-based methods [3]. The Fuzzy C-means clustering method is a division-based method, widely used in the segmentation and classification of brain tumors [4], the evaluation of power grid reliability [5], and the recognition of whitewashing behavior in college accounting statements [6]. However, because the FCM algorithm is rooted in hard subspace clustering, its computational complexity is high. In addition, it is sensitive to noise and handles high-dimensional data poorly. It also does not overcome the dependence of sub-spatial clustering on initialization, so it easily falls into local optima. To overcome these shortcomings, an improved particle swarm optimization algorithm modeled on the foraging behavior of birds was applied to clustering and achieved certain improvements [7]. A hybrid of Fuzzy C-means clustering and gray wolf optimization (GWO) was proposed for image segmentation to overcome the shortcomings of Fuzzy C-means clustering [8]. A collaborative multi-population differential evolution method with elite strategies was proposed to identify near-optimal initial clustering prototypes, determine the optimal number of clusters, and optimize the initial structure of FCM [9]. A hybrid of simulated annealing and ant colony optimization was put forward to improve the clustering method and enhance the accuracy of data clustering [10]. The particle swarm algorithm was improved with the Lévy flight strategy and applied to clustering, which improved its initialization, greatly reduced computational complexity, and improved clustering accuracy [11]. Particle swarm optimization was combined with FCM, and the FCM fitness was optimized at each iteration [12]. Three improved particle swarm optimization algorithms were proposed and applied selectively according to data characteristics, improving clustering performance and weakening data sensitivity [13]. An improved artificial bee colony optimization algorithm was proposed for the clustering problem and significantly improved the efficiency of data processing [14].
The above research shows that heuristic optimization algorithms play a positive role in clustering and, through their different improvement methods, overcome a considerable part of the defects of clustering algorithms themselves. Tuna Swarm Optimization (TSO) is a novel algorithm with strong search ability and excellent performance on various problems. An improved TSO was used to segment images of forest canopies [15]. An improved nonlinear tuna swarm optimization algorithm based on a circle chaos map and a Lévy flight operator (CLTSO) was proposed to optimize a BP neural network [16]. A hybrid model based on long short-term memory (LSTM) and the TSO algorithm was established to predict wind speed [17]. TSO was blended with improved adaptive differential evolution with an optional external archive to form a new heuristic algorithm, which performed well in photovoltaic parameter identification [18]. An enhanced tuna swarm optimization was proposed for performance optimization of FACTS devices [19]. A chaotic tuna swarm optimizer was proposed to find the optimal parameters of the three-diode model, hybridized with the Newton–Raphson method to improve convergence [20]. An improved particle swarm optimization algorithm can adaptively adjust the parameters of a fuzzy decision tree, effectively improving accuracy in lithology recognition [21]. A grid-based multi-objective algorithm was proposed for privacy-preserving data mining and achieved excellent performance [22]. A multi-objective neural evolutionary algorithm based on decomposition and dominance (MONEADD) was proposed to evolve neural networks [23]. Ref. [24] proposed novel hybrid optimized clustering schemes with a genetic algorithm and PSO for segmentation and classification. An Improved Fruit Fly Optimization (IFFO) algorithm was proposed to minimize the makespan and cost of scheduling multiple workflows in cloud computing environments [25]. An advanced encryption standard (AES) algorithm was used in workflow scheduling [26]. Ref. [27] proposed an enhanced hybrid glowworm swarm optimization algorithm for traffic-aware vehicular networks, significantly reducing technical delays.
None of the above research overcomes FCM's dependence on initialization, so it still falls easily into local optima. To overcome these defects, an FCM based on multi-strategy tuna swarm optimization (MSTSO-FCM) is proposed. A chaotic local search strategy is introduced to improve the exploitation ability of the algorithm, and the dominant population is fully utilized by an offset distribution estimation strategy to enhance performance. Another contribution is that the search and development characteristics of the MSTSO algorithm are introduced into the fuzzy matrix of FCM, which overcomes the defects of poor global search ability and sensitive initialization. This not only uses the search ability of MSTSO but also retains the fuzzy mathematical ideas of FCM, improving the clustering stability and accuracy of the FCM algorithm.
The rest of this paper is organized as follows. Section 2 introduces the existing related algorithms: the tuna swarm optimization algorithm and the Fuzzy C-means clustering algorithm. Section 3 proposes the Multi-Strategy Tuna Swarm Optimization algorithm and, based on it, MSTSO-FCM. Section 4 presents the simulation experiments and comparative analysis. Finally, Section 5 concludes the paper.

2. Existing Related Algorithms

2.1. Tuna Swarm Optimization

As a top marine predator, the tuna is a social animal that chooses its predation strategy according to the prey it hunts. The first strategy is spiral foraging: when feeding, tuna swim in a spiral to drive their prey into shallow water, then attack and catch it. The second strategy is parabolic foraging, in which each tuna follows the previous individual so that the school forms a parabola surrounding the prey. Tuna forage successfully through these two methods. TSO is modeled on these natural foraging behaviors and follows the rules below.
(1) Population initialization: TSO starts the optimization process by randomly and uniformly generating the initial swarm:

$$X_i^{ini} = rand \cdot (ub - lb) + lb, \quad i = 1, 2, \ldots, NP \tag{1}$$

where $X_i^{ini}$ is the initial position of the $i$-th individual, $ub$ and $lb$ are the upper and lower boundaries of the search space, and $rand$ is a random number uniformly distributed in [0, 1].
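For concreteness, this initialization can be sketched in a few lines of NumPy; the function name and the (NP, dim) array layout are our own illustrative choices, not notation from the paper:

```python
import numpy as np

def initialize_population(NP, dim, lb, ub, rng=np.random.default_rng()):
    """Eq. (1): scatter NP individuals uniformly inside [lb, ub]."""
    # rand is drawn independently for every coordinate of every individual.
    return rng.random((NP, dim)) * (ub - lb) + lb
```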
(2-1) Spiral foraging. Schools of tuna chase their prey by forming a tight spiral, and in addition to chasing their prey, they also exchange information with each other. Each tuna follows the previous one, so information can be shared between neighboring tuna. Based on the above principles, the mathematical formulas of the spiral foraging strategy are as follows:
$$X_i^{t+1} = \begin{cases} \alpha_1 \left( X_{best}^t + \beta \left| X_{best}^t - X_i^t \right| \right) + \alpha_2 X_i^t, & i = 1 \\ \alpha_1 \left( X_{best}^t + \beta \left| X_{best}^t - X_i^t \right| \right) + \alpha_2 X_{i-1}^t, & i = 2, 3, \ldots, NP \end{cases} \tag{2}$$

$$\alpha_1 = a + (1 - a) \frac{t}{t_{max}} \tag{3}$$

$$\alpha_2 = (1 - a) - (1 - a) \frac{t}{t_{max}} \tag{4}$$

$$\beta = e^{bl} \cos(2 \pi b) \tag{5}$$

$$l = e^{3 \cos \left( \left( \left( t_{max} + 1/t \right) - 1 \right) \pi \right)} \tag{6}$$
where $X_i^{t+1}$ is the position of the $i$-th individual in generation $t+1$, $X_{best}^t$ is the current optimal individual, $\alpha_1$ and $\alpha_2$ are coefficients controlling how strongly an individual follows the optimal individual and the previous individual in the chain, $a$ is a constant set to 0.6, $t$ and $t_{max}$ are the current and maximum numbers of iterations, and $b$ is a random number uniformly distributed in [0, 1]. When the optimal individual cannot find food, blindly following it is not conducive to group foraging. Therefore, a random coordinate in the search space is generated to serve as the reference point for the spiral search, which lets each individual explore a wider space and gives TSO global exploration ability. The specific mathematical model is described as follows:
$$X_i^{t+1} = \begin{cases} \alpha_1 \left( X_{rand}^t + \beta \left| X_{rand}^t - X_i^t \right| \right) + \alpha_2 X_i^t, & i = 1 \\ \alpha_1 \left( X_{rand}^t + \beta \left| X_{rand}^t - X_i^t \right| \right) + \alpha_2 X_{i-1}^t, & i = 2, 3, \ldots, NP \end{cases} \tag{7}$$
In particular, meta-heuristics typically perform extensive global exploration in the early stage, followed by a gradual transition to precise local development. Therefore, as the number of iterations increases, TSO changes the reference point for spiral foraging from random individuals to optimal individuals. In summary, the final mathematical model of the spiral foraging strategy is as follows:
$$X_i^{t+1} = \begin{cases} \alpha_1 \left( X_{rand}^t + \beta \left| X_{rand}^t - X_i^t \right| \right) + \alpha_2 X_i^t, & i = 1, \ \text{if } rand \geq \frac{t}{t_{max}} \\ \alpha_1 \left( X_{rand}^t + \beta \left| X_{rand}^t - X_i^t \right| \right) + \alpha_2 X_{i-1}^t, & i = 2, 3, \ldots, NP, \ \text{if } rand \geq \frac{t}{t_{max}} \\ \alpha_1 \left( X_{best}^t + \beta \left| X_{best}^t - X_i^t \right| \right) + \alpha_2 X_i^t, & i = 1, \ \text{if } rand < \frac{t}{t_{max}} \\ \alpha_1 \left( X_{best}^t + \beta \left| X_{best}^t - X_i^t \right| \right) + \alpha_2 X_{i-1}^t, & i = 2, 3, \ldots, NP, \ \text{if } rand < \frac{t}{t_{max}} \end{cases} \tag{8}$$
(2-2) Parabolic foraging. In addition to feeding through a spiral formation, tuna can also form a parabolic cooperative feeding formation. One method is that tuna feed in a parabolic shape using the food as a reference point; the other is that tuna forage by searching around their own positions. We assume each method is selected with a probability of 50%. Through these two foraging strategies, tuna hunt cooperatively until they find their prey.
$$X_i^{t+1} = \begin{cases} X_{best}^t + rand \cdot \left( X_{best}^t - X_i^t \right) + TF \cdot p^2 \left( X_{best}^t - X_i^t \right), & \text{if } rand < 0.5 \\ TF \cdot p^2 X_i^t, & \text{if } rand \geq 0.5 \end{cases} \tag{9}$$

$$p = \left( 1 - \frac{t}{t_{max}} \right)^{\frac{t}{t_{max}}} \tag{10}$$
In the formula, TF is a random number with a value of 1 or −1.
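Reading Equations (2)–(10) together, one generation of the TSO position update can be sketched as follows. This is a hedged interpretation rather than the authors' reference code: the switch between the random and the best reference point follows Algorithm 1 below, the bound clipping is a common safeguard not present in the equations, and all names are illustrative.

```python
import numpy as np

def tso_update(X, X_best, t, t_max, lb, ub, a=0.6, rng=np.random.default_rng()):
    """One TSO generation (t counted from 1): spiral foraging, Eqs. (2)-(8),
    or parabolic foraging, Eqs. (9)-(10), chosen with equal probability."""
    NP, dim = X.shape
    alpha1 = a + (1 - a) * t / t_max                              # Eq. (3)
    alpha2 = (1 - a) - (1 - a) * t / t_max                        # Eq. (4)
    p = (1 - t / t_max) ** (t / t_max)                            # Eq. (10)
    X_new = np.empty_like(X)
    for i in range(NP):
        if rng.random() < 0.5:                                    # spiral foraging
            b = rng.random()
            l = np.exp(3 * np.cos((t_max + 1 / t - 1) * np.pi))   # Eq. (6)
            beta = np.exp(b * l) * np.cos(2 * np.pi * b)          # Eq. (5)
            # Early on the reference point is random (exploration); later it
            # is the best individual (exploitation), as in Algorithm 1, lines 13-16.
            if t / t_max < rng.random():
                ref = rng.random(dim) * (ub - lb) + lb            # Eq. (7)
            else:
                ref = X_best                                      # Eq. (2)
            follow = X[i] if i == 0 else X[i - 1]                 # chain term
            X_new[i] = alpha1 * (ref + beta * np.abs(ref - X[i])) + alpha2 * follow
        else:                                                     # parabolic foraging
            TF = rng.choice((-1, 1))                              # Eq. (9)
            if rng.random() < 0.5:
                X_new[i] = (X_best + rng.random() * (X_best - X[i])
                            + TF * p ** 2 * (X_best - X[i]))
            else:
                X_new[i] = TF * p ** 2 * X[i]
    # Keep candidates inside the search space (a safeguard, not in the equations).
    return np.clip(X_new, lb, ub)
```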
(3) Termination condition. The algorithm continuously updates and calculates all individual tuna until the final condition is met, and then returns the optimal individual and the corresponding fitness value.

2.2. Fuzzy C-Means Clustering Algorithm

The FCM algorithm [28] is a division-based clustering algorithm that measures clustering quality through the similarity between objects: similarity within a cluster is high, while similarity between clusters is low. The fuzzy mean is therefore used for a flexible division, which proceeds in the following steps:
Step 1: Initialize the membership matrix. The FCM algorithm relaxes the binary 0/1 assignment to a membership degree in [0, 1], which measures the relationship between each data point and each cluster center:

$$\sum_{j=1}^{k} u_{i,j} = 1, \quad i = 1, 2, \ldots, num \tag{11}$$

where $k$ is the number of clusters, $num$ is the number of objects in the dataset, and $u_{i,j}$ is the membership degree of data point $i$ in cluster $j$; the memberships of each data point sum to 1, and a larger relative membership indicates a higher probability of belonging to that cluster.
Step 2: Iteration termination judgment. A maximum number of iterations, $max\_iter$, is set, and the current iteration count is checked against it. If the current number of iterations has reached $max\_iter$, output the clustering result; otherwise, update the cluster centers.
Step 3: Calculate the cluster centers. Once the cluster centers $c_j$ are established, the sum of the memberships of all points to each center is calculated first, and each center is then updated as the membership-weighted combination of the original data:

$$c_j = \frac{\sum_{i=1}^{num} u_{i,j}^{m} x_i}{\sum_{i=1}^{num} u_{i,j}^{m}} = \sum_{i=1}^{num} \left( \frac{u_{i,j}^{m}}{\sum_{i=1}^{num} u_{i,j}^{m}} \, x_i \right), \quad j = 1, 2, \ldots, k \tag{12}$$

where $c_j$ is the $j$-th cluster center, $x_i$ is the current data point, and $m$ is the fuzzy weighting exponent (the membership matrix index $M$ in Table 3).
Step 4: Update the membership matrix. Based on the cluster centers and the data points, the membership degrees are updated by:

$$u_{i,j} = \frac{1}{\sum_{l=1}^{k} \left( \frac{\left\| x_i - c_j \right\|}{\left\| x_i - c_l \right\|} \right)^{\frac{2}{m-1}}} \tag{13}$$

In the above equation, $\left\| x_i - c_j \right\|$ is the Euclidean distance from data point $x_i$ to cluster center $c_j$, and $\left\| x_i - c_l \right\|$ is the distance from $x_i$ to each of the cluster centers. The closer the data point is to $c_j$, the larger its membership value.
Step 5: Judge the iteration termination condition. When the current number of iterations is less than the maximum number of iterations, the termination condition is:

$$\max \left\{ \left| u_{i,j}^{iter} - u_{i,j}^{iter-1} \right| \right\} \leq \varepsilon \tag{14}$$

where $u_{i,j}^{iter}$ is the membership degree at the current iteration $iter$ and $u_{i,j}^{iter-1}$ is the membership degree before the update. When the difference between the two is smaller than the threshold $\varepsilon$, a sufficiently good solution has been found, so the iteration ends and the fuzzy clustering result is returned. If the termination condition is not met, return to Step 3.
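The five steps above condense into a short NumPy routine. This is a minimal sketch that treats the fuzzifier m and the tolerance eps as free parameters (Table 3 later uses M = 4 and E = 1e−6); it is not the authors' implementation:

```python
import numpy as np

def fcm(X, k, m=2.0, max_iter=100, eps=1e-6, rng=np.random.default_rng()):
    """Plain FCM on data X of shape (num, dim); returns (U, C)."""
    num = X.shape[0]
    U = rng.random((num, k))
    U /= U.sum(axis=1, keepdims=True)                 # Step 1: Eq. (11)
    for _ in range(max_iter):                         # Step 2: iteration cap
        Um = U ** m
        C = (Um.T @ X) / Um.sum(axis=0)[:, None]      # Step 3: centers, Eq. (12)
        d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2) + 1e-12
        # Step 4: membership update, Eq. (13).
        U_new = 1.0 / (d ** (2 / (m - 1))
                       * (d ** (-2 / (m - 1))).sum(axis=1, keepdims=True))
        if np.max(np.abs(U_new - U)) <= eps:          # Step 5: Eq. (14)
            return U_new, C
        U = U_new
    return U, C
```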

3. MSTSO-FCM

In this section, two improvement strategies, which are the Chaotic local search strategy and the Offset distribution estimation strategy, are introduced in detail, and the design scheme of MSTSO is described. On this basis, the MSTSO-FCM algorithm is proposed.

3.1. Chaotic Local Search Strategy

The chaotic local search strategy finds a better solution by searching the vicinity of each existing solution, so it can effectively improve the exploitation ability of an algorithm. Moreover, chaos maps are random and ergodic, which further improves the effectiveness of the local search. In MSTSO, the chaotic local search strategy is applied only to the dominant group of the population: the half of the population with the better fitness values forms the dominant group. The formula of the chaotic local search strategy is as follows:
$$X_{new,i}^{t+1} = X_i^t + \left( C^t - 0.5 \right) \times \left( X_j^t - X_k^t \right) \tag{15}$$

where $X_{new,i}^{t+1}$ is a new solution generated by the chaotic local search strategy, $X_j^t$ and $X_k^t$ are two different individuals randomly selected from the dominant population, and $C^t$ is the chaos value generated by the chaos map. In MSTSO, a tent chaos map is used to generate $C^t$. The tent map is a classical chaos map; compared with the logistic map, it has better traversal uniformity, which improves the optimization speed of the algorithm while producing values more evenly distributed over [0, 1]. The tent map expression is as follows:

$$C_{i+1}^t = \begin{cases} 2 C_i^t, & 0 \leq C_i^t \leq 0.5 \\ 2 \left( 1 - C_i^t \right), & 0.5 < C_i^t \leq 1 \end{cases} \tag{16}$$
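A minimal sketch of this strategy, assuming the dominant group is passed in as an array and that one tent-map value is advanced per perturbed individual (a detail the paper does not pin down):

```python
import numpy as np

def chaotic_local_search(X_dom, C, rng=np.random.default_rng()):
    """Perturb each dominant individual around itself, Eq. (15); C is the
    current tent-map value in (0, 1). Returns the candidates and the new C."""
    n = X_dom.shape[0]
    X_new = np.empty_like(X_dom)
    for i in range(n):
        j, k = rng.choice(n, size=2, replace=False)   # two distinct dominant members
        X_new[i] = X_dom[i] + (C - 0.5) * (X_dom[j] - X_dom[k])
        C = 2 * C if C <= 0.5 else 2 * (1 - C)        # tent map, Eq. (16)
    return X_new, C
```

In practice the seed C should avoid values such as 0, 0.5, and 1, which collapse the tent map onto fixed points, and a full implementation would keep a perturbed point only if it improves the fitness.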

3.2. Offset Distribution Estimation Strategy

The distribution estimation strategy represents relationships between individuals through probabilistic models. This strategy uses the current dominant population to calculate the probability distribution model, generates new offspring populations based on the sampling of the probability distribution model, and finally obtains the optimal solution through continuous iteration. In this paper, the top half of the individuals with better performance are sampled and the strategy is used for the poor individuals. The mathematical model of the strategy is described below:
$$X_{new,i}^{t+1} = m + randn \cdot \left( m - X_i^t \right) \tag{17}$$

$$m = \left( X_{best} + X_{mean}^t + X_i^t \right) / 3 \tag{18}$$

$$Cov = \frac{1}{Num/2} \sum_{i=1}^{Num/2} \left( X_i^t - X_{mean}^t \right) \left( X_i^t - X_{mean}^t \right)^{T} \tag{19}$$

$$X_{mean}^t = \sum_{i=1}^{Num/2} \omega_i \times X_i^t \tag{20}$$

$$\omega_i = \frac{\ln \left( Num/2 + 0.5 \right) - \ln(i)}{\sum_{i=1}^{Num/2} \left( \ln \left( Num/2 + 0.5 \right) - \ln(i) \right)} \tag{21}$$
where $X_{mean}^t$ is the weighted mean of the dominant population, $Num$ is the population size, and $\omega_i$ is the weight coefficient assigned in descending order of fitness within the dominant population; each weight lies between 0 and 1, and the weights sum to 1. $Cov$ is the weighted covariance matrix of the dominant population.
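Under the reading that Equation (17) resamples only the poor half around the point m of Equation (18), the strategy can be sketched as below. Treating randn as per-dimension Gaussian noise and omitting the covariance of Equation (19), which the update rule in Equation (17) does not consume, are our assumptions:

```python
import numpy as np

def offset_estimation(X_sorted, X_best, rng=np.random.default_rng()):
    """Resample the poor half of a fitness-sorted population (best first)
    around a weighted mean of the dominant half, Eqs. (17)-(21)."""
    Num, dim = X_sorted.shape
    half = Num // 2
    i = np.arange(1, half + 1)
    w = np.log(half + 0.5) - np.log(i)                # Eq. (21), before normalization
    w /= w.sum()                                      # weights sum to 1
    X_mean = w @ X_sorted[:half]                      # Eq. (20)
    X_new = X_sorted.copy()
    for idx in range(half, Num):
        m = (X_best + X_mean + X_sorted[idx]) / 3.0   # Eq. (18)
        X_new[idx] = m + rng.standard_normal(dim) * (m - X_sorted[idx])  # Eq. (17)
    return X_new
```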

3.3. MSTSO Design

MSTSO adds the proposed Chaotic local search strategy and Offset distribution estimation strategy to the original TSO. The specific algorithm design is shown in Algorithm 1.
Algorithm 1: The procedure of MSTSO
Input: Fitness function f, range [xmin, xmax], maximum number of fitness evaluations FEsmax.
Output: xbest.
// Initialization.
1. Initialize the population X by using Equation (1).
2. Assign a = 0.6 and z = 0.03.
3. Evaluate X to determine the fitness values by using f(X).
4. Update FEs using FEs = FEs + NP.
// Main loop.
5. While FEs < FEsmax do
6.     Update α1, α2, and p through Equations (3), (4) and (10);
7.     If rand < z
8.         Update X through Equation (1);
9.     else
10.        If rand < 0.5
11.            Update X through Equation (17);
12.        else
13.            If t/tmax < rand
14.                Update X through Equation (7);
15.            else
16.                Update X through Equation (2);
17.            end
18.        end
19.    end
20.    Update X through Equation (15);
21.    Evaluate X to determine the fitness values by using f(X);
22.    Update xbest;
23. end
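Wiring the sketches from Sections 2.1, 3.1 and 3.2 together gives a compact picture of Algorithm 1. This hypothetical driver is ours: the names, the defaults, the use of an iteration count in place of the FEs budget, and the reuse of the previous generation's ranking for the dominant/poor split are all simplifying assumptions.

```python
import numpy as np

def mstso(f, lb, ub, NP=50, dim=10, t_max=200, z=0.03, rng=np.random.default_rng()):
    """Sketch of Algorithm 1, composing tso_update, chaotic_local_search,
    and offset_estimation defined in the earlier sketches."""
    X = rng.random((NP, dim)) * (ub - lb) + lb                # Eq. (1)
    fit = np.apply_along_axis(f, 1, X)
    C = 0.37                                                  # tent-map seed (illustrative)
    for t in range(1, t_max + 1):                             # stands in for the FEs budget
        order = np.argsort(fit)                               # best individual first
        X, fit = X[order], fit[order]
        if rng.random() < z:                                  # occasional re-seeding (line 8)
            X = rng.random((NP, dim)) * (ub - lb) + lb
        else:                                                 # TSO position update (lines 10-18)
            X = tso_update(X, X[0], t, t_max, lb, ub, rng=rng)
        # Chaotic local search on the dominant half (Eq. 15) and offset
        # resampling of the poor half (Eq. 17), using the previous ranking.
        X[:NP // 2], C = chaotic_local_search(X[:NP // 2], C, rng=rng)
        X = offset_estimation(X, X[0], rng=rng)
        X = np.clip(X, lb, ub)
        fit = np.apply_along_axis(f, 1, X)
    return X[np.argmin(fit)]
```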

3.4. FCM Clustering Based on MSTSO Optimization

Owing to fuzzy theory, FCM produces more flexible clustering results than traditional hard subspace clustering such as k-means [29]. The iterative process of FCM can be understood as a continuous movement of the cluster centers, which considerably improves global search. However, FCM does not overcome the dependence of sub-spatial clustering on initialization, so it falls into local optima very easily, and it struggles to cluster high-dimensional spatial data effectively. Considering these problems, the MSTSO algorithm is used to improve FCM: the search and development characteristics of MSTSO are introduced into the fuzzy matrix of FCM to overcome the shortcomings of poor global search ability and sensitive initialization. This not only uses the search ability of MSTSO but also retains the fuzzy mathematical ideas of FCM, improving the clustering stability and accuracy of the FCM algorithm.
To obtain better cluster centers for FCM, the MSTSO algorithm is used to optimize the membership matrix, replacing the original random initialization and update method. The number of clusters in the data must be known in advance.
The process of MSTSO optimizing the FCM cluster is as follows: the flowchart is shown in Figure 1; the pseudo-code is shown in Algorithm 2.
Step 1: Use Equation (1) to randomly initialize the population and assign the parameters $a$ and $z$, where $dim = k \times num$ is the population dimension; each individual contains all the elements of the membership matrix $U$, as shown in Figure 2.
Step 2: Introduce the FCM module to calculate the fitness value. Equation (11) is used to reshape each individual back into a membership matrix (the dimension of each individual in MSTSO equals the number of elements of the membership matrix), Equation (12) is used to calculate the cluster centers, and finally the sum of the distances from all data points to the cluster centers is calculated as follows:

$$distance = \sum_{l=1}^{k} \left\| x_i - c_l \right\| \tag{22}$$
Step 3: Update the population positions and calculate the fitness values. Update $\alpha_1$, $\alpha_2$, and $p$. When a random number $rand < z$, use Equation (1) to reinitialize the population positions. When $rand \geq z$, a finer division is made: when $rand < 0.5$, use Equation (13) to update the positions initially; otherwise, when $t/t_{max} < rand$, use Equation (7) to conduct a preliminary update, and when $t/t_{max} \geq rand$, use Equation (2). Finally, use Equation (11) to reshape the updated position information and feed it into the FCM model before calculating the fitness values.
Step 4: Determine whether the iteration terminates. When it does, output the membership matrix $U$ corresponding to the optimal fitness value, from which the optimal FCM model is constructed. If the termination condition is not met, return to Step 2.
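To make Step 2 concrete, the FCM-based fitness of one MSTSO individual can be sketched as below. The row-major reshape and the absolute-value normalization used to enforce Equation (11) are our assumptions, and Equation (22) is read literally as an unweighted sum of point-to-center distances; the default m = 4 mirrors Table 3.

```python
import numpy as np

def fcm_fitness(x, data, k, m=4.0):
    """Fitness of one flat individual x of length k*num: rebuild the
    membership matrix U, derive the centers, and sum the distances."""
    num = data.shape[0]
    U = np.abs(x).reshape(num, k)
    U /= U.sum(axis=1, keepdims=True)               # enforce Eq. (11)
    Um = U ** m
    C = (Um.T @ data) / Um.sum(axis=0)[:, None]     # cluster centers, Eq. (12)
    d = np.linalg.norm(data[:, None, :] - C[None, :, :], axis=2)
    return d.sum()                                  # Eq. (22), summed over all points
```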
Algorithm 2: The procedure of MSTSO-FCM
Input: fFCM, range [Xmin, Xmax], FEsmax.
Output: membership matrix U.
1. Reshape the membership matrix U into the population X using Equation (11);
2. Initialize the population X by using Equation (1);
3. Assign a = 0.6 and z = 0.03;
4. Evaluate X to determine the fitness values by using fFCM(X) (Equation (22));
5. Update FEs using FEs = FEs + NP;
6. Execute the main loop of MSTSO (Algorithm 1, lines 5–23);
7. Reshape xbest into U and output it.

4. Simulation Experiment and Comparative Analysis

4.1. MSTSO Algorithm Performance Verification

To illustrate the performance of the proposed MSTSO algorithm, the CEC2017 test set is selected for verification, including 1 unimodal function, 7 multimodal functions, 10 mixed functions, and 10 combined functions. All simulations in this paper are run in MATLAB R2016b on a computer with an Intel(R) Core(TM) i7-8700 CPU and 16 GB of memory. The butterfly optimization algorithm (BOA), reptile search algorithm (RSA), arithmetic optimization algorithm (AOA), tunicate swarm algorithm (TSA), and sparrow search algorithm (SSA) are selected as comparison algorithms. To ensure fairness, the population size of all algorithms is 500, the maximum number of iterations is 600, and the remaining parameters are set to the values in the original papers. Each algorithm runs independently 51 times, and the averaged statistical results are shown in Appendix A.
From Appendix A, we can see that MSTSO performs well on most test functions. Specifically, for the unimodal test function F1, although MSTSO does not stably obtain the optimal solution, its optimization accuracy is better than that of the other algorithms by about nine orders of magnitude, illustrating an excellent local search ability. The multimodal functions F2–F8 are often used to test global search ability: they usually have multiple local minima, so a strong algorithm must jump out of local minima to reach the global optimum. MSTSO performs best on six of these seven functions, ranking behind RSA only on F8. Compared with TSO, MSTSO performs better on all the multimodal functions, indicating that the improvement strategies significantly strengthen TSO's global search ability. F9–F18 and F19–F28 are mixed and combination functions, respectively; their complex structures better test an algorithm's ability to solve complex optimization problems. As can be seen from Appendix A, MSTSO ranks second only on F19. On the remaining mixed and combination functions, MSTSO is the best performer, achieving significantly better results than TSO, which shows that the proposed improvement strategies enhance the ability to solve problems with complex structures and hold promise for complex real-world optimization problems.
Appendix B lists the p-values of the Wilcoxon signed rank test between MSTSO and each competitor on every function. A value below 0.05 indicates a significant difference between MSTSO and the competitor; otherwise, the difference is not significant. MSTSO is significantly different from the other algorithms on most functions, the only exception being RSA on F20. In summary, MSTSO strikes a balance between exploitation and exploration and has a good ability to avoid local optima, which effectively prevents premature convergence.

4.2. Cluster Dataset and Comparison Algorithm

To verify the clustering ability of MSTSO-FCM, two artificial large datasets and five UCI datasets are selected [30], available from https://archive.ics.uci.edu/datasets (accessed on 3 January 2024). The two artificial datasets each contain three categories with 5000 samples per category; they have two-dimensional and three-dimensional features, respectively, and both follow Gaussian distributions. The basic information of the data is shown in Table 1.
To evaluate the clustering performance of each algorithm, four indicators are introduced: Accuracy (Ac), Silhouette (Sil), Davies–Bouldin (DB), and Area Under Curve (AUC). The specific information of each indicator is shown in Table 2.
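Once the fuzzy partition is hardened, the internal indices can be computed off the shelf; a hedged scikit-learn sketch follows, in which the hardening by maximum membership and the use of the positive cluster's membership as the AUC score are our own choices:

```python
import numpy as np
from sklearn.metrics import silhouette_score, davies_bouldin_score, roc_auc_score

def evaluate_clustering(data, U, y_true=None):
    """Harden the membership matrix U by maximum membership, then score it."""
    labels = U.argmax(axis=1)                      # hard cluster per point
    sil = silhouette_score(data, labels)           # higher is better, in [-1, 1]
    db = davies_bouldin_score(data, labels)        # lower is better
    auc = None
    if y_true is not None and U.shape[1] == 2:     # AUC for two-cluster datasets
        # Assumes cluster 1 has been aligned with the positive class beforehand.
        auc = roc_auc_score(y_true, U[:, 1])
    return sil, db, auc
```

Ac additionally requires mapping cluster labels to the ground-truth classes (e.g., by majority vote within each cluster) before counting correctly classified samples.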
Five comparison algorithms are selected, namely the TSO-FCM algorithm, the PSO-FCM algorithm [13], the FCM algorithm, Gaussian mixture model clustering (GMM) [31], and the K-means clustering algorithm [32]. Table 3 shows the clustering algorithm parameters.

4.3. Evaluation of Clustering Results

To verify that MSTSO improves FCM clustering, the loss reduction curves of MSTSO and the traditional TSO algorithm are compared on the above datasets. The loss reduction curve plots the value of the loss function as learning proceeds. The comparison results are shown in Figure 3: the early convergence of MSTSO is not as fast as that of TSO, but its exploitation is better, and after TSO stagnates, MSTSO can still search for better solutions. Generally, MSTSO stagnates only after about eight iterations. The loss curves on the UCI public datasets show intuitively that the late-stage convergence of MSTSO is stronger than that of traditional TSO, so it explores the global space better and reaches smaller loss values.
To further verify the clustering effect, Ac, Sil, DB, and AUC are used for evaluation, where Ac indicates the clustering accuracy, Sil and DB are internal cluster indices, and AUC evaluates the ability to separate positive and negative cases. As Table 4 shows, MSTSO-FCM achieves the best Ac score on all seven datasets, a significant advance over TSO-FCM, PSO-FCM, and FCM, demonstrating good clustering ability. MSTSO-FCM also progresses on the Sil index: on artificial dataset 1, the Sil value of FCM is only 0.60, TSO-FCM reaches 0.74 and PSO-FCM 0.73, whereas MSTSO-FCM reaches 0.84, showing markedly better discrimination between clusters. On dataset 2, the DB index of FCM is 1.39, that of TSO-FCM is 0.87, that of PSO-FCM is 0.85, and that of MSTSO-FCM is 0.4735; compared with TSO-FCM and PSO-FCM, intra-cluster compactness and inter-cluster separation are both improved, so the clustering effect is better. Regarding the AUC index, MSTSO-FCM ranks first on most of the UCI datasets and holds clear advantages over TSO-FCM and PSO-FCM, indicating excellent classification ability.
To visualize the clustering effect, the two-dimensional features of the Heart dataset are selected for display. Each color in Figure 4 corresponds to a cluster; the same color may represent different clusters for different algorithms, but this does not affect the results. The figure shows that MSTSO-FCM achieves better clustering results than FCM, TSO-FCM, and PSO-FCM.
Table 4 shows that MSTSO-FCM achieves better classification results under a variety of sample conditions. Compared with FCM, MSTSO-FCM better overcomes data sensitivity and shows excellent classification performance, and the AUC index confirms its stronger ability to separate positive and negative cases. Figure 3 shows that the MSTSO algorithm has better global search ability, while its computational complexity does not grow significantly, making it an improvement over traditional hard clustering algorithms.

5. Conclusions

In this paper, the MSTSO-FCM clustering algorithm is proposed. It first integrates the chaotic local search strategy and the offset distribution estimation strategy into the TSO algorithm, improving the performance of basic TSO by enhancing exploitation and maintaining population diversity. Secondly, the search and development characteristics of the MSTSO algorithm are introduced into the fuzzy matrix of FCM, overcoming the defects of poor global search ability and sensitive initialization; this not only uses the search ability of MSTSO but also retains the fuzzy mathematical ideas of FCM, improving the clustering accuracy and stability of the FCM algorithm. On two artificial datasets and five UCI datasets, MSTSO-FCM achieves the best results in the Ac index and leading results in the AUC index, indicating excellent clustering and classification ability, and it improves significantly on the two internal indices Sil and DB compared with TSO-FCM and PSO-FCM, showing better intra-cluster compactness and inter-cluster separation. The simulation results show that the improved clustering algorithm has better convergence ability and clustering effect.
In future research, density-based clustering will be studied and combined with FCM, and more attention will be paid to solving real-world engineering application problems.

Author Contributions

Conceptualization, C.S. and J.Z.; Data curation, C.S.; Formal analysis, Q.S.; Investigation, Z.Z.; Methodology, C.S.; Project administration, J.Z.; Resources, J.Z.; Software, C.S.; Supervision, J.Z.; Validation, C.S., Q.S. and Z.Z.; Visualization, Q.S.; Writing—original draft, C.S.; Writing—review & editing, C.S. and Q.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Grant No. 42001336).

Data Availability Statement

Data are contained within the article.

Acknowledgments

The authors express their gratitude to the reviewers and editors for their valuable feedback and contributions to refining this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Test results of seven algorithms on CEC2017.
| Function | Indicator | BOA | RSA | AOA | TSA | SSA | TSO | MSTSO |
|---|---|---|---|---|---|---|---|---|
| F1 | Mean | 3.82 × 10^4 | 6.76 × 10^3 | 6.91 × 10^4 | 3.83 × 10^4 | 8.40 × 10^4 | 7.52 × 10^4 | 2.67 × 10^−6 |
| | Std | 6.97 × 10^3 | 2.46 × 10^3 | 1.15 × 10^4 | 1.19 × 10^4 | 6.59 × 10^3 | 8.60 × 10^3 | 9.20 × 10^−7 |
| | Rank | 3 | 2 | 5 | 4 | 7 | 6 | 1 |
| F2 | Mean | 9.33 × 10^3 | 8.48 × 10^1 | 7.61 × 10^3 | 1.62 × 10^3 | 1.44 × 10^3 | 8.10 × 10^3 | 5.90 × 10^1 |
| | Std | 1.29 × 10^3 | 3.03 × 10^1 | 2.45 × 10^3 | 1.40 × 10^3 | 1.09 × 10^3 | 2.45 × 10^3 | 1.51 × 10^0 |
| | Rank | 7 | 2 | 5 | 4 | 3 | 6 | 1 |
| F3 | Mean | 3.49 × 10^2 | 1.23 × 10^2 | 2.95 × 10^2 | 2.76 × 10^2 | 3.50 × 10^2 | 4.04 × 10^2 | 3.42 × 10^1 |
| | Std | 2.16 × 10^1 | 2.19 × 10^1 | 3.20 × 10^1 | 4.09 × 10^1 | 4.40 × 10^1 | 2.46 × 10^1 | 1.25 × 10^1 |
| | Rank | 5 | 2 | 4 | 3 | 6 | 7 | 1 |
| F4 | Mean | 6.63 × 10^1 | 2.49 × 10^1 | 6.21 × 10^1 | 6.15 × 10^1 | 8.06 × 10^1 | 8.29 × 10^1 | 6.44 × 10^−3 |
| | Std | 5.76 × 10^0 | 7.04 × 10^0 | 6.71 × 10^0 | 1.43 × 10^1 | 8.84 × 10^0 | 8.30 × 10^0 | 8.25 × 10^−3 |
| | Rank | 5 | 2 | 4 | 3 | 6 | 7 | 1 |
| F5 | Mean | 5.57 × 10^2 | 2.12 × 10^2 | 6.00 × 10^2 | 4.83 × 10^2 | 7.12 × 10^2 | 6.59 × 10^2 | 6.07 × 10^1 |
| | Std | 3.17 × 10^1 | 5.10 × 10^1 | 5.66 × 10^1 | 7.78 × 10^1 | 6.85 × 10^1 | 5.57 × 10^1 | 1.48 × 10^1 |
| | Rank | 4 | 2 | 5 | 3 | 7 | 6 | 1 |
| F6 | Mean | 2.93 × 10^2 | 9.91 × 10^1 | 2.25 × 10^2 | 2.34 × 10^2 | 2.72 × 10^2 | 3.28 × 10^2 | 3.64 × 10^1 |
| | Std | 1.54 × 10^1 | 2.68 × 10^1 | 2.67 × 10^1 | 3.99 × 10^1 | 4.31 × 10^1 | 1.94 × 10^1 | 1.30 × 10^1 |
| | Rank | 6 | 2 | 3 | 4 | 5 | 7 | 1 |
| F7 | Mean | 6.82 × 10^3 | 1.53 × 10^3 | 4.50 × 10^3 | 8.57 × 10^3 | 9.35 × 10^3 | 8.55 × 10^3 | 1.56 × 10^−1 |
| | Std | 8.69 × 10^2 | 6.00 × 10^2 | 7.24 × 10^2 | 3.01 × 10^3 | 1.85 × 10^3 | 9.42 × 10^2 | 2.74 × 10^−1 |
| | Rank | 4 | 2 | 3 | 6 | 7 | 5 | 1 |
| F8 | Mean | 7.33 × 10^3 | 3.76 × 10^3 | 5.51 × 10^3 | 5.55 × 10^3 | 7.05 × 10^3 | 7.22 × 10^3 | 4.84 × 10^3 |
| | Std | 2.85 × 10^2 | 6.30 × 10^2 | 5.83 × 10^2 | 6.07 × 10^2 | 7.45 × 10^2 | 3.98 × 10^2 | 7.52 × 10^2 |
| | Rank | 7 | 1 | 3 | 4 | 5 | 6 | 2 |
| F9 | Mean | 2.19 × 10^3 | 1.00 × 10^2 | 1.72 × 10^3 | 2.23 × 10^3 | 3.91 × 10^3 | 7.28 × 10^3 | 2.42 × 10^1 |
| | Std | 6.72 × 10^2 | 4.16 × 10^1 | 9.74 × 10^2 | 1.69 × 10^3 | 1.64 × 10^3 | 1.81 × 10^3 | 2.31 × 10^1 |
| | Rank | 4 | 2 | 3 | 5 | 6 | 7 | 1 |
| F10 | Mean | 2.08 × 10^9 | 8.30 × 10^4 | 6.27 × 10^9 | 8.88 × 10^8 | 4.69 × 10^8 | 1.32 × 10^10 | 3.09 × 10^2 |
| | Std | 7.43 × 10^8 | 7.32 × 10^4 | 2.56 × 10^9 | 1.07 × 10^9 | 3.76 × 10^8 | 2.71 × 10^9 | 2.15 × 10^2 |
| | Rank | 5 | 2 | 6 | 4 | 3 | 7 | 1 |
| F11 | Mean | 3.15 × 10^8 | 1.01 × 10^4 | 3.80 × 10^4 | 1.75 × 10^8 | 8.55 × 10^7 | 8.92 × 10^9 | 4.96 × 10^1 |
| | Std | 2.10 × 10^8 | 1.20 × 10^4 | 1.71 × 10^4 | 4.14 × 10^8 | 4.66 × 10^8 | 3.93 × 10^9 | 1.67 × 10^1 |
| | Rank | 6 | 2 | 3 | 5 | 4 | 7 | 1 |
| F12 | Mean | 1.19 × 10^5 | 2.45 × 10^3 | 5.72 × 10^4 | 3.73 × 10^5 | 1.50 × 10^6 | 5.68 × 10^6 | 3.23 × 10^1 |
| | Std | 7.62 × 10^4 | 2.63 × 10^3 | 4.92 × 10^4 | 6.73 × 10^5 | 1.21 × 10^6 | 4.39 × 10^6 | 8.21 × 10^0 |
| | Rank | 4 | 2 | 3 | 5 | 6 | 7 | 1 |
| F13 | Mean | 1.82 × 10^6 | 7.17 × 10^3 | 2.35 × 10^4 | 2.48 × 10^7 | 1.83 × 10^7 | 5.67 × 10^8 | 2.41 × 10^1 |
| | Std | 1.46 × 10^6 | 7.41 × 10^3 | 1.22 × 10^4 | 7.80 × 10^7 | 2.37 × 10^7 | 3.47 × 10^8 | 4.53 × 10^0 |
| | Rank | 4 | 2 | 3 | 6 | 5 | 7 | 1 |
| F14 | Mean | 3.18 × 10^3 | 9.68 × 10^2 | 1.98 × 10^3 | 1.43 × 10^3 | 2.74 × 10^3 | 3.90 × 10^3 | 4.12 × 10^2 |
| | Std | 4.12 × 10^2 | 3.29 × 10^2 | 5.09 × 10^2 | 2.92 × 10^2 | 5.38 × 10^2 | 6.13 × 10^2 | 2.62 × 10^2 |
| | Rank | 6 | 2 | 4 | 3 | 5 | 7 | 1 |
| F15 | Mean | 1.22 × 10^3 | 4.21 × 10^2 | 9.12 × 10^2 | 6.06 × 10^2 | 1.20 × 10^3 | 2.69 × 10^3 | 1.04 × 10^2 |
| | Std | 2.49 × 10^2 | 1.77 × 10^2 | 2.67 × 10^2 | 2.30 × 10^2 | 3.85 × 10^2 | 1.14 × 10^3 | 4.28 × 10^1 |
| | Rank | 6 | 2 | 4 | 3 | 5 | 7 | 1 |
| F16 | Mean | 9.60 × 10^5 | 1.21 × 10^5 | 1.29 × 10^6 | 2.08 × 10^6 | 1.51 × 10^7 | 3.51 × 10^7 | 3.02 × 10^1 |
| | Std | 6.22 × 10^5 | 1.09 × 10^5 | 1.60 × 10^6 | 4.09 × 10^6 | 1.51 × 10^7 | 2.38 × 10^7 | 2.15 × 10^0 |
| | Rank | 3 | 2 | 4 | 5 | 6 | 7 | 1 |
| F17 | Mean | 4.61 × 10^6 | 8.43 × 10^3 | 1.08 × 10^6 | 1.11 × 10^7 | 4.23 × 10^7 | 6.66 × 10^8 | 2.17 × 10^1 |
| | Std | 4.06 × 10^6 | 9.45 × 10^3 | 1.39 × 10^5 | 3.45 × 10^7 | 1.23 × 10^8 | 3.75 × 10^8 | 3.32 × 10^0 |
| | Rank | 4 | 2 | 3 | 5 | 6 | 7 | 1 |
| F18 | Mean | 7.29 × 10^2 | 3.97 × 10^2 | 6.94 × 10^2 | 7.24 × 10^2 | 8.59 × 10^2 | 9.81 × 10^2 | 1.76 × 10^2 |
| | Std | 9.88 × 10^1 | 1.32 × 10^2 | 1.54 × 10^2 | 2.09 × 10^2 | 2.42 × 10^2 | 1.38 × 10^2 | 6.63 × 10^1 |
| | Rank | 5 | 2 | 3 | 4 | 6 | 7 | 1 |
| F19 | Mean | 1.97 × 10^2 | 3.03 × 10^2 | 4.87 × 10^2 | 4.68 × 10^2 | 5.06 × 10^2 | 6.05 × 10^2 | 2.35 × 10^2 |
| | Std | 3.01 × 10^1 | 2.44 × 10^1 | 5.23 × 10^1 | 4.96 × 10^1 | 5.36 × 10^1 | 4.53 × 10^1 | 1.51 × 10^1 |
| | Rank | 1 | 3 | 5 | 4 | 6 | 7 | 2 |
| F20 | Mean | 4.71 × 10^2 | 1.01 × 10^2 | 5.13 × 10^3 | 4.47 × 10^3 | 4.18 × 10^3 | 6.15 × 10^3 | 1.00 × 10^2 |
| | Std | 7.76 × 10^1 | 2.02 × 10^0 | 1.21 × 10^3 | 2.09 × 10^3 | 1.88 × 10^3 | 1.31 × 10^3 | 5.68 × 10^−1 |
| | Rank | 3 | 2 | 6 | 5 | 4 | 7 | 1 |
| F21 | Mean | 6.97 × 10^2 | 4.97 × 10^2 | 9.68 × 10^2 | 7.86 × 10^2 | 8.60 × 10^2 | 1.13 × 10^3 | 3.86 × 10^2 |
| | Std | 5.59 × 10^1 | 4.04 × 10^1 | 9.10 × 10^1 | 8.15 × 10^1 | 1.00 × 10^2 | 1.10 × 10^2 | 1.59 × 10^1 |
| | Rank | 3 | 2 | 6 | 4 | 5 | 7 | 1 |
| F22 | Mean | 1.10 × 10^3 | 5.59 × 10^2 | 1.14 × 10^3 | 8.47 × 10^2 | 8.99 × 10^2 | 1.20 × 10^3 | 4.57 × 10^2 |
| | Std | 1.68 × 10^2 | 5.59 × 10^1 | 1.09 × 10^2 | 8.08 × 10^1 | 1.37 × 10^2 | 1.92 × 10^2 | 1.41 × 10^1 |
| | Rank | 5 | 2 | 6 | 3 | 4 | 7 | 1 |
| F23 | Mean | 1.75 × 10^3 | 3.93 × 10^2 | 1.67 × 10^3 | 7.61 × 10^2 | 7.84 × 10^2 | 2.54 × 10^3 | 3.87 × 10^2 |
| | Std | 2.01 × 10^2 | 1.44 × 10^1 | 4.55 × 10^2 | 3.02 × 10^2 | 1.30 × 10^2 | 5.87 × 10^2 | 6.52 × 10^−2 |
| | Rank | 6 | 2 | 5 | 3 | 4 | 7 | 1 |
| F24 | Mean | 5.21 × 10^3 | 2.34 × 10^3 | 6.40 × 10^3 | 5.01 × 10^3 | 6.30 × 10^3 | 8.12 × 10^3 | 1.32 × 10^3 |
| | Std | 1.49 × 10^3 | 9.05 × 10^2 | 7.22 × 10^2 | 8.76 × 10^2 | 1.11 × 10^3 | 6.89 × 10^2 | 1.46 × 10^2 |
| | Rank | 4 | 2 | 6 | 3 | 5 | 7 | 1 |
| F25 | Mean | 8.14 × 10^2 | 5.59 × 10^2 | 1.34 × 10^3 | 7.30 × 10^2 | 9.56 × 10^2 | 1.49 × 10^3 | 5.00 × 10^2 |
| | Std | 9.81 × 10^1 | 4.78 × 10^1 | 2.14 × 10^2 | 9.92 × 10^1 | 1.65 × 10^2 | 3.99 × 10^2 | 8.10 × 10^0 |
| | Rank | 4 | 2 | 6 | 3 | 5 | 7 | 1 |
| F26 | Mean | 3.28 × 10^3 | 3.71 × 10^2 | 2.95 × 10^3 | 1.27 × 10^3 | 1.06 × 10^3 | 3.57 × 10^3 | 3.53 × 10^2 |
| | Std | 3.99 × 10^2 | 5.38 × 10^1 | 6.15 × 10^2 | 4.52 × 10^2 | 3.22 × 10^2 | 7.74 × 10^2 | 6.03 × 10^1 |
| | Rank | 6 | 2 | 5 | 4 | 3 | 7 | 1 |
| F27 | Mean | 3.04 × 10^3 | 1.01 × 10^3 | 2.43 × 10^3 | 1.58 × 10^3 | 2.64 × 10^3 | 4.07 × 10^3 | 5.42 × 10^2 |
| | Std | 4.72 × 10^2 | 2.69 × 10^2 | 5.22 × 10^2 | 4.08 × 10^2 | 6.35 × 10^2 | 1.11 × 10^3 | 7.38 × 10^1 |
| | Rank | 6 | 2 | 4 | 3 | 5 | 7 | 1 |
| F28 | Mean | 3.98 × 10^7 | 6.41 × 10^3 | 1.47 × 10^7 | 1.33 × 10^7 | 4.85 × 10^7 | 1.58 × 10^9 | 2.00 × 10^3 |
| | Std | 2.31 × 10^7 | 3.26 × 10^3 | 1.01 × 10^7 | 1.07 × 10^7 | 3.75 × 10^7 | 5.48 × 10^8 | 8.39 × 10^1 |
| | Rank | 5 | 2 | 4 | 3 | 6 | 7 | 1 |

Appendix B

Table A2. Wilcoxon signed rank results on CEC2017.
| Function | BOA | RSA | AOA | TSA | SSA | TSO |
|---|---|---|---|---|---|---|
| F1 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F2 | 5.15 × 10^−10 | 2.31 × 10^−6 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F3 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F4 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F5 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F6 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F7 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F8 | 5.15 × 10^−10 | 6.03 × 10^−8 | 1.62 × 10^−5 | 5.23 × 10^−6 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F9 | 5.15 × 10^−10 | 8.27 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F11 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F12 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F13 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F14 | 5.15 × 10^−10 | 1.02 × 10^−8 | 5.46 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F15 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F16 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F17 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F18 | 5.15 × 10^−10 | 9.87 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F19 | 4.40 × 10^−8 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F20 | 5.15 × 10^−10 | 3.68 × 10^−1 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F21 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F22 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F23 | 5.15 × 10^−10 | 2.76 × 10^−2 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F24 | 5.15 × 10^−10 | 3.96 × 10^−7 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F25 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F26 | 5.15 × 10^−10 | 3.92 × 10^−2 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F27 | 5.15 × 10^−10 | 5.80 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |
| F28 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 | 5.15 × 10^−10 |

References

  1. Atluri, G.; Karpatne, A.; Kumar, V. Spatio-temporal data mining: A survey of problems and methods. Acm Comput. Surv. 2018, 51, 1–41. [Google Scholar] [CrossRef]
  2. Jia, H.M.; Jiang, Z.C.; Li, Y. Simultaneous feature selection optimization based on improved bald eagle search algorithm. Control Decis. 2022, 37, 445–454. [Google Scholar]
  3. Banerjee, D. Recent progress on cluster and meron algorithms for strongly correlated systems. Indian J. Phys. 2021, 95, 1669–1680. [Google Scholar] [CrossRef]
  4. Mohapatra, S.K.; Sahu, P.; Almotiri, J.; Alroobaea, R.; Rubaiee, S.; Bin Mahfouz, A.; Senthilkumar, A.P. Segmentation and classification of encephalon tumor by applying improved fast and robust FCM Algorithm with PSO-based ELM Technique. Comput. Intell. Neurosci. 2022, 2022, 1–9. [Google Scholar] [CrossRef]
  5. Mehran, M.; Ali, K.; Hamed, H. Clustering-based reliability assessment of smart grids by fuzzy c-means algorithm considering direct cyber–physical interdependencies and system uncertainties. Sustain. Energy Grids Netw. 2022, 31, 100757. [Google Scholar]
  6. Yang, Q. An FCM clustering algorithm based on the identification of accounting statement whitewashing behavior in universities. J. Intell. Syst. 2022, 31, 345–355. [Google Scholar] [CrossRef]
  7. Poli, R.; Kennedy, J.; Blackwell, T. Particle swarm optimization. Swarm Intell. 2007, 1, 33–57. [Google Scholar] [CrossRef]
  8. Maryam, M.; Reza, S.A.; Arash, D. Optimization of fuzzy c-means (FCM) clustering in cytology image segmentation using the gray wolf algorithm. BMC Mol. Cell Biol. 2022, 23, 9. [Google Scholar]
  9. Amit, B.; Issam, A. A novel adaptive FCM with cooperative multi-population differential evolution optimization. Algorithms 2022, 15, 380. [Google Scholar]
  10. Niknam, T.; Olamaei, J.; Amiri, B. A Hybrid Evolutionary Algorithm Based on ACO and SA for Cluster Analysis. J. Appl. Sci. 2008, 8, 2695–2702. [Google Scholar] [CrossRef]
  11. Gao, H.; Li, Y.; Kabalyants, P.; Xu, H.; Martinez-Bejar, R. A Novel Hybrid PSO-K-Means Clustering Algorithm Using Gaussian Estimation of Distribution Method and Lévy Flight. IEEE Access 2020, 8, 122848–122863. [Google Scholar] [CrossRef]
  12. Izakian, H.; Abraham, A. Fuzzy C-means and fuzzy swarm for fuzzy clustering problem. Expert Syst. Appl. 2011, 38, 1835–1838. [Google Scholar] [CrossRef]
  13. Qian, Z.; Cao, Y.; Sun, X.; Ni, L.; Wang, Z.; Chen, X. Clustering optimization for triple-frequency combined obser-vations of BDS-3 based on improved PSO-FCM algorithm. Remote Sens. 2022, 14, 3713. [Google Scholar] [CrossRef]
  14. Celal, O.; Emrah, H.; Dervis, K. Dynamic clustering with improved binary artificial bee colony algorithm. Appl. Soft Comput. 2015, 28, 69–80. [Google Scholar]
  15. Wang, J.; Zhu, L.; Wu, B.; Ryspayev, A. Forestry Canopy Image Segmentation Based on Improved Tuna Swarm Optimization. Forests 2022, 13, 1746. [Google Scholar] [CrossRef]
  16. Wang, W.; Tian, J. An Improved Nonlinear Tuna Swarm Optimization Algorithm Based on Circle Chaos Map and Levy Flight Operator. Electronics 2022, 11, 3678. [Google Scholar] [CrossRef]
  17. Tuerxun, W.; Xu, C.; Guo, H.; Guo, L.; Zeng, N.; Cheng, Z. An ultra-short-term wind speed prediction model using LSTM based on modified tuna swarm optimization and successive variational mode decomposition. Energy Sci. Eng. 2022, 10, 3001–3022. [Google Scholar] [CrossRef]
  18. Tan, M.; Li, Y.; Ding, D.; Zhou, R.; Huang, C. An Improved JADE Hybridizing with Tuna Swarm Optimization for Numerical Optimization Problems. Math. Probl. Eng. 2022, 2022, 1–17. [Google Scholar] [CrossRef]
  19. Awad, A.; Kamel, S.; Hassan, M.H.; Elnaggar, M.F. An Enhanced Tuna Swarm Algorithm for Optimizing FACTS and Wind Turbine Allocation in Power Systems. Electr. Power Compon. Syst. 2023. [Google Scholar] [CrossRef]
  20. Kumar, C.; Mary, D.M. A novel chaotic-driven Tuna Swarm Optimizer with Newton-Raphson method for parameter identification of three-diode equivalent circuit model of solar photovoltaic cells/modules. Optik 2022, 264, 169379. [Google Scholar] [CrossRef]
  21. Ren, Q.; Zhang, H.; Zhang, D.; Zhao, X. Lithology identification using principal component analysis and particle swarm optimization fuzzy decision tree. J. Pet. Sci. Eng. 2023, 220, 111233. [Google Scholar] [CrossRef]
  22. Wu, T.-Y.; Lin, J.C.-W.; Zhang, Y.; Chen, C.-H. A Grid-Based Swarm Intelligence Algorithm for Privacy-Preserving Data Mining. Appl. Sci. 2019, 9, 774. [Google Scholar] [CrossRef]
  23. Shao, Y.; Lin, J.C.-W.; Srivastava, G.; Guo, D.; Zhang, H.; Yi, H.; Jolfaei, A. Multi-Objective Neural Evolutionary Algorithm for Combinatorial Optimization Problems. IEEE Trans. Neural Netw. Learn. Syst. 2021, 34, 2133–2143. [Google Scholar] [CrossRef]
  24. Kubicek, J.; Varysova, A.; Cerny, M.; Skandera, J.; Oczka, D.; Augustynek, M.; Penhaker, M. Novel Hybrid Optimized Clustering Schemes with Genetic Algorithm and PSO for Segmentation and Classification of Articular Cartilage Loss from MR Images. Mathematics 2023, 11, 1027. [Google Scholar] [CrossRef]
  25. Aggarwal, A.; Dimri, P.; Agarwal, A.; Verma, M.; Alhumyani, H.A.; Masud, M. IFFO: An Improved Fruit Fly Optimization Algorithm for Multiple Workflow Scheduling Minimizing Cost and Makespan in Cloud Computing Environments. Math. Probl. Eng. 2021, 2021, 1–9. [Google Scholar] [CrossRef]
  26. Aggarwal, A.; Kumar, S.; Bhatt, A.; Shah, M.A. Solving User Priority in Cloud Computing Using Enhanced Optimization Algorithm in Workflow Scheduling. Comput. Intell. Neurosci. 2022, 2022, 1–11. [Google Scholar] [CrossRef] [PubMed]
  27. Upadhyay, P.; Marriboina, V.; Kumar, S.; Kumar, S.; Shah, M.A. An Enhanced Hybrid Glowworm Swarm Optimization Algorithm for Traffic-Aware Vehicular Networks. IEEE Access 2022, 10, 110136–110148. [Google Scholar] [CrossRef]
  28. Balaji, P.; Muniasamy, V.; Bilfaqih, S.M. Chimp Optimization Algorithm Influenced Type-2 Intuitionistic Fuzzy C-Means Clustering-Based Breast Cancer Detection System. Cancers 2023, 15, 1131. [Google Scholar] [CrossRef] [PubMed]
  29. Usman, Q. A dissimilarity measure based fuzzy c-means (FCM) clustering algorithm. J. Intell. Fuzzy Syst. 2014, 26, 229–238. [Google Scholar]
  30. Tanveer, M.; Gautam, C.; Suganthan, P. Comprehensive evaluation of twin SVM based classifiers on UCI datasets. Appl. Soft Comput. 2019, 83, 105617. [Google Scholar] [CrossRef]
  31. Ma, Y.; Hao, Y. Antenna Classification Using Gaussian Mixture Models (GMM) and Machine Learning. IEEE Open J. Antennas Propag. 2020, 1, 320–328. [Google Scholar] [CrossRef]
  32. Mao, B.; Li, B. Building façade semantic segmentation based on K-means classification and graph analysis. Arab. J. Geosci. 2019, 12, 1–9. [Google Scholar] [CrossRef]
Figure 1. MSTSO optimizes the FCM cluster.
Figure 2. Membership matrix U reshaped to the population in MSTSO.
Figure 3. MSTSO optimization and TSO optimization loss reduction curve comparison.
Figure 4. Heart dataset two-dimensional feature clustering effect visualization.
Table 1. Information about the datasets.

| Dataset Name | Number of Data Objects | Characteristic Dimensions | Number of Categories |
|---|---|---|---|
| Artificial data set1 | 15,000 | 2 | 3 |
| Artificial data set2 | 15,000 | 3 | 3 |
| UCI-iris | 150 | 4 | 2 |
| UCI-liver | 345 | 6 | 2 |
| UCI-heart | 303 | 13 | 2 |
| UCI-pima | 768 | 8 | 2 |
| UCI-waveform | 5000 | 21 | 3 |
Table 2. Cluster indicators.

| Indicator | Equation | Specific Information |
|---|---|---|
| Ac | $A = \frac{N_r}{N} \times 100$ | $N_r$ is the number of correctly classified samples and $N$ is the total number of samples. The larger the value of $A$, the better the clustering. |
| Sil | $SI = \frac{1}{N} \sum_{i=1}^{N} \frac{b_i - a_i}{\max(b_i, a_i)}$ | $SI \in [-1, 1]$; the larger $SI$, the better the clustering effect. $a_i$ is the mean distance from sample $i$ to the other samples in its own cluster; $b_i$ is the mean distance from sample $i$ to the samples in the nearest other cluster. |
| DB | $DB = \frac{1}{k} \sum_{j=1}^{k} \max_{r \neq j} \left( \frac{s(c_j) + s(c_r)}{dist(c_j, c_r)} \right)$ | The ratio of the sum of within-cluster scatters to the between-cluster separation. $s(c_j)$ is the mean distance of all points in cluster $j$ from its center $c_j$, and $dist(c_j, c_r)$ is the distance between centers. |
| AUC | $AUC = \frac{\sum_{ins_p} rank_{ins_p} - \frac{M(M+1)}{2}}{M \times N}$ | $M$ and $N$ are the numbers of positive and negative samples, and $rank_{ins_p}$ is the rank of positive sample $ins_p$ when all samples are sorted by score. |
Table 3. Comparison algorithm settings.

| Number | Algorithm | Parameter Settings |
|---|---|---|
| 1 | MSTSO-FCM | Membership matrix index M = 4, maximum iterations I = 100, termination condition E = 1 × 10^−6, z = 0.05, a = 0.7 |
| 2 | TSO-FCM | M = 4, I = 100, E = 1 × 10^−6, z = 0.05, a = 0.7 |
| 3 | PSO-FCM | M = 4, I = 100, E = 1 × 10^−6, C1 = C2 = 1.4 |
| 4 | FCM | M = 4, I = 100, E = 1 × 10^−6 |
| 5 | GMM | Non-negative regularization number nn = 1 × 10^−5 |
| 6 | K-means | The number of clusters varies with the dataset |
Table 4. Comparison of clustering performance results.

| Algorithm | Dataset | AC | Sil | DB | AUC |
|---|---|---|---|---|---|
| MSTSO-FCM | set1 | 97.13% | 0.84 | 0.15 | 0.96 |
| | set2 | 90.04% | 0.61 | 0.47 | 0.90 |
| | iris | 98.67% | 0.50 | 0.64 | 0.97 |
| | liver | 63.48% | 0.23 | 2.48 | 0.62 |
| | heart | 89.44% | 0.13 | 2.49 | 0.90 |
| | pima | 69.12% | 0.19 | 2.34 | 0.65 |
| | waveform | 77.62% | 0.58 | 0.75 | 0.75 |
| TSO-FCM | set1 | 94.8% | 0.74 | 0.25 | 0.93 |
| | set2 | 79.36% | 0.38 | 0.87 | 0.77 |
| | iris | 94.67% | 0.51 | 0.64 | 0.93 |
| | liver | 56.81% | 0.33 | 1.92 | 0.59 |
| | heart | 80.86% | 0.15 | 2.25 | 0.78 |
| | pima | 66.26% | 0.23 | 2.51 | 0.63 |
| | waveform | 72.42% | 0.56 | 0.79 | 0.61 |
| PSO-FCM | set1 | 92.7% | 0.73 | 0.26 | 0.91 |
| | set2 | 77.45% | 0.39 | 0.85 | 0.75 |
| | iris | 92.67% | 0.46 | 0.59 | 0.91 |
| | liver | 57.97% | 0.34 | 1.90 | 0.55 |
| | heart | 79.54% | 0.21 | 2.39 | 0.77 |
| | pima | 65.17% | 0.25 | 2.19 | 0.62 |
| | waveform | 70.37% | 0.52 | 0.82 | 0.69 |
| FCM | set1 | 89.97% | 0.60 | 0.45 | 0.88 |
| | set2 | 75.19% | 0.26 | 1.39 | 0.74 |
| | iris | 90.67% | 0.54 | 0.59 | 0.89 |
| | liver | 50.14% | 0.51 | 1.17 | 0.51 |
| | heart | 74.59% | 0.19 | 1.93 | 0.72 |
| | pima | 58.18% | 0.32 | 2.72 | 0.55 |
| | waveform | 65.84% | 0.51 | 0.89 | 0.64 |
| GMM | set1 | 89.71% | 0.12 | 0.77 | 0.86 |
| | set2 | 73.26% | 0.21 | 1.46 | 0.72 |
| | iris | 96.67% | 0.50 | 0.65 | 0.95 |
| | liver | 50.14% | 0.50 | 1.07 | 0.51 |
| | heart | 54.78% | 0.18 | 2.12 | 0.53 |
| | pima | 61.93% | 0.11 | 2.20 | 0.62 |
| | waveform | 70.62% | 0.53 | 1.21 | 0.68 |
| K-means | set1 | 80.54% | 0.36 | 0.97 | 0.80 |
| | set2 | 70.12% | 0.18 | 1.59 | 0.68 |
| | iris | 89.33% | 0.55 | 0.58 | 0.88 |
| | liver | 46.00% | 0.63 | 0.77 | 0.48 |
| | heart | 72.61% | 0.21 | 1.82 | 0.71 |
| | pima | 66.92% | 0.17 | 1.78 | 0.67 |
| | waveform | 60.12% | 0.35 | 1.98 | 0.89 |

