1. Introduction
As an unsupervised technique in data mining and machine learning, cluster analysis is widely used in areas such as attribute reduction [1,2,3,4], feature selection [5,6,7], image processing [8,9], information granulation [10,11,12], and graph convolutional neural networks [13,14,15]. The primary objective of clustering is to organize heterogeneous data into meaningful groups based on their similarities, revealing the inherent structures and patterns within the dataset. To achieve this, various clustering algorithms [16] have been developed. However, it is widely accepted that no single clustering algorithm can handle all types of data distribution effectively. Different algorithms, or different parameters for the same algorithm, may lead to different clustering results. To enhance the robustness and stability of clustering algorithms, researchers have proposed ensemble clustering methods. In comparison to single clustering methods, ensemble clustering methods [17,18,19,20,21,22] integrate results from multiple foundational clustering algorithms, yielding more stable, robust, and accurate clustering solutions. Nevertheless, existing ensemble clustering methods typically adopt a hard clustering strategy, where an element can belong to only one cluster or none, and clear boundaries exist between different clusters. When information on the data samples is insufficient, hard clustering algorithms often lead to higher decision risks.
To address this issue, three-way decision theory [23,24] was introduced to describe uncertainties in information. This method divides the sample universe into three mutually exclusive regions and adopts different decision strategies for each region [25,26]. The three-way decision framework can be integrated with various computational models for learning uncertainty, such as rough set theory [27,28,29], Bayesian networks [30,31], and fuzzy particle swarm optimization [32,33]. Inspired by the idea of three-way decision, Yu [34] presented the framework of three-way clustering, which uses a core region and a fringe region to characterize a cluster. These two sets partition the sample space into three parts, which capture three kinds of relationships between objects and a cluster, namely, belonging to, partially belonging to, and not belonging to [35,36,37,38].
Recently, three-way clustering [39] has garnered widespread research interest, leading to the development of various three-way clustering algorithms within this theoretical framework. Wang and Yao [40] proposed a three-way clustering framework called CE3, derived from the erosion and dilation concepts of mathematical morphology. Li et al. [41] introduced the notion of sample stability to identify and establish relationships in ensemble clustering. Yu et al. [42] proposed an efficient three-way clustering algorithm based on the idea of universal gravitation. Jia et al. [43] developed an automatic three-way clustering approach by combining a threshold selection method with a cluster number selection method. Wang et al. [44] proposed a three-way adaptive density peak clustering (3W-ADPC) method by integrating natural nearest neighbors with DPC.
Most of the existing three-way clustering algorithms operate on the original dataset, which makes them unsuitable for high-dimensional data. The processing of high-dimensional data poses a fundamental yet highly challenging problem in the current field of data science. The purpose of dimensionality reduction is to decrease the data’s dimensionality while retaining its most significant characteristics. By reducing the data’s dimensionality, we can simplify data analysis, speed up model training, reduce storage requirements, and facilitate a clearer understanding and interpretation of the model’s results. Various dimensionality reduction techniques are commonly employed to address this challenge, including Principal Component Analysis (PCA) [45,46,47], spectral clustering [48,49], factor analysis [50], and multidimensional scaling [51].
By integrating dimensionality reduction into three-way clustering, this paper presents an ensemble three-way clustering algorithm based on dimensionality reduction. The proposed method uses dimensionality reduction techniques to reduce data dimensions and eliminate noise. Based on the reduced dataset, random sampling and feature extraction are performed multiple times to introduce randomness and diversity, enhancing the algorithm’s robustness. Ensemble strategies are applied to these subsets, and the k-means algorithm is used to obtain multiple clustering results. From these results, the frequency with which pairs of data points are assigned to the same cluster is calculated to derive the co-occurrence frequency. If the co-occurrence frequency between two data points exceeds a certain threshold, they are placed in the same similar class. Finally, a three-way clustering approach is introduced by using the proposed similarity relation. The main contributions of this research are as follows:
- (1) Ensemble three-way clustering framework based on dimensionality reduction: We introduce a novel ensemble three-way clustering framework that combines dimensionality reduction techniques with clustering ensemble methods. This framework reduces data dimensions, eliminates noise, and enhances clustering stability. By leveraging multiple clustering results, the method enhances the algorithm’s robustness through randomness and diversity.
- (2) Integration of co-occurrence frequency, hierarchical clustering, and lifetime analysis: The proposed method calculates the co-occurrence frequency of data points being in the same cluster, aiding in accurately defining similar classes. It employs a single-linkage hierarchical clustering approach to fuse clustering results and constructs a dendrogram based on these frequencies. By analyzing the lifetime of clusters, we determine the most stable clustering result, ensuring robustness and consistency.
These contributions collectively enhance the performance and applicability of three-way clustering algorithms, especially for high-dimensional datasets, providing a more accurate and stable clustering solution.
The remainder of this paper is organized as follows. In Section 2, we provide a comprehensive review of the concepts related to three-way clustering, the k-means algorithm, PCA, and data integration strategies. Section 3 outlines the methodology and algorithmic process employed in this study. The results and performance metrics obtained from the proposed algorithm on the UCI datasets are presented in Section 4. Section 5 discusses our findings and identifies areas for future improvement.
3. Similarity-Based Three-Way Clustering by Using Dimensionality Reduction
In this section, we propose a similarity theory [43,59] based on data dimensionality reduction and similarity-based three-way clustering. In contrast to traditional algorithms, our approach first employs the PCA algorithm for data preprocessing, transforming high-dimensional data into low-dimensional data. It then incorporates an ensemble strategy: subsets of features are randomly extracted from the samples over multiple iterations, and the traditional k-means clustering algorithm is run on each subset to generate diverse basic clustering results. Subsequently, we calculate the co-association frequency between samples to derive similarity classes. By extracting only partial features of the samples, we significantly reduce the computational complexity compared to existing traditional ensemble clustering methods. The algorithm proposed in this paper involves three main steps: the generation of basic clusters by using dimensionality-reduced data, the computation of co-association frequency and similarity classes, and the integration of these results into three-way clustering.
3.1. Dimensionality Reduction
In this study, we employed data dimensionality reduction techniques, specifically utilizing Principal Component Analysis (PCA) to reduce the dimensions of the data. PCA is a commonly used dimensionality reduction method, aiming to map the original data onto a lower-dimensional subspace while retaining the maximum variance in the data. Through PCA, we can transform high-dimensional data into lower-dimensional space, thereby enhancing our understanding of the intrinsic structure of the data.
To begin with, consider a dataset comprising $n$ samples and $m$ features, represented by the matrix $X \in \mathbb{R}^{n \times m}$, where each row corresponds to a sample and each column represents a feature. Our objective is to project this $m$-dimensional dataset onto a $d$-dimensional subspace (where $d < m$) and obtain a new feature matrix $Y \in \mathbb{R}^{n \times d}$. The specific steps of dimensionality reduction by using PCA are as follows:
Step 1: Data normalization: The first step involves centralizing the original data by subtracting the mean of each feature, resulting in the centered matrix $X_c$.
Step 2: Covariance Matrix Computation: The covariance matrix represents the correlations between data features, with the specific formula

$$\Sigma = \frac{1}{n-1} X_c^{\top} X_c.$$

Step 3: Eigenvalue and Eigenvector Computation: Eigenvalue decomposition is applied to the covariance matrix $\Sigma$, yielding eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_m$ and their corresponding eigenvectors $v_1, v_2, \ldots, v_m$.
Step 4: Selection of Top Eigenvectors: The eigenvectors corresponding to the top $d$ largest eigenvalues are chosen, forming the projection matrix $W = [v_1, v_2, \ldots, v_d] \in \mathbb{R}^{m \times d}$.
Step 5: Data Projection: The centered original data matrix $X_c$ is projected onto the selected $d$-dimensional subspace, resulting in the reduced feature matrix $Y$, where each row represents a sample, and each column represents a reduced feature. The specific formula is

$$Y = X_c W.$$
Through the aforementioned steps, we obtain the reduced-dimensional data matrix. In this low-dimensional space, we conduct fundamental clustering operations. This data-driven foundational clustering method allows for clustering analysis in lower dimensions while preserving the primary features of the data. The key advantage of this approach lies in its ability to facilitate data visualization, reduce computational complexity, and enhance clustering effectiveness through dimensionality reduction.
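The five PCA steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in the experiments: the function name `pca_reduce`, the synthetic data, and the choice of the $1/(n-1)$ covariance normalization are assumptions made here for the example.

```python
import numpy as np

def pca_reduce(X, d):
    """Project an (n, m) data matrix onto its top-d principal components."""
    X = np.asarray(X, dtype=float)
    # Step 1: center each feature (column) at zero mean.
    Xc = X - X.mean(axis=0)
    # Step 2: covariance matrix of the features, shape (m, m).
    cov = (Xc.T @ Xc) / (X.shape[0] - 1)
    # Step 3: eigen-decomposition; eigh is meant for symmetric matrices
    # and returns the eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Step 4: keep the eigenvectors of the d largest eigenvalues.
    order = np.argsort(eigvals)[::-1][:d]
    W = eigvecs[:, order]
    # Step 5: project the centered data onto the selected subspace.
    return Xc @ W, eigvals[order]

# Hypothetical data: 100 samples, 5 features with very different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5)) * np.array([5.0, 2.0, 1.0, 0.5, 0.1])
Y, top_eigvals = pca_reduce(X, d=2)
print(Y.shape)  # (100, 2)
```

The variance of each column of `Y` equals the corresponding eigenvalue, which is one way to check that the projection retains the directions of maximum variance.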
Next, we randomly select parts of the sample’s features to obtain different clustering results. For a multidimensional dataset, different subsets of features describe the dataset from different views. Thus, a set of diverse clustering results is obtained when different subsets of features are employed. Suppose that we randomly extract parts of the features and apply the k-means clustering method to divide the dataset into $k$ clusters. This process is repeated $L$ times, yielding multiple clustering results $\mathbb{C} = \{C_1, C_2, \ldots, C_L\}$. The process of foundational clustering based on data dimensionality reduction is outlined in Algorithm 1.
Algorithm 1: Foundational Clustering Based on Data Dimensionality Reduction |
|
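The foundational clustering step described above (random feature subsets followed by k-means, repeated $L$ times) might look as follows. This is a hedged sketch, not the authors' code: `kmeans_labels`, `base_clusterings`, and the `feature_fraction` parameter are hypothetical names, and a plain Lloyd's k-means written in NumPy stands in for whichever k-means implementation was actually used.

```python
import numpy as np

def kmeans_labels(X, k, rng, n_iter=100):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Assign every sample to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; an empty cluster keeps its previous center.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def base_clusterings(Y, k, L, feature_fraction=0.5, seed=0):
    """Run k-means L times, each on a random subset of the columns of Y."""
    rng = np.random.default_rng(seed)
    n_features = Y.shape[1]
    n_keep = max(1, int(round(feature_fraction * n_features)))
    results = []
    for _ in range(L):
        cols = rng.choice(n_features, size=n_keep, replace=False)
        results.append(kmeans_labels(Y[:, cols], k, rng))
    return np.array(results)  # shape (L, n): one row per base clustering

# Hypothetical reduced data: two well-separated groups of 20 samples each.
rng = np.random.default_rng(1)
Y = np.vstack([rng.normal(0.0, 0.3, size=(20, 4)),
               rng.normal(5.0, 0.3, size=(20, 4))])
labels = base_clusterings(Y, k=2, L=10)
print(labels.shape)  # (10, 40)
```

Because each run sees different columns and different initial centers, the label values themselves are arbitrary from run to run; only which samples share a label matters for the ensemble step.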
3.2. Clustering Ensemble
From multiple clustering iterations, we obtain basic clustering results $\mathbb{C} = \{C_1, C_2, \ldots, C_L\}$. Subsequently, we present a method for integrating the basic clustering results by using the co-occurrence frequency matrix. The aim is to employ the single-link method of hierarchical clustering to generate a more robust clustering result.
For a dataset $U = \{x_1, x_2, \ldots, x_n\}$ with $n$ samples and a family $\mathbb{C} = \{C_1, C_2, \ldots, C_L\}$ of clustering results of $U$, we can construct an $n \times n$ co-association frequency matrix $A = (a_{ij})$, whose element $a_{ij}$ represents the frequency with which two samples $x_i$ and $x_j$ are simultaneously assigned to the same cluster.
We view $a_{ij}$ as the similarity between samples $x_i$ and $x_j$ and utilize single-linkage hierarchical clustering to obtain an ensemble clustering result. At the start of the process, each data sample is treated as an independent cluster. The two most similar clusters, as measured by their co-association frequencies, are then merged to form a new cluster node. This process iterates until all samples belong to a single cluster, and the clustering result with the highest lifetime is chosen as the final merged result.
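The co-association frequency described above can be computed directly from the label vectors of the base clusterings. The sketch below assumes that "frequency" means the fraction of base clusterings in which two samples share a cluster, and that the base results are stored as one label vector per run; the function name `co_association` and the toy labelings are illustrative only.

```python
import numpy as np

def co_association(labelings):
    """Fraction of base clusterings that put each pair of samples together."""
    labelings = np.asarray(labelings)
    n_runs, n = labelings.shape
    A = np.zeros((n, n))
    for labels in labelings:
        # Entry (i, j) is True when samples i and j share a label in this run.
        A += labels[:, None] == labels[None, :]
    return A / n_runs

# Three hypothetical base clusterings of four samples.
labelings = [[0, 0, 1, 1],
             [0, 0, 0, 1],
             [1, 1, 0, 0]]
A = co_association(labelings)
print(A)
```

Note that the matrix depends only on which samples are grouped together, so the first and third labelings above contribute identically even though their label values are swapped.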
The schematic representation of the single-linkage clustering dendrogram is illustrated in Figure 4. Different colors in Figure 4 represent the clusters present at the current level, and each color represents a set of samples with high similarity. This bottom-up merging strategy ensures that we fully consider the degree of association between samples, resulting in more accurate clustering results. By measuring the similarity between different clusters and visualizing them as a dendrogram, we can intuitively observe the structure and hierarchy of the clustering results. In the dendrogram, higher connecting points represent stronger associations between clusters with higher co-occurrence frequencies. These results are relatively stable and less susceptible to noise or changes in the data. Therefore, such clustering results are more reliable and better able to reflect the true structure and patterns of the data.
By constructing a single-linkage clustering dendrogram using co-association frequencies and selecting the clustering result with the highest lifetime as the final fusion result, we obtain more stable clustering results, thereby enhancing our understanding of the features and inherent structure of the dataset. The process of ensemble clustering is outlined in Algorithm 2.
Algorithm 2: Ensemble Clustering Results |
| Input: Reduced data matrix |
| Output: Ensemble clustering results |
1 | Compute the co-occurrence frequency matrix by (2). |
2 | Obtain the single-linkage dendrogram of the co-occurrence frequency matrix. |
3 | Select the ensemble clustering results with the highest lifetime. |
4 | Return the ensemble clustering results. |
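One possible reading of Algorithm 2 is sketched below with SciPy. It rests on assumptions that the text does not spell out: the co-association frequency is converted to a dissimilarity as one minus the frequency, the lifetime of a partition is taken to be the gap between consecutive merge heights in the dendrogram, and the trivial one-cluster partition is excluded. The function name `highest_lifetime_partition` is hypothetical, and at least three samples are required.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def highest_lifetime_partition(A):
    """Cut a single-linkage dendrogram at its widest gap between merges."""
    A = np.asarray(A, dtype=float)
    # Turn similarity into dissimilarity; force the diagonal to exactly 0.
    D = 1.0 - A
    np.fill_diagonal(D, 0.0)
    # linkage expects a condensed distance vector.
    Z = linkage(squareform(D, checks=False), method="single")
    heights = Z[:, 2]        # merge distances, non-decreasing for single linkage
    gaps = np.diff(heights)  # assumed lifetime of each intermediate partition
    i = int(np.argmax(gaps))
    # Any threshold inside the widest gap yields the same partition.
    threshold = 0.5 * (heights[i] + heights[i + 1])
    return fcluster(Z, t=threshold, criterion="distance")

# Hypothetical co-association matrix with two clear blocks of three samples.
A = np.full((6, 6), 0.1)
A[:3, :3] = 0.9
A[3:, 3:] = 0.9
np.fill_diagonal(A, 1.0)
labels = highest_lifetime_partition(A)
print(labels)
```

In this toy case the within-block merges happen at a dissimilarity of about 0.1 and the final merge at 0.9, so the widest gap separates the two blocks.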
3.3. Similar Classes Based on Co-Association Frequency
This section introduces three-way clustering models based on the co-occurrence frequency derived from the clustering ensemble, and proposes a similarity relation under the framework of co-association frequency. First, we give the definition of the similarity relation between two samples.
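Definition 1. For a dataset $U = \{x_1, x_2, \ldots, x_n\}$ with $n$ samples, let $\mathbb{C} = \{C_1, C_2, \ldots, C_L\}$ be a family of clustering results of $U$, and let $a_{ij}$ be the co-association frequency between samples $x_i$ and $x_j$. The similarity relation based on a threshold $\alpha$ is defined as:

$$R_\alpha = \{(x_i, x_j) \in U \times U : a_{ij} \geq \alpha\},$$

where $\alpha \in [0, 1]$ is a pre-defined parameter. For $x_i \in U$, the similar class is computed by:

$$[x_i]_\alpha = \{x_j \in U : (x_i, x_j) \in R_\alpha\}.$$

We still use Figure 3 as an example. If we take , then , , , , , and . From the above definition, we can find that the similar class $[x_i]_\alpha$ has the following properties:

$$x_i \in [x_i]_\alpha, \qquad x_j \in [x_i]_\alpha \Leftrightarrow x_i \in [x_j]_\alpha.$$

Clearly, the set of similar classes $\{[x_i]_\alpha : x_i \in U\}$ forms a covering of dataset $U$. For any subset $Z$, $Z \subseteq U$, the lower and upper approximations based on the co-association frequency are defined as follows:

$$\underline{R_\alpha}(Z) = \{x_i \in U : [x_i]_\alpha \subseteq Z\}, \qquad \overline{R_\alpha}(Z) = \{x_i \in U : [x_i]_\alpha \cap Z \neq \emptyset\}.$$

Furthermore, we can use the positive region $POS(Z)$ and the fringe region $FRI(Z)$ to describe the objective subset $Z$. So, we define $POS(Z)$ and $FRI(Z)$ of $Z$ as

$$POS(Z) = \underline{R_\alpha}(Z), \qquad FRI(Z) = \overline{R_\alpha}(Z) - \underline{R_\alpha}(Z).$$

Usually, the positive region $POS(Z)$ contains the samples that belong to $Z$ definitely, and the fringe region $FRI(Z)$ contains the samples that belong to $Z$ possibly. Based on the definitions and properties of $POS(Z)$ and $FRI(Z)$, for any cluster $C$, it is straightforward to obtain the core region $Co(C)$ and the fringe region $Fr(C)$ by

$$Co(C) = POS(C), \qquad Fr(C) = FRI(C).$$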
Algorithm 3 illustrates the calculation of core region and the fringe region based on co-association frequency.
Algorithm 3: Finding core region and fringe region |
|
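The core and fringe regions can be derived from the co-association matrix and an ensemble labeling roughly as follows. The sketch assumes that two samples are similar when their co-association frequency is at least the threshold, that the core region is the lower approximation of a cluster, and that the fringe region is the upper approximation minus the lower one; `core_and_fringe` and the toy matrix are hypothetical.

```python
import numpy as np

def core_and_fringe(A, labels, alpha=0.7):
    """Return {cluster: (core indices, fringe indices)} for a hard labeling."""
    # similar[i, j] is True when sample j lies in the similar class of sample i.
    similar = np.asarray(A) >= alpha
    labels = np.asarray(labels)
    regions = {}
    for c in np.unique(labels):
        members = labels == c
        # Lower approximation: the whole similar class lies inside the cluster.
        lower = np.all(~similar | members[None, :], axis=1)
        # Upper approximation: the similar class intersects the cluster.
        upper = np.any(similar & members[None, :], axis=1)
        regions[int(c)] = (np.flatnonzero(lower),
                           np.flatnonzero(upper & ~lower))
    return regions

# Hypothetical example: sample 2 is also similar to sample 3 of the other cluster.
A = np.array([[1.0, 0.9, 0.8, 0.1, 0.1],
              [0.9, 1.0, 0.8, 0.1, 0.2],
              [0.8, 0.8, 1.0, 0.8, 0.3],
              [0.1, 0.1, 0.8, 1.0, 0.9],
              [0.1, 0.2, 0.3, 0.9, 1.0]])
regions = core_and_fringe(A, labels=[0, 0, 0, 1, 1], alpha=0.7)
for cluster, (core, fringe) in regions.items():
    print(cluster, core.tolist(), fringe.tolist())
# 0 [0, 1] [2, 3]
# 1 [4] [2, 3]
```

Samples 2 and 3 fall into the fringe of both clusters because each has a similar sample on the other side of the hard boundary, which is exactly the kind of uncertainty the three-way representation is meant to expose.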
3.4. Similarity-Based Three-Way Clustering by Using Dimensionality Reduction
The stepwise execution of Algorithms 1–3 forms the framework of the proposed similarity-based three-way clustering by using dimensionality reduction, as illustrated in Algorithm 4.
Algorithm 4: Similarity-based three-way clustering algorithm |
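Input: Original data matrix, Number of iterations $L$, Number of clusters $k$, Dimensionality reduction method (PCA), Threshold |
Output: Core and fringe regions of each cluster |
1 | Initialize: obtain the base clustering results by Algorithm 1. |
2 | Ensemble: obtain the ensemble clustering results by Algorithm 2. |
3 | Identify Core and Fringe Regions: obtain the core and fringe regions by Algorithm 3. |
4 | Return the core and fringe regions. |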
|
In this framework, we first generate a set of base clustering results by employing dimensionality reduction techniques (Algorithm 1). Subsequently, by calculating co-association frequencies, we utilize the single-linkage of hierarchical clustering to obtain ensemble clustering results (Algorithm 2). Finally, by defining the similar classes of each sample, we derive the core and fringe regions, further adjusting the clustering structure to yield more accurate and representative three-way clustering outcomes.
The uniqueness of this framework lies in its integration of data dimensionality reduction, co-association frequency computation, and definition of similar classes, providing a comprehensive revelation of the intrinsic structure during the clustering ensemble process. Algorithm 4 outlines the overall process of the three-way clustering framework, demonstrating how optimized clustering results are generated through multiple iterations to better reflect the characteristics of the original data.
The proposed approach offers a powerful tool for clustering ensemble, aiding in the precise capture of complex relationships and distribution patterns in clustering analysis. The three-way clustering framework provides valuable insights for researchers seeking to uncover intricate structures within their datasets.
4. Experimental Analyses
4.1. Data Descriptions
In this section, we conduct experiments to evaluate the effectiveness of the proposed algorithm. We employ 13 datasets from the UCI Machine Learning Repository [60], spanning diverse domains such as biology, medicine, and finance. The detailed information about these datasets is presented in Table 2, including the number of clusters and other relevant details. The software used for implementation includes MATLAB R2019a for statistical and matrix computations and Python 3.9 with libraries such as NumPy, SciPy, and scikit-learn for data processing and machine learning tasks.
4.2. Evaluation Indices
- (1) Adjusted Rand Index (ARI) [61,62] serves as a prominent external metric for assessing clustering performance in comparison to ground truth labels. The ARI, an extension of the Rand Index (RI), is designed to overcome the limitations of the RI by adjusting for chance agreements.
ARI adjusts the RI using the following formula:

$$ARI = \frac{RI - E[RI]}{\max(RI) - E[RI]},$$

where $E[RI]$ represents the expected Rand Index under random conditions. The Rand Index (RI) is calculated by the formula:

$$RI = \frac{a + b}{a + b + c + d},$$

where
$a$: the number of sample pairs that belong to the same cluster in both the ground truth and clustering results.
$b$: the number of sample pairs that belong to different clusters in both the ground truth and clustering results.
$c$: the number of sample pairs that belong to the same cluster in the ground truth but to different clusters in the results.
$d$: the number of sample pairs that belong to different clusters in the ground truth but to the same cluster in the results.
ARI values provide insights into the agreement between clustering results and ground truth labels, with 1 indicating perfect agreement, 0 suggesting performance no better than random assignment, and negative values indicating worse than random allocation. The introduction of ARI offers a comprehensive and objective means for evaluating clustering algorithms, facilitating a more accurate understanding of their performance.
- (2) Adjusted Mutual Information (AMI) [63,64] is an external metric commonly used to assess the performance of clustering results. It is designed to measure the similarity between clustering results and a ground truth (typically, actual labels) by quantifying the information gain between two distributions.
The computation of AMI involves the following formula:

$$AMI(P, Q) = \frac{MI(P, Q) - E[MI(P, Q)]}{\max(H(P), H(Q)) - E[MI(P, Q)]},$$

where $MI(P, Q)$ represents the mutual information between $P$ and $Q$, $E[MI(P, Q)]$ is the expected mutual information under random conditions, and $H(P)$ and $H(Q)$ are the entropies of $P$ and $Q$, respectively.
The numerator of AMI is an adjusted value of mutual information, while the denominator is an adjusted value of entropy. The values of AMI are at most 1, where 1 indicates a perfect match, 0 denotes random matching, and negative values signify matching below random levels.
- (3) Accuracy (ACC) [65] is a common metric used to assess the performance of a classification model. It measures the proportion of samples that the model correctly classifies and serves as a simple and intuitive performance indicator. The formula for calculating ACC is as follows:

$$ACC = \frac{TP + TN}{TP + TN + FP + FN},$$

where $TP$ (True Positives) represents the number of samples correctly classified as the positive class, $TN$ (True Negatives) represents the number of samples correctly classified as the negative class, $FP$ (False Positives) represents the number of samples actually belonging to the negative class but misclassified as the positive class, and $FN$ (False Negatives) represents the number of samples actually belonging to the positive class but misclassified as the negative class.
The range of ACC is $[0, 1]$, where 1 indicates perfect classification and 0 indicates classification failure. While ACC is an intuitive and easy-to-understand metric, it may have limitations when dealing with class imbalance.
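The pair-counting definitions behind RI and ARI in item (1) can be made concrete with a short standard-library sketch. The function names are illustrative; the ARI is computed with the usual contingency-table form of the chance-adjusted index, which is assumed here to be the variant intended in the text. The ARI function is undefined (division by zero) when both labelings are trivial, for example when each puts all samples in one cluster.

```python
from collections import Counter
from itertools import combinations
from math import comb

def rand_index(truth, pred):
    """Share of sample pairs on which the two labelings agree."""
    pairs = list(combinations(range(len(truth)), 2))
    agree = 0
    for i, j in pairs:
        same_truth = truth[i] == truth[j]
        same_pred = pred[i] == pred[j]
        # Agreement: together in both labelings, or separated in both.
        agree += same_truth == same_pred
    return agree / len(pairs)

def adjusted_rand_index(truth, pred):
    """Rand index corrected for chance, from the contingency table."""
    n = len(truth)
    sum_cells = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(truth).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    maximum = 0.5 * (sum_rows + sum_cols)
    return (sum_cells - expected) / (maximum - expected)

truth = [0, 0, 1, 1]
print(adjusted_rand_index(truth, [1, 1, 0, 0]))  # 1.0: same partition, labels swapped
print(rand_index(truth, [0, 1, 0, 1]))
print(adjusted_rand_index(truth, [0, 1, 0, 1]))
```

For the second labeling the raw RI is still positive (two of six pairs agree), while the ARI is negative, which illustrates why the chance correction matters when comparing algorithms.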
4.3. Experimental Performances
Firstly, the PCA dimensionality reduction method is applied to high-dimensional datasets to obtain processed low-dimensional data. Subsequently, a clustering ensemble strategy is employed for the low-dimensional data. This involves randomly sampling subsets of data and features and running the traditional k-means clustering strategy for 50 iterations on all datasets. Then, an automatic hierarchical clustering method is used to form the clustering structure, and the merged results can be visualized using a dendrogram. Finally, the upper and lower approximations of similar classes are derived, and the core and fringe regions of each cluster are determined. The similarity threshold is set to 0.7 in the experiments.
Because AMI, ARI, and ACC are defined only for hard clustering results, they cannot be calculated directly for three-way clustering results. In order to present the performance of our proposed algorithm, this study uses the core regions to form a clustering result and then calculates the AMI, ARI, and ACC by using the core region to represent the corresponding cluster. The clustering ensemble strategy is executed 50 times on all datasets, with an ensemble size of 50, to calculate the average AMI, ARI, and ACC values. The performances of the proposed algorithm on these three indicators are displayed in Table 3 and Figure 5, Figure 6 and Figure 7. To compare clustering effects, the performances of k-means, FCM, and DBSCAN are also presented in Table 3 and Figure 5, Figure 6 and Figure 7. The best performances for each dataset are highlighted in bold.
Table 3.
The performances of different algorithms.
| Datasets | Algorithm | ARI | AMI | ACC |
|---|---|---|---|---|
| Seeds | K-means | 0.7500 | 0.7054 | 0.9095 |
| Seeds | FCM | 0.7161 | 0.6915 | 0.8952 |
| Seeds | DBSCAN | 0.7021 | 0.4396 | 0.3667 |
| Seeds | Ours | 0.8198 | 0.7685 | 0.9356 |
| Credit | K-means | 0.0091 | 0.032 | 0.3741 |
| Credit | FCM | 0.0272 | 0.0317 | 0.3917 |
| Credit | DBSCAN | 0.0110 | 0.0006 | 0.389 |
| Credit | Ours | 0.1116 | 0.1073 | 0.3987 |
| Ionosphere | K-means | 0.011 | 0.0006 | 0.5783 |
| Ionosphere | FCM | 0.1713 | 0.1272 | 0.7094 |
| Ionosphere | DBSCAN | 0.2174 | 0.1426 | 0.3932 |
| Ionosphere | Ours | 0.2500 | 0.2017 | 0.7833 |
| Libras | K-means | 0.1837 | 0.1842 | 0.3389 |
| Libras | FCM | 0.0597 | 0.33 | 0.1778 |
| Libras | DBSCAN | 0.0025 | 0.2215 | 0.1000 |
| Libras | Ours | 0.5193 | 0.7144 | 0.6182 |
| Ecoli | K-means | 0.4542 | 0.5709 | 0.5565 |
| Ecoli | FCM | 0.3679 | 0.5619 | 0.497 |
| Ecoli | DBSCAN | 0.0080 | 0.005 | 0.4256 |
| Ecoli | Ours | 0.3937 | 0.4999 | 0.6100 |
| Segmentation | K-means | 0.0331 | 0.0736 | 0.2455 |
| Segmentation | FCM | 0.3875 | 0.5062 | 0.6100 |
| Segmentation | DBSCAN | 0.1067 | 0.3301 | 0.2939 |
| Segmentation | Ours | 0.5501 | 0.6996 | 0.6720 |
| Thyroid | K-means | 0.2145 | 0.3911 | 0.5721 |
| Thyroid | FCM | 0.4294 | 0.176 | 0.786 |
| Thyroid | DBSCAN | 0.3123 | 0.0356 | 0.4465 |
| Thyroid | Ours | 0.5964 | 0.5628 | 0.8950 |
| Wdbc | K-means | 0.0019 | 0.0052 | 0.5202 |
| Wdbc | FCM | 0.7299 | 0.6138 | 0.9279 |
| Wdbc | DBSCAN | 0.0274 | 0.0145 | 0.6098 |
| Wdbc | Ours | 0.6441 | 0.5295 | 0.9320 |
| Wine | K-means | 0.4483 | 0.4485 | 0.6461 |
| Wine | FCM | 0.3492 | 0.4075 | 0.6854 |
| Wine | DBSCAN | 0.2700 | 0.3137 | 0.5169 |
| Wine | Ours | 0.5831 | 0.6674 | 0.8118 |
- (1). By comparing the performance of our proposed three-way clustering algorithm with traditional clustering methods, such as k-means, FCM (Fuzzy C-Means), and DBSCAN (Density-Based Spatial Clustering of Applications with Noise), on ARI, AMI, and ACC, it can be found that our proposed algorithm demonstrates significant advantages on most datasets. Taking the Libras dataset as an example, after running the proposed algorithm, the resulting ARI, AMI, and ACC values are 0.5193, 0.7144, and 0.6182, respectively. In contrast, the ARI, AMI, and ACC values for the traditional k-means algorithm are only 0.1837, 0.1842, and 0.3389, respectively. This improvement is attributed to the dimensionality reduction of the original high-dimensional data, which maps it to a lower-dimensional space and thus reduces data complexity. The introduction of co-occurrence frequency enables more precise delineation of similar classes, allocating data points to core and fringe regions and better capturing the inherent structure of the data.
- (2). By comparing the proposed three-way clustering algorithm with other algorithms in terms of ARI, AMI, and ACC, we observed significant improvements in the proposed algorithm relative to others. Specifically, across all datasets, the proposed algorithm exhibited an average improvement of approximately 20% to 30% in ARI and ACC, and an average increase of about 15% to 35% in AMI. There are several potential reasons behind these improvements. Firstly, the proposed three-way clustering algorithm adopts an ensemble strategy, integrating concepts of data dimensionality reduction, co-occurrence frequencies, and similarity classes, thereby offering a more comprehensive consideration of the inherent structure of the data. Secondly, leveraging the single-linkage method of hierarchical clustering, the proposed three-way clustering algorithm effectively captures the degree of correlation among data points, resulting in more precise classification of data points into clusters. Additionally, by selecting the clustering result with the highest lifetime as the final merged result, the proposed three-way algorithm ensures the stability and consistency of the clustering results, rendering it more suitable for various data types and complex structures. The suboptimal performance on the Wdbc dataset may be due to the algorithm’s sensitivity to parameter settings, since suitable parameters may vary across datasets. Although our proposed algorithm shows significant improvements, certain algorithms may perform better under specific conditions due to their inherent characteristics. For example, algorithms like DBSCAN are particularly effective for datasets with noise and density variations, while hierarchical clustering can capture nested cluster structures. By comparing the actual runtime with the computational time complexity, it is concluded that the proposed algorithm strikes a balance between accuracy and computational efficiency. Although it is not the fastest, its robustness and ability to handle high-dimensional and noisy data make it a valuable tool in practical applications.
In summary, the proposed three-way clustering algorithm amalgamates ideas from data dimensionality reduction, co-occurrence frequency calculation, and similar class partitioning. Compared to traditional clustering algorithms, it demonstrates advantages in more nuanced data analysis and accurate clustering results, making it more feasible and effective in practical applications.
5. Conclusions
The theoretical contribution of this paper lies in the proposal of a novel three-way clustering framework that integrates dimensionality reduction, co-occurrence frequencies, and similarity classes with three-way clustering. The objective is to efficiently cluster heterogeneous data from multiple sources by leveraging inherent structural information. Initially, we employ principal component analysis (PCA) to reduce the dimensionality of the data, mapping high-dimensional data into a lower-dimensional space. This not only decreases computational complexity but also enhances clustering efficiency.
Subsequently, we introduce the concept of co-occurrence frequencies, considering the co-occurrence relationships between samples. By applying a threshold to the co-occurrence frequency, samples are classified into similar classes, which are then used to divide each cluster into core and fringe regions. This ensures that the proposed algorithm not only accurately describes the intrinsic structure of the data but also exhibits robustness. To further enhance the clustering process, we integrate these co-occurrence frequencies with a single-linkage hierarchical clustering method. This fusion enables us to construct a dendrogram that captures the similarity between different clusters. Lifetime analysis is then employed to select the most stable clustering result, ensuring consistency and robustness.
The practical contribution of this paper is the improvement in clustering accuracy. Experimental results demonstrate that the proposed algorithm significantly enhances clustering precision, especially when handling complex data structures and substantial noise interference. This proves its practical effectiveness in various real-world scenarios. The method shows significant advantages across multiple datasets, highlighting its versatility and robustness in dealing with diverse and high-dimensional data. This adaptability makes it suitable for a wide range of applications, from bioinformatics to market segmentation.
Although the algorithm demonstrates significant advantages across multiple datasets during experimental validation, it does not consistently exhibit the expected improvements on certain specific datasets. This discrepancy may arise due to a partial mismatch between data characteristics and algorithm design, necessitating further exploration and refinement.
In future research, we will focus on the following aspects:
- (1). Adaptability of parameter selection: The subjective nature of parameter thresholds in the algorithm may impact the stability of experimental results. To enhance algorithm robustness, it is essential to consider more objective and adaptive parameter selection methods that accommodate different dataset requirements and application scenarios.
- (2). Improving the quality of base clustering: The generation of base clustering using different feature subsets may lead to poor-quality results, negatively affecting the final ensemble clustering outcome. To enhance the quality of base clustering, we can employ automatic evaluation mechanisms based on the data’s intrinsic structure or utilize advanced clustering performance metrics. Additionally, introducing other methods such as setting evaluation functions will help eliminate the impact of low-quality base clustering, effectively improving the overall performance of ensemble clustering.
- (3). Adaptation improvements for specific datasets: The observation that the algorithm did not consistently exhibit expected improvements on specific datasets suggests a potential mismatch between data characteristics and algorithm design. Further work can include adapting the algorithm specifically for certain datasets, enhancing its generality and adaptability.