Nearest Descent, In-Tree, and Clustering
Abstract
1. Introduction
- Unlike K-means and GMM, the proposed method does not suffer from the initialization problem;
- Unlike Single Linkage, the proposed method is not very sensitive to noise and outliers;
- Unlike K-means, AP, Complete Linkage, and Average Linkage, the proposed method is not very sensitive to the cluster shape;
- Compared with MeanShift, Ncut, and N-J-W, the proposed method has lower time complexity (being , where N denotes the number of samples in a dataset);
- Compared with DBSCAN, the proposed method is not very sensitive to the parameter setting and cluster scale;
- Unlike K-means, GMM, and MeanShift, the proposed method is not limited to handling numerical datasets.
- We reveal the nature of the common idea in determining the parent nodes, i.e., Nearest Descent (from the perspective of physically inspired clustering) or Nearest Ascent (from the perspective of density-based clustering), which helps make a direct connection with other gradient-descent-based (or gradient-ascent-based) clustering methods;
- We find a more effective way to define the edge weight such that the proposed method is less sensitive to data dimension, outliers, and cluster scale;
- We give a more sophisticated definition of the parent nodes and strictly prove that the constructed graph is an in-tree (rather than a minimal spanning tree);
- We propose a visualization-based strategy to validate the effectiveness of each automatic edge-cutting method, which may help increase the reliability of the clustering results in practice;
- We conduct extensive experiments to reveal the characteristics of the proposed method with regard to its sensitivity to noise, outliers, cluster scales, parameters, and sample order.
2. Related Work
2.1. Physically Inspired Clustering
2.2. Density-Based Clustering
- (P1) Sensitive to the parameter (i.e., the step length) in the iteration;
- (P2) Involving time-consuming iteration (the time-complexity is , where N denotes the size of the dataset and T the number of numerical iterations);
- (P3) Only applicable to numerical datasets (this is due to the nature of the numerical iteration of the method);
- (P4) Not always guaranteed to converge. (The iteration may oscillate around the optimum, a well-known problem for gradient ascent.)
- (P5) Sensitive to the result of density estimation or the kernel bandwidth (an under-smoothed density surface could lead to the over-partitioned clustering result, while an over-smoothed density surface could lead to the under-partitioned clustering result).
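Problems P1 and P4 can be illustrated with a toy example. The sketch below is not taken from the paper: it runs plain gradient ascent on the hypothetical one-dimensional surface f(x) = -x², whose maximum lies at x = 0. The function name `gradient_ascent` and the two step lengths are assumptions chosen only for illustration.

```python
def gradient_ascent(x0, step, n_iter=6):
    """Plain gradient ascent on the toy surface f(x) = -x**2 (maximum at x = 0)."""
    xs = [x0]
    for _ in range(n_iter):
        grad = -2.0 * xs[-1]              # f'(x) = -2x
        xs.append(xs[-1] + step * grad)   # move along the gradient
    return xs


small = gradient_ascent(1.0, step=0.25)   # each update halves x, so x approaches 0
large = gradient_ascent(1.0, step=1.0)    # each update flips the sign of x
```

With the small step the iterate shrinks toward the optimum, whereas with the large step it jumps between 1 and -1 indefinitely, which is the oscillation described in P4 and the step-length sensitivity described in P1.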
2.3. Hierarchical Clustering
3. Method
3.1. Overview of the Proposed Method
3.2. Basic Implementation
- Stage I: organize the dataset into the In-Tree structure. This stage contains the following steps:
- Step 1, compute the distance between each pair of points :
- Step 2, compute the potential at each point :
- Step 3, determine the candidate parent node set of each point :
- Step 4, determine the parent node of each point :
- Step 5, build a directed graph, denoted as , for which the node set and the edge set . In Section 3.3.2, we will demonstrate that is an In-Tree, in which each point i has one and only one directed edge started from it and there is a particular node in , called the root node (denoted as r), for which . Note that (1) the min operation outside the parentheses of Equation (9) serves to guarantee the uniqueness of ; (2) the second term (i.e., ) of serves to select the nodes with smaller indexes among the nodes with same potential as node i. However, if there is no such node with the same potential as node i (this case is very common in practice), is actually determined by .
- Stage II: cut the inter-cluster edges. This stage contains the following steps:
- Step 6, define the edge weight. First, we have the following observation: each edge in is associated with the following three features , , and , where
- denotes the distance between the start and end nodes of (i.e., );
- () denotes the magnitude of the potential at the start node of (accordingly, can be viewed as the density at the start node of );
- denotes the number of points in the circle taking the start point of edge as the center and the distance between the start and end nodes of as the radius.
Since the inter-cluster edges are usually much longer than the intra-cluster edges, the distance could be used as the weight of the edge to determine the inter-cluster edges. However, such -based cutting is quite sensitive to outliers and cluster scales (we will demonstrate this in the experiments). To solve this problem, we also propose other approaches to define the edge weight. Specifically, the weight of the edge can also be defined as one of the following variables and .
- Step 7, determine the inter-cluster edges. After defining the edge weight (by either , , or ), we then regard the edges with the largest edge weights as the inter-cluster edges, where M is the number of clusters. We will compare the above three different edge-weight definition methods in the experiments. Therefore, if the cluster number is pre-defined, the inter-cluster edges can be determined automatically. Otherwise, one can determine the inter-cluster edges interactively following the Decision-Graph-based strategy proposed by Rodriguez and Laio [29]. Specifically, one can represent each directed edge by two features such as and . Then, all the edges can be displayed in a 2D scatter plot (Figure 2a). Since only the inter-cluster edges have large values on both and , they will pop out in the scatter plot and thus can be interactively determined. Actually, similar results can be obtained if replacing by (Figure 2b) or (Figure 2c). Nevertheless, it is noteworthy that, in the experiments, we will not use the Decision-Graph-based cutting method to determine the inter-cluster edges, since sometimes it is hard for users to make a decision when the Decision Graph does not show very clear pop-out points. Instead, we will use the Decision Graph (referring to the 2D scatter plot based on features W and ) to validate the effectiveness of the automatic cutting methods (we will detail this in Section 4).
- Step 8, remove the inter-cluster edges. Assume that are the indexes of the start nodes of the determined inter-cluster edges. Then, the inter-cluster edges are removed by setting (). Each node will become a new root node of the generated graph.
- Stage III: make cluster assignments. This stage contains the following steps:
- Step 9, update the parent nodes. In each round of updating, the parent node of each node i () is updated in this way where denotes the parent node of node i in the t-th round of updating (), and . The updating stops in a certain round when the parent nodes of all nodes no longer change, which means each node is directly linked to its root node (i.e., at this moment, stores the index of the root node that node i reaches), as shown in Figure 1h.
- Step 10, obtain the clustering result (a partition of X). Let , where r is the root node of . The original dataset X is divided into M clusters as follows: , where ().
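The ten steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `nd_c` is hypothetical, the potential is assumed to be the negated exponential-kernel density (the exact equation is not reproduced in this text), a root is stored as its own parent, and the edge weight is the plain distance feature, i.e., the ND-C(1) variant.

```python
import math


def nd_c(points, sigma, n_clusters):
    """Sketch of ND-C with distance-based edge weights (the ND-C(1) variant)."""
    n = len(points)
    # Step 1: pairwise Euclidean distances (another metric could be plugged in).
    d = [[math.dist(p, q) for q in points] for p in points]
    # Step 2: potential, ASSUMED here to be the negated exponential-kernel density.
    pot = [-sum(math.exp(-d[i][j] / sigma) for j in range(n)) for i in range(n)]
    # Steps 3-5: the parent of i is the nearest node with a lower potential, or with
    # the same potential and a smaller index; distance ties go to the smaller index.
    parent = list(range(n))               # convention: a root is its own parent
    for i in range(n):
        cand = [j for j in range(n)
                if pot[j] < pot[i] or (pot[j] == pot[i] and j < i)]
        if cand:
            parent[i] = min(cand, key=lambda j: (d[i][j], j))
    # Steps 6-8: cut the (n_clusters - 1) heaviest edges; their start nodes become roots.
    starts = [i for i in range(n) if parent[i] != i]
    starts.sort(key=lambda i: d[i][parent[i]], reverse=True)
    for i in starts[:n_clusters - 1]:
        parent[i] = i
    # Step 9: replace every parent by the parent's parent until nothing changes.
    while True:
        updated = [parent[parent[i]] for i in range(n)]
        if updated == parent:
            break
        parent = updated
    # Step 10: nodes that share a root form one cluster.
    return parent


pts = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
labels = nd_c(pts, sigma=1.0, n_clusters=2)
```

In this toy run the only long edge joins the two groups, so cutting it leaves one root per group. Other edge weights can be tried by changing the sort key in Steps 6-8.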
Algorithm 1 ND-C
Require:: test dataset; : kernel bandwidth; M: cluster number;
Ensure:; // a partition of containing M clusters;
- Step 1: denotes the ()-entry of the following matrix:
- Step 2:, , , , , , and ;
- Step 3:
- ;
- ;
- ;
- ;
- ;
- ;
- ;
- Step 4:, , , , , , and ;
- Step 5: for ,
- ;
- ;
- .
- Step 6:
- , , , , , and ;
- , , , , , and ;
- , , , , , and ;
- Step 7: ; note that for the test data here, no matter which feature (, , or ) was used to define the weight of each edge , .
- Step 8: , , , , , , and ;
- Step 9: , , , , , , and ;
- Step 10: , , , and .
3.3. Analysis
3.3.1. Nearest Descent
- “Nearest” specifies the local choice (or criterion).
- “Descent” specifies the global direction (i.e., in the descending direction of potential), which serves as a constraint to the local choice.
3.3.2. In-Tree
- (a) Only one node has outdegree 0;
- (b) Any other node has outdegree 1;
- (c) There is no cycle in it;
- (d) It is a connected graph.
- Like MST and k-NNG, this physically inspired In-Tree follows the cluster structure in the dataset well.
- Although there are some inter-cluster edges in this physically inspired In-Tree, those edges are quite distinguishable, in contrast to the cases in MST and k-NNG.
- Removing any edge in the In-Tree will divide the In-Tree into two sub-graphs, each still being an In-Tree.
- There is one and only one directed path between each non-root node and the root node.
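Conditions (a) to (d) can be checked mechanically for any graph stored as a parent array. The sketch below is illustrative only: the function name `is_in_tree` and the convention that a root is stored as its own parent are assumptions, not notation from the paper.

```python
def is_in_tree(parent):
    """Check conditions (a)-(d) for a digraph given as parent[i] (root: parent[i] == i)."""
    n = len(parent)
    roots = [i for i in range(n) if parent[i] == i]
    if len(roots) != 1:                   # (a), (b): exactly one node has outdegree 0
        return False
    for start in range(n):                # (c), (d): every walk must end at the root
        seen, node = set(), start
        while parent[node] != node:
            if node in seen:              # a revisited node means a directed cycle
                return False
            seen.add(node)
            node = parent[node]
    return True


ok = is_in_tree([0, 0, 1, 1])             # edges 1->0, 2->1, 3->1 with root 0
cyclic = is_in_tree([0, 2, 1, 0])         # nodes 1 and 2 point at each other
```

Because every non-root node has exactly one outgoing edge, a walk that always reaches the single root also implies that the directed path from each node to the root is unique.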
3.3.3. Time Complexity
3.3.4. Comparison with Three Similar Clustering Methods
- We find a more effective feature (i.e., feature ) to define the edge weight such that the proposed method is less sensitive to the data dimension, outliers, and cluster scale.
- We give a more sophisticated definition of the parent nodes and strictly prove that the constructed graph is an in-tree (rather than a minimal spanning tree).
- We combine the proposed method with RL’s method, using RL’s Decision Graph to validate the effectiveness of each automatic edge cutting method.
- We reveal the nature of the common idea, i.e., Nearest Ascent (from the perspective of density) or Nearest Descent (from the perspective of potential), which helps make a direct connection and significant comparison with the classical ones such as gradient-ascent-based density clustering methods;
- We introduce the graph structure, In-Tree, into the clustering field;
- We present a more general framework (i.e., first estimate the potential, then construct the In-Tree, and lastly remove the inter-cluster edges in the In-Tree), in which RL’s method can be viewed as a scatter-plot-based method to determine the inter-cluster edges.
4. Experiments
4.1. Comparison with Other Methods
- S-Link: Single Linkage hierarchical clustering method [61].
- A-Link: Average Linkage hierarchical clustering method (also called the unweighted pair group method with arithmetic means algorithm) [62].
- PHA: a fast potential-based hierarchical agglomerative clustering method [27]. In the experiments, we used two versions of PHA, denoted as PHA-1 and PHA-2, where
- PHA-1 uses the default potential estimation method of PHA;
- PHA-2 uses the same potential estimation as the proposed method.
- QuickShift: a density-based hierarchical agglomerative clustering method [28]. In the experiments, QuickShift used the same kernel (i.e., an exponential kernel) as the proposed method to estimate the density at each node.
- RL: a density-peak-based clustering method proposed by Rodriguez and Laio [29]. In the experiments, we used two versions of RL, denoted as RL-1 and RL-2, where
- RL-1 uses the default Gaussian kernel to estimate the density;
- RL-2 uses the same kernel function (i.e., the exponential kernel function) as the proposed method to estimate the density.
- DLORE_DP: a recent variant of RL [63].
- ND-C: the proposed method. We used three automatic versions of ND-C, denoted as ND-C(1), ND-C(2), and ND-C(3), which only differ in how they define the edge weight. Specifically,
- For ND-C(1), the distance feature is used to define the weight of each edge in the constructed in-tree;
- For ND-C(2), feature is used to define the weight of each edge in the constructed in-tree;
- For ND-C(3), feature is used to define the weight of each edge in the constructed in-tree;
where , and are the three features associated with each edge in the constructed in-tree; see more details in Section 3.2 (stage II).
- ND-C(1), ND-C(2), and ND-C(3) achieve the same NMI and ARI scores on all the synthetic datasets (in italics). On the real-world datasets (which are high-dimensional datasets), however, ND-C(1) performs much worse than ND-C(3). By taking both distance and potential features into account, ND-C(2) performs better overall than ND-C(1), but still worse overall than ND-C(3).
- In addition to outperforming ND-C(1) and ND-C(2), ND-C(3) achieves better overall performance than all the other compared methods.
4.2. Further Comparison of ND-C(1), ND-C(2), and ND-C(3) and The Usage of the Visualization Tool for Validation
5. Conclusions
6. Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
ND | Nearest Descent |
NA | Nearest Ascent |
ND-C | Nearest Descent based Clustering |
HC | Hierarchical clustering |
CSPV | Clustering by Sorting Potential Values |
PHA | Potential-based Hierarchical Agglomerative |
GA-C | Gradient-Ascent-based Clustering |
GGA | Graph-theoretical-based Gradient Ascent |
RL | Rodriguez and Laio |
GD | Gradient Descent |
MST | Minimal Spanning Tree |
k-NNG | k-Nearest-Neighbor Graph |
S-Link | Single Linkage hierarchical clustering |
A-Link | Average Linkage hierarchical clustering |
NMI | Normalized Mutual Information |
ARI | Adjusted Rand Index |
Dim | Dimensions |
CN | Cluster Number |
CV | Computer Vision |
Appendix A. Density-Border-Based Clustering
Appendix B. About the Data Index
Appendix C. Proof of the In-Tree
- (P1) If G is a cycle, then, for any node , = 2. Typically, when the cycle is a directed cycle, .
- (P2) If G is a tree, then , where and denote the number of edges and the number of nodes, respectively.
- (P3) If G is a digraph, then , and.
- (F1) For each node i in , () is defined as the points with smaller potentials or the same potential but smaller data indexes;
- (F2) For each node i in with , its parent node is unique, i.e., ;
- (F3) For each node i in with , it has no parent node, i.e., .
- (i) Let be the globally minimal potential among the potentials at all nodes, i.e., . According to F1, for each node i with , . If only one node i has the minimum potential value, then, according to F1, . If several nodes have the minimum potential, the node with the smallest index among them has empty . In conclusion, there is always one and only one node, denoted as r, with , for which, according to F3, . Condition (a) is met.
- (ii) According to F1, for any other node , there will be at least one node (e.g., node r) which makes . Then, according to F2, we have . Condition (b) is met.
- (iii) Suppose there is a cycle in digraph . If is a directed cycle , then according to F1, the potential values for these nodes should meet , which is true only when . Then according to F1 again, the data indexes for these nodes should meet , resulting in , an obvious contradiction. If is not a directed cycle, we claim that there exists at least one node of , say w, then, according to P1, there will be either or in this cycle . If , then it will contradict the result in (ii); if , since every node v on a cycle has degree 2, i.e., , and according to P3,
- (iv) Suppose is not connected, e.g., containing T connected sub-graphs , , ⋯, . Since digraph has no cycle, there should be no cycle in each subgraph as well. Hence, each subgraph is at least a tree. Assume the node r of outdegree 0 is in , then nodes in any other subgraph are all with outdegree 1, thus, according to P3,
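The role of F1 in parts (i) and (iii) can be made concrete with a small sketch. It is an illustration under assumptions rather than part of the proof: the helper name `candidate_sets` and the potential values are hypothetical. Comparing the pairs (potential, index) lexicographically is the same as the rule "smaller potential, or the same potential and a smaller index", and this comparison is a strict total order even when potentials are tied.

```python
def candidate_sets(pot):
    """F1: node j is a candidate for i if (pot[j], j) precedes (pot[i], i)."""
    n = len(pot)
    return [[j for j in range(n) if (pot[j], j) < (pot[i], i)] for i in range(n)]


pot = [-2.0, -3.0, -3.0, -1.0, -3.0]      # hypothetical potentials with a three-way tie
cands = candidate_sets(pot)
roots = [i for i, c in enumerate(cands) if not c]   # nodes with an empty candidate set
```

Here nodes 1, 2, and 4 share the minimal potential, yet only node 1 (the smallest index among them) has an empty candidate set, matching part (i). Since every edge leads to a node with a strictly smaller (potential, index) pair, following edges can never return to the starting node, which is the contradiction used for directed cycles in part (iii).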
Appendix D. NMI and ARI
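As a hedged illustration of one of the two scores, the sketch below computes the Adjusted Rand Index in its standard pair-counting form due to Hubert and Arabie (see the reference list). The function name `adjusted_rand_index` and the handling of the degenerate case are assumptions, and at least two samples are assumed. NMI is not sketched here because its normalization has several variants.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """ARI between two partitions given as equal-length label sequences (n >= 2)."""
    n = len(labels_true)
    cells = Counter(zip(labels_true, labels_pred))   # contingency-table cells
    rows = Counter(labels_true)                      # cluster sizes, first partition
    cols = Counter(labels_pred)                      # cluster sizes, second partition
    index = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / comb(n, 2)      # value expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:                        # degenerate: treated as agreement
        return 1.0
    return (index - expected) / (max_index - expected)


same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])     # same partition, renamed labels
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])  # worse than chance
```

The score is 1 for identical partitions regardless of label names and can drop below 0 when the agreement is worse than expected by chance, which is why slightly negative ARI values can occur.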
References
1. Jain, A.K. Data clustering: 50 years beyond K-means. Pattern Recognit. Lett. 2010, 31, 651–666.
2. Xu, R.; Wunsch, D. Survey of clustering algorithms. IEEE Trans. Neural Netw. 2005, 16, 645–678.
3. Theodoridis, S.; Koutroumbas, K. Pattern Recognition, 4th ed.; Elsevier: Amsterdam, The Netherlands, 2009.
4. Handl, J.; Knowles, J.; Kell, D.B. Computational cluster validation in post-genomic data analysis. Bioinformatics 2005, 21, 3201–3212.
5. Macqueen, J. Some Methods for Classification and Analysis of Multivariate Observations. In Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability; University of California Press: Berkeley, CA, USA, 1967; pp. 281–297.
6. Frey, B.J.; Dueck, D. Clustering by passing messages between data points. Science 2007, 315, 972–976.
7. Eisen, M.B.; Spellman, P.T.; Brown, P.O.; Botstein, D. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. USA 1998, 95, 14863–14868.
8. McLachlan, G.; Peel, D. Finite Mixture Models: Wiley Series in Probability and Mathematical Statistics; John Wiley & Sons: New York, NY, USA, 2000.
9. Shi, J.; Malik, J. Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 888–905.
10. Ng, A.Y.; Jordan, M.I.; Weiss, Y. On Spectral Clustering: Analysis and an algorithm. Proc. Adv. Neural Inf. Process. Syst. 2002, 14, 849–856.
11. Cheng, Y. Mean shift, mode seeking, and clustering. IEEE Trans. Pattern Anal. Mach. Intell. 1995, 17, 790–799.
12. Comaniciu, D.; Meer, P. Mean shift: A robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 603–619.
13. Fukunaga, K.; Hostetler, L. The estimation of the gradient of a density function, with applications in pattern recognition. IEEE Trans. Inf. Theory 1975, 21, 32–40.
14. Ester, M.; Kriegel, H.P.; Sander, J.; Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the 2nd ACM International Conference Knowledge Discovery and Data Mining; AAAI Press: Portland, OR, USA, 1996; Volume 96, pp. 226–231.
15. Lin, F.; Cohen, W.W. Power iteration clustering. In Proceedings of the 27th International Conference Machine Learning, Haifa, Israel, 21–24 June 2010; pp. 655–662.
16. Carreira-Perpinan, M.A. Acceleration strategies for Gaussian mean-shift image segmentation. In Proceedings of the Conference Computer Vision and Pattern Recognition, New York, NY, USA, 17–22 June 2006; pp. 1160–1167.
17. Elgammal, A.; Duraiswami, R.; Davis, L.S. Efficient kernel density estimation using the fast gauss transform with applications to color modeling and tracking. IEEE Trans. Pattern Anal. Mach. Intell. 2003, 25, 1499–1504.
18. Georgescu, B.; Shimshoni, I.; Meer, P. Mean shift based clustering in high dimensions: A texture classification example. In Proceedings of the 9th IEEE International Conference Computer Vision, Los Alamitos, CA, USA, 13–16 October 2003; pp. 456–463.
19. Paris, S.; Durand, F. A topological approach to hierarchical segmentation using mean shift. In Proceedings of the IEEE Conference Computer Vision and Pattern Recognition, Minneapolis, MN, USA, 18–23 June 2007; pp. 1–8.
20. Ertöz, L.; Steinbach, M.; Kumar, V. Finding clusters of different sizes, shapes, and densities in noisy, high dimensional data. In Proceedings of the 3rd SIAM International Conference Data Mining, San Francisco, CA, USA, 1–3 May 2003; pp. 47–58.
21. Pei, T.; Jasra, A.; Hand, D.J.; Zhu, A.X.; Zhou, C. DECODE: A new method for discovering clusters of different densities in spatial data. Data Min. Knowl. Discov. 2009, 18, 337–369.
22. Kriegel, H.P.; Kröger, P.; Sander, J.; Zimek, A. Density-based clustering. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2011, 1, 231–240.
23. Kundu, S. Gravitational clustering: A new approach based on the spatial distribution of the points. Pattern Recognit. 1999, 32, 1149–1160.
24. Gomez, J.; Dasgupta, D.; Nasraoui, O. A new gravitational clustering algorithm. In Proceedings of the 3rd SIAM International Conference Data Mining, San Francisco, CA, USA, 1–3 May 2003; pp. 83–94.
25. Sanchez, M.A.; Castillo, O.; Castro, J.R.; Melin, P. Fuzzy granular gravitational clustering algorithm for multivariate data. Inf. Sci. 2014, 279, 498–511.
26. Bahrololoum, A.; Nezamabadi-pour, H.; Saryazdi, S. A data clustering approach based on universal gravity rule. Eng. Appl. Artif. Intell. 2015, 45, 415–428.
27. Lu, Y.; Wan, Y. PHA: A fast potential-based hierarchical agglomerative clustering method. Pattern Recognit. 2013, 46, 1227–1239.
28. Vedaldi, A.; Soatto, S. Quick Shift and Kernel Methods for Mode Seeking. In Proceedings of the 10th European Conference Computer Vision, Marseille, France, 12–18 October 2008; pp. 705–718.
29. Rodriguez, A.; Laio, A. Clustering by fast search and find of density peaks. Science 2014, 344, 1492–1496.
30. Wright, W.E. Gravitational clustering. Pattern Recognit. 1977, 9, 151–166.
31. Wang, Z.; Yu, Z.; Chen, C.P.; You, J.; Gu, T.; Wong, H.S.; Zhang, J. Clustering by Local Gravitation. IEEE Trans. Cybern. 2018, 48, 1383–1396.
32. Lu, Y.; Wan, Y. Clustering by Sorting Potential Values (CSPV): A novel potential-based clustering method. Pattern Recogn. 2012, 45, 3512–3522.
33. Ruta, D.; Gabrys, B. A framework for machine learning based on dynamic physical fields. Nat. Comput. 2009, 8, 219–237.
34. Menardi, G. A review on modal clustering. Int. Stat. Rev. 2016, 84, 413–433.
35. Hinneburg, A.; Keim, D.A. An efficient approach to clustering in large multimedia databases with noise. In Proceedings of the 4th ACM International Conference Knowledge Discovery and Data Mining, New York, NY, USA, 27–31 August 1998; Volume 98, pp. 58–65.
36. Koontz, W.L.; Narendra, P.M.; Fukunaga, K. A graph-theoretic approach to nonparametric cluster analysis. IEEE Trans. Comput. 1976, 100, 936–944.
37. Müllner, D. fastcluster: Fast hierarchical, agglomerative clustering routines for R and Python. J. Stat. Softw. 2013, 53, 1–18.
38. Tenenbaum, J.B.; De Silva, V.; Langford, J.C. A global geometric framework for nonlinear dimensionality reduction. Science 2000, 290, 2319–2323.
39. Gionis, A.; Mannila, H.; Tsaparas, P. Clustering aggregation. ACM Trans. Knowl. Discov. Data 2007, 1, 1–30.
40. Gross, J.L.; Yellen, J. Handbook of Graph Theory; CRC Press: Boca Raton, FL, USA, 2004.
41. Zahn, C.T. Graph-theoretical methods for detecting and describing gestalt clusters. IEEE Trans. Comput. 1971, 100, 68–86.
42. Karypis, G.; Han, E.H.; Kumar, V. Chameleon: Hierarchical clustering using dynamic modeling. Computer 1999, 32, 68–75.
43. Xu, Y.; Olman, V.; Xu, D. Clustering gene expression data using a graph-theoretic approach: An application of minimum spanning trees. Bioinformatics 2002, 18, 536–545.
44. Franti, P.; Virmajoki, O.; Hautamaki, V. Fast agglomerative clustering using a k-nearest neighbor graph. IEEE Trans. Pattern Anal. Mach. Intell. 2006, 28, 1875–1881.
45. Wieland, S.C.; Brownstein, J.S.; Berger, B.; Mandl, K.D. Density-equalizing Euclidean minimum spanning trees for the detection of all disease cluster shapes. Proc. Natl. Acad. Sci. USA 2007, 104, 9404–9409.
46. Cannistraci, C.V.; Ravasi, T.; Montevecchi, F.M.; Ideker, T.; Alessio, M. Nonlinear dimension reduction and clustering by Minimum Curvilinearity unfold neuropathic pain and tissue embryological classes. Bioinformatics 2010, 26, i531–i539.
47. Zhong, C.; Miao, D.; Wang, R. A graph-theoretical clustering method based on two rounds of minimum spanning trees. Pattern Recognit. 2010, 43, 752–766.
48. Zhong, C.; Miao, D.; Fränti, P. Minimum spanning tree based split-and-merge: A hierarchical clustering method. Inf. Sci. 2011, 181, 3397–3410.
49. Yu, Z.; Liu, W.; Liu, W.; Peng, X.; Hui, Z.; Kumar, B.V.K.V. Generalized transitive distance with minimum spanning random forest. In Proceedings of the 24th International Joint Conference Artificial Intelligence, Buenos Aires, Argentina, 25–31 July 2015; pp. 2205–2211.
50. Yu, Z.; Liu, W.; Liu, W.; Yang, Y.; Li, M.; Kumar, B.V. On Order-Constrained Transitive Distance Clustering. In Proceedings of the 30th AAAI Conference Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; pp. 2293–2299.
51. Preuss, M.; Schönemann, L.; Emmerich, M. Counteracting genetic drift and disruptive recombination in (μ,+λ)-EA on multimodal fitness landscapes. In Proceedings of the 7th Annual Conference Genetic and Evolutionary Computation. ACM, Washington, DC, USA, 25–29 June 2005; pp. 865–872.
52. Blake, C.; Merz, C. UCI Repository of Machine Learning Databases. 1998. Available online: https://archive.ics.uci.edu/ml/index.php (accessed on 3 February 2022).
53. Assfalg, M.; Bertini, I.; Colangiuli, D.; Luchinat, C.; Schäfer, H.; Schütz, B.; Spraul, M. Evidence of different metabolic phenotypes in humans. Proc. Natl. Acad. Sci. USA 2008, 105, 1420–1424.
54. Fu, L.; Medico, E. FLAME, a novel fuzzy clustering method for the analysis of DNA microarray data. BMC Bioinform. 2007, 8, 3.
55. Chang, H.; Yeung, D.Y. Robust path-based spectral clustering. Pattern Recognit. 2008, 41, 191–203.
56. Fränti, P.; Sieranoja, S. K-means properties on six clustering benchmark datasets. Appl. Intell. 2018, 48, 4743–4759.
57. Veenman, C.J.; Reinders, M.J.T.; Backer, E. A maximum variance cluster algorithm. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 1273–1280.
58. Samaria, F.; Harter, A. Parameterisation of a stochastic model for human face identification. In Proceedings of the 1994 IEEE Workshop on Applications of Computer Vision, Sarasota, FL, USA, 5–7 December 1994; pp. 138–142.
59. Nene, S.A.; Nayar, S.K.; Murase, H. Columbia Object Image Library (COIL-20); Technical Report CUCS-005-96; 1996. Available online: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.641.1322&rep=rep1&type=pdf (accessed on 3 February 2022).
60. Nene, S.A.; Nayar, S.K.; Murase, H. Columbia Object Image Library (COIL-100); Technical Report CUCS-006-96; 1996. Available online: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.360.6420&rep=rep1&type=pdf (accessed on 3 February 2022).
61. Sneath, P.H. The application of computers to taxonomy. Microbiology 1957, 17, 201–226.
62. Sneath, P.H.; Sokal, R.R. Numerical Taxonomy. The Principles and Practice of Numerical Classification; W. H. Freeman: San Francisco, CA, USA, 1973.
63. Cheng, D.; Zhang, S.; Huang, J. Dense members of local cores-based density peaks clustering algorithm. Knowl. Based Syst. 2020, 193, 105454.
64. Kvalseth, T.O. Entropy and correlation: Some comments. IEEE Trans. Syst. Man Cybern. 1987, 17, 517–519.
65. Hubert, L.; Arabie, P. Comparing partitions. J. Classif. 1985, 2, 193–218.
66. Van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.
67. Shneiderman, B. The big picture for big data: Visualization. Science 2014, 343, 730.
68. Hartigan, J.A.; Hartigan, J. Clustering Algorithms; Wiley: New York, NY, USA, 1975; Volume 209.
69. Ankerst, M.; Breunig, M.M.; Kriegel, H.P.; Sander, J. OPTICS: Ordering points to identify the clustering structure. In Proceedings of the 1999 ACM SIGMOD International Conference on Management of Data, Philadelphia, PA, USA, 1–3 June 1999; pp. 49–60.
70. Campello, R.J.; Moulavi, D.; Zimek, A.; Sander, J. Hierarchical density estimates for data clustering, visualization, and outlier detection. ACM Trans. Knowl. Discov. Data 2015, 10, 5.
71. Sander, J.; Qin, X.; Lu, Z.; Niu, N.; Kovarsky, A. Automatic extraction of clusters from hierarchical clustering representations. In Pacific-Asia Conference on Knowledge Discovery and Data Mining; Springer: Berlin/Heidelberg, Germany, 2003; pp. 75–87.
72. McInnes, L.; Healy, J. Accelerated Hierarchical Density Clustering. arXiv 2017, arXiv:1705.07321.
73. Gross, J.L.; Yellen, J. Graph Theory and Its Applications; CRC Press: Boca Raton, FL, USA, 2005.
Notation | Description |
---|---|
Test dataset (); | |
X | Index set of (); |
Size of a set; | |
N | Size of dataset ; |
M | Number of clusters |
Distance between samples and ; | |
Potential at node i; | |
Density at node i; | |
Candidate parent node set; | |
Parent node of node i; | |
An in-tree; | |
Node set of a graph; | |
Edge set of a graph; | |
Directed edge started from node i and ended at node ; | |
The magnitude of the potential at the start node of (); | |
Distance () between the start and end nodes of (); | |
The number of nodes in the circle taking the start node of edge () as the center and the distance between the start and end nodes of as the radius; | |
The i-th cluster; | |
A partition of ; | |
Kernel bandwidth. In the experiments, we fixed to the percentile of all the pair-wise distances. |
Method | Parent Node () |
---|---|
QuickShift | |
RL | |
NA |
Index | Data Name | Type | N | Dim | CN | Source | Field |
---|---|---|---|---|---|---|---|
1 | 2G_e | Synthetic | 600 | 2 | 2 | TS | — |
2 | 3G | Synthetic | 300 | 2 | 3 | TS | — |
3 | AGG | Synthetic | 788 | 2 | 7 | [39] | — |
4 | Flame | Synthetic | 240 | 2 | 2 | [54] | — |
5 | Spiral | Synthetic | 312 | 2 | 3 | [55] | — |
6 | 2G_u | Synthetic | 1050 | 2 | 2 | TS | — |
7 | S1 | Synthetic | 5000 | 2 | 15 | [56] | — |
8 | S2 | Synthetic | 5000 | 2 | 15 | [56] | — |
9 | UB | Synthetic | 6500 | 2 | 8 | [56] | — |
10 | R15 | Synthetic | 600 | 2 | 15 | [57] | — |
11 | Banknote | RW | 1372 | 4 | 2 | [52] | CV |
12 | OrlFace | RW | 400 | 10304 | 40 | [58] | CV |
13 | Coil20 | RW | 1440 | 1024 | 20 | [59] | CV |
14 | Mfeat | RW | 2000 | 649 | 10 | [52] | CV |
15 | MetRef | RW | 873 | 375 | 22 | [53] | Biology |
16 | USPS | RW | 11,000 | 256 | 10 | [52] | CV |
17 | Pendigits | RW | 10,992 | 16 | 10 | [52] | CV |
18 | COIL100 | RW | 7200 | 1024 | 100 | [60] | CV |
Dataset | S-Link | A-Link | PHA-1 | PHA-2 | QuickShift | RL-1 | RL-2 | DLORE_DP | ND-C(1) | ND-C(2) | ND-C(3) |
---|---|---|---|---|---|---|---|---|---|---|---|
2G_e | 0.00 | 0.00 | 0.98 | 0.98 | 0.98 | 0.98 | 0.98 | 0.98 | 0.98 | 0.98 | 0.98 |
3G | 0.58 | 0.58 | 0.97 | 0.97 | 0.97 | 0.97 | 0.97 | 0.97 | 0.97 | 0.97 | 0.97 |
AGG | 0.80 | 0.99 | 1.00 | 1.00 | 1.00 | 0.87 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
Flame | 0.01 | 0.47 | 0.46 | 1.00 | 1.00 | 0.40 | 1.00 | 0.90 | 1.00 | 1.00 | 1.00 |
Spiral | 1.00 | 0.00 | 1.00 | 1.00 | 1.00 | 0.70 | 1.00 | 0.63 | 1.00 | 1.00 | 1.00 |
2G_u | 0.00 | 0.53 | 0.96 | 0.96 | 0.96 | 0.02 | 0.96 | 0.81 | 0.96 | 0.96 | 0.96 |
S1 | 0.66 | 0.98 | 0.99 | 0.99 | 0.99 | 0.95 | 0.99 | 0.91 | 0.99 | 0.99 | 0.99 |
S2 | 0.00 | 0.93 | 0.93 | 0.94 | 0.94 | 0.90 | 0.94 | 0.64 | 0.94 | 0.94 | 0.94 |
UB | 0.98 | 1.00 | 1.00 | 1.00 | 1.00 | 0.65 | 1.00 | 0.74 | 1.00 | 1.00 | 1.00 |
R15 | 0.80 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 |
Banknote | 0.00 | 0.09 | 0.03 | 0.17 | 0.17 | 0.05 | 0.93 | 0.02 | 0.17 | 0.93 | 0.93 |
OrlFace | 0.41 | 0.77 | 0.57 | 0.57 | 0.57 | 0.74 | 0.83 | 0.71 | 0.57 | 0.83 | 0.89 |
Coil20 | 0.39 | 0.47 | 0.44 | 0.48 | 0.48 | 0.41 | 0.72 | 0.47 | 0.48 | 0.72 | 0.71 |
Mfeat | 0.01 | 0.55 | 0.58 | 0.55 | 0.55 | 0.57 | 0.67 | 0.60 | 0.55 | 0.67 | 0.76 |
MetRef | 0.02 | 0.03 | 0.02 | 0.02 | 0.02 | 0.37 | 0.39 | 0.05 | 0.02 | 0.39 | 0.52 |
USPS | 0.00 | 0.04 | 0.00 | 0.00 | 0.00 | 0.28 | 0.00 | 0.23 | 0.00 | 0.00 | 0.27 |
Pendigits | 0.00 | 0.57 | 0.45 | 0.45 | 0.45 | 0.58 | 0.60 | 0.64 | 0.45 | 0.60 | 0.70 |
COIL100 | 0.08 | 0.42 | 0.09 | 0.09 | 0.09 | 0.37 | 0.52 | 0.64 | 0.09 | 0.52 | 0.70 |
Dataset | S-Link | A-Link | PHA-1 | PHA-2 | QuickShift | RL-1 | RL-2 | DLORE_DP | ND-C(1) | ND-C(2) | ND-C(3) |
---|---|---|---|---|---|---|---|---|---|---|---|
2G_e | 0.00 | 0.00 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 |
3G | 0.57 | 0.56 | 0.98 | 0.98 | 0.98 | 0.98 | 0.98 | 0.98 | 0.98 | 0.98 | 0.98 |
AGG | 0.80 | 0.99 | 1.00 | 1.00 | 1.00 | 0.71 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
Flame | 0.01 | 0.44 | 0.43 | 1.00 | 1.00 | 0.33 | 1.00 | 0.95 | 1.00 | 1.00 | 1.00 |
Spiral | 1.00 | 0.00 | 1.00 | 1.00 | 1.00 | 0.68 | 1.00 | 0.63 | 1.00 | 1.00 | 1.00 |
2G_u | 0.00 | 0.74 | 0.99 | 0.99 | 0.99 | −0.06 | 0.99 | 0.92 | 0.99 | 0.99 | 0.99 |
S1 | 0.46 | 0.98 | 0.99 | 0.99 | 0.99 | 0.91 | 0.99 | 0.82 | 0.99 | 0.99 | 0.99 |
S2 | 0.00 | 0.91 | 0.91 | 0.93 | 0.93 | 0.85 | 0.93 | 0.30 | 0.93 | 0.93 | 0.93 |
UB | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.60 | 1.00 | 0.71 | 1.00 | 1.00 | 1.00 |
R15 | 0.54 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 |
Banknote | 0.00 | 0.10 | 0.01 | 0.13 | 0.13 | −0.01 | 0.96 | 0.03 | 0.13 | 0.96 | 0.96 |
OrlFace | 0.04 | 0.41 | 0.14 | 0.14 | 0.14 | 0.38 | 0.53 | 0.42 | 0.14 | 0.53 | 0.67 |
Coil20 | 0.14 | 0.22 | 0.19 | 0.22 | 0.22 | 0.18 | 0.46 | 0.12 | 0.22 | 0.46 | 0.43 |
Mfeat | 0.00 | 0.40 | 0.44 | 0.41 | 0.41 | 0.40 | 0.55 | 0.47 | 0.41 | 0.55 | 0.63 |
MetRef | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.13 | 0.16 | 0.02 | 0.00 | 0.16 | 0.29 |
USPS | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.14 | 0.00 | 0.06 | 0.00 | 0.00 | 0.12 |
Pendigits | 0.00 | 0.41 | 0.33 | 0.33 | 0.33 | 0.40 | 0.47 | 0.49 | 0.33 | 0.47 | 0.51 |
COIL100 | 0.00 | 0.05 | 0.00 | 0.00 | 0.00 | 0.07 | 0.08 | 0.11 | 0.00 | 0.08 | 0.15 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Qiu, T.; Li, Y. Nearest Descent, In-Tree, and Clustering. Mathematics 2022, 10, 764. https://doi.org/10.3390/math10050764