Clustering Algorithms 2017

A special issue of Algorithms (ISSN 1999-4893).

Deadline for manuscript submissions: closed (31 July 2017) | Viewed by 18600

Special Issue Editor


Prof. Dr. Pasi Fränti
Guest Editor
School of Computing, University of Eastern Finland, 80101 Joensuu, Finland
Interests: clustering; machine learning; data mining; location-based applications

Special Issue Information

Dear Colleagues,

You are welcome to submit a paper that identifies a specific sub-problem of clustering and studies it experimentally, either with the validated benchmark datasets from http://cs.uef.fi/sipu/datasets/, which vary in the number of clusters (A-sets), cluster overlap (S-sets), dimensionality (Dim- and G2-sets), structure (Birch sets), and cluster size unbalance (Unbalance set), or with other generated benchmark datasets suitable for the selected objective function. The submitted paper should focus strictly on one selected sub-problem:

- Which objective function?
- Algorithms for solving a given objective function
- Algorithms for detecting outliers
- Algorithms for imputing missing data
- Cluster validation

Examples include an experimental study of the performance of several known objective functions within a general algorithmic framework, such as k-means or agglomerative merge-based algorithms, or a study of the performance of several algorithms using the same objective function. Papers presenting an overall clustering solution without any specific focus will not be considered.

Both new ideas and comparative studies are welcome, as long as the focus is on the specific sub-problem and they provide significant new findings. Application case studies are also welcome if the focus is on the algorithmic aspects: which objective function and which algorithm work for that application, which do not, and why.

To evaluate the results, it is recommended to use the following two measures (when applicable); a brief computation sketch is given after the list:

* Centroid Index at the cluster level [Pattern Recognition 47(9), 2014]
* Adjusted Rand Index at the object level [Pattern Recognition 45(6), 2012]
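
For reference, the sketch below shows how both measures can be computed in Python: the Adjusted Rand Index via scikit-learn's adjusted_rand_score, and the Centroid Index via a small helper written here as a straightforward reading of the nearest-neighbour mapping in the cited 2014 paper (it is an illustration, not the reference implementation).

```python
# Minimal evaluation sketch (assumes NumPy and scikit-learn are installed).
# adjusted_rand_score is the standard scikit-learn ARI; centroid_index is an
# illustrative reading of the cluster-level measure from Pattern Recognition
# 47(9), 2014 and should be checked against the original before use.
import numpy as np
from sklearn.metrics import adjusted_rand_score


def centroid_index(centroids_a, centroids_b):
    """Number of 'orphan' centroids: centroids in one solution that are not
    the nearest neighbour of any centroid in the other solution.
    CI = 0 means the two solutions share the same cluster-level structure."""
    def orphans(src, dst):
        # Map every centroid in src to its nearest centroid in dst.
        dists = np.linalg.norm(src[:, None, :] - dst[None, :, :], axis=2)
        mapped = set(np.argmin(dists, axis=1))
        # Centroids in dst that received no mapping are orphans.
        return len(dst) - len(mapped)

    a, b = np.asarray(centroids_a, float), np.asarray(centroids_b, float)
    return max(orphans(a, b), orphans(b, a))


# Object-level agreement between a clustering result and the ground truth.
labels_true = [0, 0, 1, 1, 2, 2]
labels_pred = [1, 1, 0, 0, 2, 2]
print("ARI:", adjusted_rand_score(labels_true, labels_pred))  # 1.0

# Cluster-level comparison of two centroid sets (2-D toy example, CI = 0).
print("CI:", centroid_index([[0, 0], [5, 5], [9, 1]],
                            [[0.2, 0.1], [5.1, 4.8], [8.7, 1.2]]))
```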

Prof. Dr. Pasi Fränti
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website; once registered, use the online submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Algorithms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Clustering algorithms
  • Clustering objective function
  • Outlier detection
  • Data imputation
  • Cluster validation

Published Papers (2 papers)

Research

21 pages, 5410 KiB  
Article
Common Nearest Neighbor Clustering—A Benchmark
by Oliver Lemke and Bettina G. Keller
Algorithms 2018, 11(2), 19; https://doi.org/10.3390/a11020019 - 9 Feb 2018
Cited by 15 | Viewed by 8652
Abstract
Cluster analyses are often conducted with the goal of characterizing an underlying probability density, for which the data-point density serves as an estimate. We here test and benchmark the common nearest neighbor (CNN) cluster algorithm. This algorithm assigns a spherical neighborhood R to each data point and estimates the data-point density between two data points as the number of data points N in the overlapping region of their neighborhoods (step 1). The main principle in the CNN cluster algorithm is cluster growing. This grows the clusters by sequentially adding data points and thereby effectively positions the border of the clusters along an iso-surface of the underlying probability density. This yields a strict partitioning with outliers, in which the clusters represent peaks in the underlying probability density, termed core sets (step 2). The removal of the outliers on the basis of a threshold criterion is optional (step 3). The benchmark datasets address a series of typical challenges, including datasets with a very high-dimensional state space and datasets in which the cluster centroids are aligned along an underlying structure (Birch sets). The performance of the CNN algorithm is evaluated with respect to these challenges. The results indicate that the CNN cluster algorithm can be useful in a wide range of settings. Cluster algorithms are particularly important for the analysis of molecular dynamics (MD) simulations. We demonstrate how the CNN cluster results can be used as a discretization of the molecular state space for the construction of a core-set model of the MD, improving the accuracy compared to conventional full-partitioning models. The software for the CNN clustering is available on GitHub.
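
A simplified Python sketch of steps 1 and 2 as described above (R-neighborhoods and cluster growing driven by the number of shared neighbors N) is given below; it is an illustration written for this page, not the authors' implementation available on GitHub.

```python
# Illustrative common-nearest-neighbor (CNN) clustering sketch; parameter
# names R (neighborhood radius) and N (minimum number of shared neighbors)
# follow the abstract above. Not the authors' reference implementation.
import numpy as np


def cnn_cluster(points, R, N):
    """Return a cluster label per point. Singleton clusters correspond to
    low-density points and can be discarded as outliers (optional step 3)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Step 1: spherical R-neighborhood of every data point.
    dists = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    neighbors = [set(np.flatnonzero((dists[i] <= R) & (np.arange(n) != i)))
                 for i in range(n)]
    labels = np.full(n, -1)
    current = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        labels[seed] = current
        stack = [seed]
        # Step 2: grow the cluster by adding every neighboring point that
        # shares at least N data points with a point already in the cluster.
        while stack:
            i = stack.pop()
            for j in neighbors[i]:
                if labels[j] == -1 and len(neighbors[i] & neighbors[j]) >= N:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels
```
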
(This article belongs to the Special Issue Clustering Algorithms 2017)

516 KiB  
Article
Comparison of Internal Clustering Validation Indices for Prototype-Based Clustering
by Joonas Hämäläinen, Susanne Jauhiainen and Tommi Kärkkäinen
Algorithms 2017, 10(3), 105; https://doi.org/10.3390/a10030105 - 6 Sep 2017
Cited by 79 | Viewed by 9392
Abstract
Clustering is an unsupervised machine learning and pattern recognition method. In general, in addition to revealing hidden groups of similar observations, i.e., clusters, their number needs to be determined. Internal clustering validation indices estimate this number without any external information. The purpose of this article is to evaluate, empirically, the characteristics of a representative set of internal clustering validation indices on many datasets. The prototype-based clustering framework includes multiple statistical estimates of cluster location, both classical and robust, so that the overall setting of the paper is novel. General observations on the quality of the validation indices and on the behavior of different variants of clustering algorithms are given.
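
As a minimal illustration of estimating the number of clusters with internal validation indices, the sketch below scores k-means partitions over a range of k using two standard scikit-learn indices (silhouette and Davies-Bouldin); these are generic examples and not the exact set of indices or location estimates compared in the article.

```python
# Sketch: pick the number of clusters k with internal validation indices
# (no external labels used). Assumes NumPy and scikit-learn are installed.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score, davies_bouldin_score

# Synthetic data with a known number of groups, for illustration only.
X, _ = make_blobs(n_samples=600, centers=4, cluster_std=0.8, random_state=0)

for k in range(2, 9):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    # Higher silhouette and lower Davies-Bouldin both indicate a better k;
    # on well-separated blobs both should favor the true k = 4.
    print(k,
          round(silhouette_score(X, labels), 3),
          round(davies_bouldin_score(X, labels), 3))
```
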
(This article belongs to the Special Issue Clustering Algorithms 2017)
