*3.1. DBSCAN*

The DBSCAN (density-based spatial clustering of applications with noise) [26] algorithm is the precursor of all density-based clustering algorithms. It was developed to process large datasets with the inherent presence of noise. DBSCAN is capable of discriminating the noise points of a dataset and can detect clusters of any shape with no previous information about the number of expected clusters. Shortly, DBSCAN leverages the concepts of *core points*, *density-reachability*, and *density-connectivity*. Given two parameters and *minPts*, a point is a *core point* if there are at least *minPts* points in its neighborhood of radius (-neighborhood). A point *p* is directly density-reachable from a point *q* if *q* is a core point and *p* is in *q*'s -neighborhood. Two points *p* and *q* are *density-reachable* if there exists a chain of directly density-reachable points that connect *q* and *p*. Finally, two points *p* and *q* are *density-connected* if there exists a core point *o* such that *p* and *q* are density-reachable from *o*.

DBSCAN builds a cluster of points by iteratively connecting a couple of points that are density-connected, and all the points that are density-reachable from a point of the cluster are in the same cluster. All points that do not belong to any cluster, and thus are not density-connected to any other point, are considered noise points. The DBSCAN can process a dataset of size *n* in *O*(*nlogn*) time if exploiting a proper indexing structure on the data for executing the search for the -neighborhood. It is worth noting that, given the definitions above, it is clear that DBSCAN can detect clusters having at least a specific predetermined density ( *minPts πr*<sup>2</sup> ), directly determined by the and *minPts* parameters. For such a reason, it can fail to detect clusters characterized by different densities.
