Benchmarking in Cluster Analysis: A Study on Spectral Clustering, DBSCAN, and K-Means

Publication Date

1-1-2021

Document Type

Conference Proceeding

Publication Title

Studies in Classification, Data Analysis, and Knowledge Organization

Volume

5

DOI

10.1007/978-3-030-60104-1_20

First Page

175

Last Page

185

Abstract

We perform a benchmarking study to identify the advantages and drawbacks of Spectral Clustering and Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and we compare both methods with classic K-means clustering. The methods are applied to five simulated and three real data sets. The resulting clusterings are compared using external and internal indices, as well as run times. Although no single method performs best on all types of data sets, we find that DBSCAN should generally be reserved for non-convex data with well-separated clusters or for data with many outliers. Spectral Clustering has the better overall performance, but its results are less stable than those of K-means and its run time is longer.
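
As an illustration of what an external index measures, the sketch below computes the Adjusted Rand Index (ARI), which scores the agreement between a clustering and known reference labels. The abstract does not say which indices the study uses, so the choice of ARI is an assumption, and the function name `adjusted_rand_index` is hypothetical. It is a minimal standard-library sketch, not the authors' implementation.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same points.

    Illustrative only: ARI is assumed here as an example of an external
    index; the source does not name the indices it uses.
    Any label value (including a noise label such as -1) is treated
    as an ordinary cluster.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Contingency counts: how many points share each (true, predicted) pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of point pairs placed together in both, in true, and in predicted.
    index = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = sum_true * sum_pred / comb(n, 2)  # agreement expected by chance
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are one cluster
    return (index - expected) / (max_index - expected)


# Same partition under renamed labels: perfect agreement.
same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Every true cluster split evenly across predicted clusters: below chance.
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Because the index depends only on which points are grouped together, it is unaffected by how each method numbers its clusters, which is what allows the output of different algorithms to be scored against the same reference labels.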

Keywords

DBSCAN, K-means, Spectral clustering

Department

Mathematics and Statistics
