Publication Date

3-1-2022

Document Type

Article

Publication Title

Stats

Volume

Issue

DOI

10.3390/stats5010001

First Page

Last Page

Abstract

Cluster analysis seeks to assign objects with similar characteristics to groups called clusters, so that objects within a group are similar to each other and dissimilar to objects in other groups. Spectral clustering has been shown to perform well in different scenarios on continuous data: it can detect convex, non-convex, and overlapping clusters. However, the restriction to continuous data can be limiting in real applications, where data are often of mixed type, i.e., they contain both continuous and categorical features. This paper looks at extending spectral clustering to mixed-type data. The new method replaces the Euclidean-based distance used in conventional spectral clustering with different dissimilarity measures for continuous and categorical variables. A global dissimilarity measure is then computed using a weighted sum, and a Gaussian kernel is used to convert the dissimilarity matrix into a similarity matrix. The new method includes automatic tuning of the variable weight and the kernel parameter. The performance of spectral clustering in different scenarios is compared with that of two state-of-the-art mixed-type data clustering methods, k-prototypes and KAMILA, using several simulated and real data sets.
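The pipeline described in the abstract (separate dissimilarities, weighted sum, Gaussian kernel, spectral step) can be illustrated with a minimal sketch. This is not the authors' implementation. The specific choices here are assumptions made for illustration: Euclidean distance on standardized continuous variables, simple matching for categorical variables, a fixed `weight` and `sigma` in place of the automatic tuning the paper describes, and a two-cluster split by the sign of the second eigenvector. The function names `mixed_similarity` and `spectral_bipartition` are hypothetical.

```python
import numpy as np


def mixed_similarity(x_cont, x_cat, weight=0.5, sigma=1.0):
    """Similarity matrix for mixed-type data (illustrative choices only)."""
    # Continuous part: Euclidean distance on standardized columns
    # (assumes no column is constant, so the standard deviation is nonzero).
    z = (x_cont - x_cont.mean(axis=0)) / x_cont.std(axis=0)
    d_cont = np.sqrt(((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=2))
    # Categorical part: simple matching, i.e. fraction of mismatched features.
    d_cat = (x_cat[:, None, :] != x_cat[None, :, :]).mean(axis=2)
    # Global dissimilarity as a weighted sum; the weight is fixed here,
    # whereas the paper tunes it automatically.
    d = weight * d_cont + (1.0 - weight) * d_cat
    # Gaussian kernel turns dissimilarities into similarities.
    return np.exp(-(d ** 2) / (2.0 * sigma ** 2))


def spectral_bipartition(s):
    """Split into two clusters using the normalized graph Laplacian."""
    deg = s.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalized Laplacian: I - D^{-1/2} S D^{-1/2}.
    lap = np.eye(len(s)) - d_inv_sqrt[:, None] * s * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, vecs = np.linalg.eigh(lap)
    # Rescale the second eigenvector and split on its sign.
    fiedler = vecs[:, 1] * d_inv_sqrt
    return (fiedler > 0).astype(int)


# Toy data: one continuous and one categorical feature, two clear groups.
x_cont = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
x_cat = np.array([["a"], ["a"], ["a"], ["b"], ["b"], ["b"]])

labels = spectral_bipartition(mixed_similarity(x_cont, x_cat, sigma=0.5))
print(labels)
```

For more than two clusters, a common approach is to run k-means on the rows of the first k eigenvectors instead of taking a sign split. Which of the two label values each group receives is arbitrary, since an eigenvector is only defined up to sign.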

Keywords

cluster analysis, mixed-type data, spectral clustering

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Department

Mathematics and Statistics

Recommended Citation

Felix Mbuga and Cristina Tortora. "Spectral Clustering of Mixed-Type Data" Stats (2022): 1-11. https://doi.org/10.3390/stats5010001


Faculty Research, Scholarly, and Creative Activity

Spectral Clustering of Mixed-Type Data
