A fast incremental spectral clustering algorithm with cosine similarity
Publication Date
1-1-2023
Document Type
Conference Proceeding
Publication Title
IEEE International Conference on Data Mining Workshops, ICDMW
DOI
10.1109/ICDMW60847.2023.00019
First Page
80
Last Page
88
Abstract
Spectral clustering is a popular and powerful clustering method, but it is known to face two significant challenges: scalability and out-of-sample extension. In this paper, we extend the work of Chen (ICPR 2018) on the speed scalability of spectral clustering in the setting of cosine similarity to deal with massive or online data that are too large to be fully loaded into computer memory. We start by drawing a small batch of data from the full set and develop an efficient procedure that approximately learns from the sample both the nonlinear embedding and the clustering map of spectral clustering with the cosine similarity. We then introduce an incremental approach that continuously refines them while sampling more batches of data. The combination of the two procedures leads to a streamlined, memory-efficient algorithm that uses only a small number of batches of data (as they become available), with memory and computational costs that are independent of the size of the data. The final nonlinear embedding and clustering rule can be easily applied to the rest of the data as they are gradually loaded. Experiments are conducted on benchmark data to demonstrate the fast speed and good accuracy of the proposed algorithm. We conclude the paper by pointing out several future research directions.
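The key observation exploited by this line of work is that, with cosine similarity, the affinity matrix is an inner product of row-normalized data, so the spectral embedding can be obtained from an SVD of the (degree-scaled) data matrix without ever forming the n-by-n affinity matrix. The sketch below illustrates that trick only, not the paper's incremental batching scheme; the function name and the nonnegative-data assumption (which keeps the degrees positive) are ours, not from the paper.

```python
import numpy as np

def cosine_spectral_embed(X, k):
    """Top-k spectral embedding for the cosine-similarity affinity
    W = Xn Xn^T, computed without forming W (assumes nonnegative X
    so that all degrees are positive)."""
    # Row-normalize so that inner products equal cosine similarities.
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    # Degrees d = W 1 = Xn (Xn^T 1): two matrix-vector products, no n x n matrix.
    d = Xn @ (Xn.T @ np.ones(X.shape[0]))
    # Left singular vectors of Xt = D^{-1/2} Xn are the eigenvectors of
    # the normalized affinity D^{-1/2} W D^{-1/2}.
    Xt = Xn / np.sqrt(d)[:, None]
    U, _, _ = np.linalg.svd(Xt, full_matrices=False)
    return U[:, :k]  # feed to k-means (after optional row-normalization)
```

For n points in d dimensions this costs O(n d^2) for the thin SVD rather than the O(n^2 d) needed to build the affinity matrix, which is what makes the cosine-similarity setting scalable in the first place.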
Keywords
Cosine similarity, Incremental learning, Memory scalability, Spectral clustering, Speed scalability
Department
Mathematics and Statistics
Recommended Citation
Ran Li and Guangliang Chen. "A fast incremental spectral clustering algorithm with cosine similarity" IEEE International Conference on Data Mining Workshops, ICDMW (2023): 80-88. https://doi.org/10.1109/ICDMW60847.2023.00019