"Document Clustering" by David Anastasiu and Andrea Tagarelli

Faculty Publications

Title

Document Clustering

Authors

David Anastasiu, San Jose State University
Andrea Tagarelli

Document Type

Article

Publication Date

November 2017

Publication Title

Wiley StatsRef: Statistics Reference Online

DOI

10.1002/9781118445112.stat07973

Abstract

In a world flooded with information, document clustering is an important tool that can help categorize and extract insight from text collections. It works by grouping similar documents while simultaneously discriminating between groups. In this article, we provide a brief overview of the principal techniques used to cluster documents and introduce a series of novel deep-learning-based methods recently designed for the document clustering task. In our overview, we point the reader to salient works that can provide a deeper understanding of the topics discussed.

Comments

This is the peer reviewed version of the following article: Anastasiu, D. C. and Tagarelli, A. (2017). Document Clustering. In Wiley StatsRef: Statistics Reference Online (eds N. Balakrishnan, T. Colton, B. Everitt, W. Piegorsch, F. Ruggeri and J. L. Teugels), which has been published in final form at https://doi.org/10.1002/9781118445112.stat07973. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Use of Self-Archived Versions.

Recommended Citation

David Anastasiu and Andrea Tagarelli. "Document Clustering" Wiley StatsRef: Statistics Reference Online (2017): 1-11. https://doi.org/10.1002/9781118445112.stat07973

Included in

Computer Engineering Commons
