Fall 2019

Master's Project

Master of Science (MS)


Computer Science

Robert Chun

Katerina Potika

Manasi Thakur


Multi-document Text Summarization, LSA, Text-Rank, Lex-Rank, RBM


Text summarization is a long-studied topic in natural language processing, with various approaches proposed for both extractive and abstractive summarization. Summarizing a single document is a methodical task, but summarizing multiple documents poses a greater challenge. This thesis explores the application of Latent Semantic Analysis, Text-Rank, Lex-Rank, and Reduction algorithms to single-document text summarization. It compares them with a proposed hybrid approach that combines each of these algorithms, individually, with Restricted Boltzmann Machines for multi-document text summarization, and analyzes how all the approaches perform.