Design of Text Summarization System Based on Finetuned-BERT and TF-IDF

Publication Date

7-2-2022

Document Type

Conference Proceeding

Publication Title

Signal and Information Processing, Networking and Computers: Proceedings of the 8th International Conference on Signal and Information Processing, Networking and Computers (ICSINC)

Editor

Jiande Sun, Yue Wang, Mengyao Huo, Lexi Xu

DOI

10.1007/978-981-19-3387-5_159

First Page

1330

Last Page

1338

Abstract

In today’s world, information is exploding. When searching for information, Internet readers often rely on abstracts to judge the readability of texts such as blogs and technical papers, yet many texts currently lack abstracts. Advances in NLP (Natural Language Processing) technology can help fill this gap. This paper proposes a text summarization system that has been tested on a corpus (including CLTS and some educational papers) and shows good results on the shorter samples. For texts without abstracts, or with poor original abstracts, the system offers two options, based on Finetuned-BERT and TF-IDF respectively. The two strategies differ in processing speed and semantic fit. In addition, to support more diverse scenarios, the system incorporates OCR (Optical Character Recognition) technology so that it can also summarize text contained in images.
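
The TF-IDF option mentioned in the abstract can be illustrated with a minimal extractive sketch. This record does not describe the paper's actual pipeline, so everything below is an assumption for illustration only: each sentence is treated as a "document", sentences are scored by the sum of their TF-IDF term weights, and the top-scoring sentences are returned in their original order. The function names (`split_sentences`, `tokenize`, `tfidf_summarize`) are hypothetical, and the regex tokenizer is English-oriented; Chinese text such as CLTS would need a proper word segmenter instead.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive sentence splitter (assumption): break after ., ! or ? plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercase alphanumeric tokens; a stand-in for a real tokenizer."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def tfidf_summarize(text, num_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    tokenized = [tokenize(s) for s in sentences]
    n = len(sentences)

    # Document frequency: number of sentences that contain each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        # Sum of (normalized term frequency) * (inverse document frequency).
        score = sum(
            (count / len(tokens)) * math.log(n / df[term])
            for term, count in tf.items()
        )
        scores.append(score)

    # Pick the top-scoring indices, then restore document order.
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:num_sentences]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    sample = (
        "BERT is a language model. "
        "TF-IDF weights rare terms highly. "
        "The weather is nice."
    )
    for sentence in tfidf_summarize(sample, num_sentences=2):
        print(sentence)
```

A sketch like this is fast because it needs no trained model, which is consistent with the abstract's remark that the two strategies differ in processing speed and semantic fit; a fine-tuned BERT model would be expected to trade speed for closer semantic fit.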

Keywords

Finetuned-BERT, NLP, TF-IDF

Department

Economics
