Design of Text Summarization System Based on Finetuned-BERT and TF-IDF
Publication Date
7-2-2022
Document Type
Conference Proceeding
Publication Title
Signal and Information Processing, Networking and Computers: Proceedings of the 8th International Conference on Signal and Information Processing, Networking and Computers (ICSINC)
Editor
Jiande Sun, Yue Wang, Mengyao Huo, Lexi Xu
DOI
10.1007/978-981-19-3387-5_159
First Page
1330
Last Page
1338
Abstract
Information is growing explosively. When searching for material such as blogs and technical papers, Internet readers judge the readability of a text from its abstract, yet much of the text currently available lacks one. Advances in NLP (Natural Language Processing) can help fill this gap. This paper proposes a text summarization system that was tested on a corpus (including CLTS and some educational papers) and performed well on the shorter samples. For texts that have no abstract or only a poor one, the system offers two options, based on Finetuned-BERT and TF-IDF respectively. The two strategies differ in processing speed and semantic fit. In addition, to cover more diverse scenarios, the system incorporates OCR (Optical Character Recognition) so that it can also summarize text contained in images.
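The abstract does not say how TF-IDF is applied, so the following is only a minimal illustrative sketch of one common approach, extractive summarization by TF-IDF sentence scoring, and not the authors' implementation. The function names (`split_sentences`, `tokenize`, `tfidf_summarize`), the regex-based sentence splitter, and the smoothed IDF formula are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased alphanumeric tokens; no stemming or stop-word removal.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def tfidf_summarize(text, num_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    tokenized = [tokenize(s) for s in sentences]
    n = len(sentences)

    # Document frequency: each sentence is treated as one "document".
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        # Sum of (normalized term frequency) * (smoothed IDF) over the sentence.
        score = sum(
            (count / len(tokens)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        )
        scores.append(score)

    # Pick the top-scoring sentence indices, then restore document order.
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:num_sentences]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    sample = (
        "Cats sleep a lot. Cats eat fish. "
        "Quantum computers factor integers quickly."
    )
    print(tfidf_summarize(sample, num_sentences=1))
```

A scorer of this kind needs no model inference, which is consistent with the abstract's remark that the two strategies differ in processing speed and semantic fit; a fine-tuned BERT model would typically be slower but more sensitive to meaning.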
Keywords
Finetuned-BERT, NLP, TF-IDF
Department
Economics
Recommended Citation
Yuyang Liu, Kevin Song, Jin He, Songlin Sun, and Rui Liu. "Design of Text Summarization System Based on Finetuned-BERT and TF-IDF" Signal and Information Processing, Networking and Computers: Proceedings of the 8th International Conference on Signal and Information Processing, Networking and Computers (ICSINC) (2022): 1330-1338. https://doi.org/10.1007/978-981-19-3387-5_159