Extractive summarization using supervised and unsupervised learning

Bibliographic Details
Published in	Expert Systems with Applications, Vol. 133, pp. 173-181
Main Authors	Mao, Xiangke; Yang, Hui; Huang, Shaobin; Liu, Ye; Li, Rongsheng
Format	Journal Article
Language	English
Published	New York: Elsevier Ltd; Elsevier BV, 01.11.2019
Subjects	Biased-LexRank; Combination; Evaluation; Measurement methods; Methods; Summarization; Supervised learning; Unsupervised learning
ISSN	0957-4174, 1873-6793
DOI	10.1016/j.eswa.2019.05.011

Summary:	• We combine supervised and unsupervised learning for summarization.
• We summarize documents using statistical features and the relationships between sentences.
• We verify that prior knowledge can improve the summarization results.
• The combined results are better than those of a single supervised or unsupervised method.

This paper proposes three methods for extracting a single-document summary by combining supervised learning with unsupervised learning. All three measure sentence importance using both the statistical features of sentences and the relationships between sentences. The first method scores sentences separately with a supervised model and a graph model, then takes a linear combination of the two scores as the final sentence score. The second method feeds the graph model's score into the supervised model as an independent feature. The third method scores sentences with the supervised model, uses those scores as prior values for the nodes of the graph model, and then scores sentences with the resulting biased graph model. Evaluated with ROUGE on the DUC2001 and DUC2002 data sets, all three methods outperform summarizers that use only supervised or only unsupervised learning. The results also confirm that prior knowledge can improve the accuracy of key-sentence selection in the graph model.
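
The three combination strategies described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the damping value, the mixing weight `alpha`, the similarity matrix, and the supervised scores are all assumptions chosen for illustration. The biased walk here is a generic prior-weighted power iteration in the spirit of Biased-LexRank, and the supervised model is replaced by fixed scores.

```python
# Illustrative sketch only: names, weights and toy data are assumptions,
# not values or code from the paper.

def normalize(scores):
    """Scale non-negative scores to sum to 1 (uniform if they are all zero)."""
    total = sum(scores)
    if total == 0:
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


def biased_rank(similarity, prior, damping=0.85, max_iter=200, tol=1e-10):
    """Power iteration for a random walk over sentences, biased by a prior.

    similarity: matrix of non-negative sentence-to-sentence similarities.
    prior: one prior score per sentence; a uniform prior gives an unbiased walk.
    """
    n = len(similarity)
    prior = normalize(prior)
    # Row-normalize similarities into transition probabilities.
    transition = [normalize(row) for row in similarity]
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        new_rank = [
            (1 - damping) * prior[j]
            + damping * sum(rank[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:
            break
    return rank


def linear_combination(supervised, graph, alpha=0.5):
    """Method 1: weighted sum of normalized supervised and graph scores."""
    sup, gr = normalize(supervised), normalize(graph)
    return [alpha * s + (1 - alpha) * g for s, g in zip(sup, gr)]


def add_graph_feature(features, graph):
    """Method 2 input: append each sentence's graph score to its features."""
    return [list(f) + [g] for f, g in zip(features, graph)]


# Toy document with three sentences (all numbers are made up).
similarity = [
    [0.0, 0.5, 0.2],
    [0.5, 0.0, 0.4],
    [0.2, 0.4, 0.0],
]
supervised = [0.2, 0.3, 0.9]                     # stand-in for model outputs
features = [[0.1, 3.0], [0.5, 1.0], [0.9, 2.0]]  # hypothetical statistical features

unbiased = biased_rank(similarity, [1.0] * len(similarity))
combined = linear_combination(supervised, unbiased)   # method 1
augmented = add_graph_feature(features, unbiased)     # method 2 (model input)
biased = biased_rank(similarity, supervised)          # method 3
```

In this sketch the supervised scores enter the third method only through the teleport term, so a sentence that the supervised model favors gains rank even when its graph connectivity is modest.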