Research on the Generation of Stage Summaries for News Special Topics

Bibliographic Details
Published in 计算机应用研究 (Application Research of Computers) Vol. 33; no. 4; pp. 973-978
Main Authors 尤建清 (You Jianqing), 张仰森 (Zhang Yangsen)
Format Journal Article
Language Chinese
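Published Computing Center, Beijing Information Science and Technology University, Beijing 100192; Institute of Intelligent Information Processing, Beijing Information Science and Technology University, Beijing 100192; 2016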
ISSN 1001-3695
DOI 10.3969/j.issn.1001-3695.2016.04.003

Summary: A stage summary of a news special topic offers useful reference for following the topic's dynamic evolution and outlining its development, and it partly offsets the difficulty of reading a topic that has accumulated too many reports. Taking the "Missing Malaysia Airlines flight MH370" special topic as the object of study, this paper discusses an algorithm for generating stage summaries of news special topics. First, subject extraction is applied to each news document, converting the stage document set into its corresponding subject set. Next, topic detection and tracking techniques are used to cluster the subject set bidirectionally along the time stream, and the intersections of the forward and reverse results are clustered again. Finally, the subjects of the documents selected according to the topic clustering results are assembled into the stage summary of the special topic. Comparative experiments show that the method achieves relatively good ROUGE recall.
Bibliography:51-1196/TP
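The summary reports its result as ROUGE recall. As a reminder of what that metric measures, here is a minimal Python sketch of ROUGE-N recall: the number of reference n-grams that also occur in the generated summary (with clipped counts), divided by the total number of reference n-grams. It is not the authors' evaluation code. The function names, the whitespace tokenization, and the two toy sentences are assumptions made for illustration only; Chinese text would need character-level or word-segmented tokens instead.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, references, n=1):
    """ROUGE-N recall: clipped n-gram matches over total reference n-grams.

    candidate  -- token list of the generated summary
    references -- list of token lists, one per reference summary
    """
    cand_counts = ngrams(candidate, n)
    matched = 0
    total = 0
    for ref in references:
        ref_counts = ngrams(ref, n)
        total += sum(ref_counts.values())
        # Each reference n-gram is matched at most as often as it
        # appears in the candidate (clipping).
        matched += sum(min(count, cand_counts[gram])
                       for gram, count in ref_counts.items())
    return matched / total if total else 0.0


# Toy data, invented for this example (not taken from the paper).
reference = "rescue teams expanded the search area on the third day".split()
candidate = "teams expanded the search area".split()

print(rouge_n_recall(candidate, [reference], n=1))  # 0.5
print(rouge_n_recall(candidate, [reference], n=2))
```

Because the denominator counts reference n-grams only, a longer generated summary can only raise or hold this score, which is why recall is usually read together with a length limit on the summary.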