Research on Comparative Element Extraction Based on Conditional Random Fields

Bibliographic Details
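Published in: 自动化学报 (Acta Automatica Sinica), Vol. 41, no. 8, pp. 1385-1393
Main Authors: 王巍, 赵铁军, 辛国栋, 徐永东
Format: Journal Article
Language: Chinese
Published: Machine Intelligence and Translation Laboratory, School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001; School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001; 2015
ISSN: 0254-4156, 1874-1029
DOI: 10.16383/j.aas.2015.c140762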

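Summary: As the volume of subjective, evaluative text keeps growing, sentiment analysis has attracted the attention of many researchers. Extracting comparative elements is one of the key tasks in the sentiment analysis of comparative sentences, because the sentiment result for a comparative sentence is more meaningful when it is tied to the comparative elements. To improve extraction performance, this paper introduces several linguistically motivated features into the system model: shallow syntactic information, comparative-word candidate information, and heuristic position information. Without adding any domain knowledge, these features effectively raise the system's precision and F1 score, and the proposed method can also handle sentences that contain multiple comparative relations. Experimental results show that applying the proposed features to a conditional random fields (CRFs) model improves every performance metric for comparative element extraction. The results were also compared with the best results of the 2012 Chinese sentiment analysis evaluation; every metric exceeds that best result, which further supports the effectiveness of the method.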
Bibliography: With the rapid growth of evaluative texts on the Web, sentiment analysis has attracted the attention of researchers all over the world. Extraction of comparative elements is one of the important tasks in the sentiment analysis of comparative sentences, and the results of sentiment analysis are more meaningful when combined with the comparative elements. To improve the performance of comparative element extraction, this paper proposes introducing shallow parsing features, comparative word candidates, and heuristic position information into conditional random fields (CRFs) when building the system model. The proposed method not only avoids introducing domain knowledge but can also effectively handle sentences containing multiple comparative relations. Experimental results show that system performance improves when the proposed features are introduced into the CRFs model. Meanwhile, compared with the best results of the 2012 Chinese opinion analysis evaluation, the F1-scores of the proposed method are highe
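
The abstract names three feature families (shallow parsing, comparative word candidates, heuristic position information) without showing how they might be encoded for a CRF. The following is a minimal sketch of one possible per-token encoding, not the authors' implementation. The candidate lexicon `COMPARATIVE_CANDIDATES`, the feature names, the "before/at/after nearest candidate" position heuristic, and the tagged example sentence are all assumptions made for illustration; this record does not specify any of them.

```python
from typing import Dict, List, Tuple

# Hypothetical lexicon of comparative-word candidates (an assumption;
# the paper's actual candidate list is not given in this record).
COMPARATIVE_CANDIDATES = {"比", "不如", "相比", "更"}

# One token = (word, POS tag, chunk tag from a shallow parser).
Token = Tuple[str, str, str]


def token_features(sentence: List[Token], i: int) -> Dict[str, str]:
    """Build a feature dictionary for token i of a tagged sentence."""
    word, pos, chunk = sentence[i]

    # Indices of every comparative-word candidate in the sentence.
    candidates = [j for j, (w, _, _) in enumerate(sentence)
                  if w in COMPARATIVE_CANDIDATES]

    # Assumed position heuristic: where the token sits relative to the
    # nearest candidate. The paper's real heuristic may differ.
    if not candidates:
        position = "none"
    else:
        nearest = min(candidates, key=lambda j: abs(j - i))
        if i < nearest:
            position = "before"
        elif i > nearest:
            position = "after"
        else:
            position = "at"

    return {
        "word": word,
        "pos": pos,
        "chunk": chunk,  # shallow-parsing feature
        "is_candidate": str(word in COMPARATIVE_CANDIDATES),
        "position": position,
    }


def sentence_features(sentence: List[Token]) -> List[Dict[str, str]]:
    """Feature dictionaries for every token, in sentence order."""
    return [token_features(sentence, i) for i in range(len(sentence))]


# Made-up example sentence with illustrative POS and chunk tags.
example: List[Token] = [
    ("诺基亚", "NR", "B-NP"),
    ("的", "DEG", "I-NP"),
    ("屏幕", "NN", "I-NP"),
    ("比", "P", "B-PP"),
    ("三星", "NR", "B-NP"),
    ("好", "VA", "B-VP"),
]

features = sentence_features(example)
```

A list of per-token dictionaries like `features` is a common input shape for CRF toolkits that accept named attributes, so a sequence of such lists, paired with label sequences marking the comparative elements, could serve as training data under these assumptions.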