Human evaluation of automatically generated text: Current trends and best practice guidelines

Bibliographic Details
Published in: Computer Speech & Language, Vol. 67, p. 101151
Main Authors: van der Lee, Chris; Gatt, Albert; van Miltenburg, Emiel; Krahmer, Emiel
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.05.2021
ISSN: 0885-2308, 1095-8363
DOI: 10.1016/j.csl.2020.101151

Summary:
• The current paper provides an overview of human evaluation practices in NLG.
• The current paper gives an overview of the steps necessary to undertake a human evaluation study.
• Building on findings from NLG, but also from statistics and the behavioral sciences, the current paper provides a set of recommendations and best practices for human evaluation in NLG.

Currently, there is little agreement as to how Natural Language Generation (NLG) systems should be evaluated, with a particularly high degree of variation in the way that human evaluation is carried out. This paper provides an overview of how (mostly intrinsic) human evaluation is currently conducted and presents a set of best practices, grounded in the literature. These best practices are also linked to the stages that researchers go through when conducting evaluation research (the planning stage, and the execution and release stage) and to the specific steps within these stages. With this paper, we hope to contribute to the quality and consistency of human evaluations in NLG.