Répétitions et variations des textes générés: Une analyse linguistique basée sur un corpus d’articles financiers rédigés en français


Bibliographic Details
Published in: Chimera (Madrid), Vol. 8, pp. 79-108
Main Author: De Cesare, Anna-Maria
Format: Journal Article
Language: English
Published: 21.04.2022
ISSN: 2386-2629
DOI: 10.15366/chimera2021.8.004

Summary: Texts automatically generated in French by commercial software have not yet been the subject of in-depth linguistic analysis, even though they are becoming increasingly common, especially in the media. The questions we are interested in concern their 'quality', in particular their repetitiveness and their specificity compared to non-generated texts. The paper is organized as follows: after defining the theoretical concepts needed to describe the specificities of generated texts and presenting our working corpus (composed of 100 articles produced by the CAC40 software in the field of finance), we show that the similarity of generated texts does not concern the lexicon alone, but can also be traced in less obvious properties, such as their macro- and micro-structuring and their information-structural properties. Our conclusion is that the high repetitiveness of texts generated by the CAC40 software is not in itself a problem, because it becomes obvious only when many texts are compared. Taken individually, each generated text shows sufficiently rich internal variation to read as natural.