A Primer on Seq2Seq Models for Generative Chatbots

Bibliographic Details
Published in: ACM Computing Surveys, Vol. 56, No. 3, pp. 1–58
Main Authors: Scotti, Vincenzo; Sbattella, Licia; Tedesco, Roberto
Format: Journal Article
Language: English
Published: New York, NY: Association for Computing Machinery (ACM), 31.03.2024
ISSN: 0360-0300
1557-7341
DOI: 10.1145/3604281

Summary: The recent spread of Deep Learning-based solutions for Artificial Intelligence and the development of Large Language Models have significantly advanced the field of Natural Language Processing (NLP). The approach has evolved rapidly over the last ten years, deeply affecting NLP, from low-level text pre-processing tasks (such as tokenisation or POS tagging) to high-level, complex applications like machine translation and chatbots. This article examines recent trends in the development of open-domain, data-driven generative chatbots, focusing on Seq2Seq architectures. Such architectures are compatible with multiple learning approaches, ranging from supervised to reinforcement learning, and in recent years have enabled very engaging open-domain chatbots. Not only do these architectures allow a model to directly output the next turn in a conversation but, to some extent, they also allow control over the style or content of the response. To offer a complete view of the subject, we examine possible architecture implementations as well as training and evaluation approaches. Additionally, we provide information about openly available corpora for training and evaluating such models, and about current and past chatbot competitions. Finally, we present some insights on possible future directions, given the current state of research.
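The abstract describes Seq2Seq chatbots that directly output the next turn in a conversation. The generation side of such a model is an autoregressive decoding loop: given the encoded input turn, the decoder emits one token at a time until an end-of-sequence marker. A minimal, purely illustrative sketch of greedy decoding follows; the `toy_next_token_scores` function is a hypothetical hand-written stand-in for a trained encoder-decoder, not part of any real model.

```python
# Illustrative sketch of Seq2Seq-style greedy decoding for a chatbot turn.
# The "model" here is a hypothetical toy scorer, not a trained network.

def toy_next_token_scores(context, prefix):
    """Hypothetical stand-in for a decoder: scores each vocabulary token
    given the encoded input turn (context) and the tokens emitted so far."""
    if not prefix:
        return {"hello": 1.0, "there": 0.1, "<eos>": 0.0}
    if prefix[-1] == "hello":
        return {"hello": 0.0, "there": 1.0, "<eos>": 0.1}
    return {"hello": 0.0, "there": 0.0, "<eos>": 1.0}

def greedy_decode(context, max_len=10):
    """Autoregressive greedy decoding: at each step, append the
    highest-scoring token until <eos> or the length limit."""
    prefix = []
    for _ in range(max_len):
        scores = toy_next_token_scores(context, prefix)
        token = max(scores, key=scores.get)
        if token == "<eos>":
            break
        prefix.append(token)
    return prefix

print(greedy_decode(["hi"]))  # ['hello', 'there']
```

In a real Seq2Seq chatbot the scorer would be a neural decoder conditioned on the encoder's representation of the dialogue history, and greedy selection is often replaced by beam search or sampling to improve response diversity.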