GREEK-BERT: The Greeks visiting Sesame Street
| Published in | arXiv.org |
|---|---|
| Main Authors | , , , |
| Format | Paper; Journal Article |
| Language | English |
| Published | Ithaca: Cornell University Library, arXiv.org, 03.09.2020 |
| ISSN | 2331-8422 |
| DOI | 10.48550/arxiv.2008.12014 |
Summary: Transformer-based language models, such as BERT and its variants, have achieved state-of-the-art performance in several downstream natural language processing (NLP) tasks on generic benchmark datasets (e.g., GLUE, SQuAD, RACE). However, these models have mostly been applied to the resource-rich English language. In this paper, we present GREEK-BERT, a monolingual BERT-based language model for modern Greek. We evaluate its performance in three NLP tasks, i.e., part-of-speech tagging, named entity recognition, and natural language inference, obtaining state-of-the-art performance. Interestingly, in two of the benchmarks GREEK-BERT outperforms two multilingual Transformer-based models (M-BERT, XLM-R), as well as shallower neural baselines operating on pre-trained word embeddings, by a large margin (5%-10%). Most importantly, we make both GREEK-BERT and our training code publicly available, along with code illustrating how GREEK-BERT can be fine-tuned for downstream NLP tasks. We expect these resources to boost NLP research and applications for modern Greek.
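The summary notes that GREEK-BERT and fine-tuning code are publicly available. As a rough illustration of what such a fine-tuning setup could look like, the sketch below loads the model for one of the evaluated tasks (natural language inference) via the Hugging Face transformers library. The Hub model id nlpaueb/bert-base-greek-uncased-v1 and the three-label NLI head are assumptions made for illustration, not details stated in this record.

```python
# Minimal sketch: loading GREEK-BERT for a sequence-pair task (NLI).
# ASSUMPTION: the model is published on the Hugging Face Hub as
# "nlpaueb/bert-base-greek-uncased-v1"; this id is not given in the record.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_ID = "nlpaueb/bert-base-greek-uncased-v1"  # assumed Hub id

# Pre-trained tokenizer and encoder with a fresh 3-way classification head
# (entailment / neutral / contradiction), as in a typical NLI setup.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID, num_labels=3)

# Encode a Greek premise/hypothesis pair as a single sequence-pair input.
premise = "Ο σκύλος τρέχει στο πάρκο."    # "The dog runs in the park."
hypothesis = "Ένα ζώο βρίσκεται έξω."     # "An animal is outside."
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)

# One forward pass yields logits over the three labels; actual fine-tuning
# would wrap this in a standard training loop, e.g. transformers.Trainer.
outputs = model(**inputs)
print(outputs.logits.shape)  # torch.Size([1, 3])
```

The same pattern would carry over to the other two evaluated tasks: swapping in AutoModelForTokenClassification gives a comparable starting point for part-of-speech tagging or named entity recognition.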