Improving Skip-Gram Embeddings Using BERT
| Published in | IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 29, pp. 1318-1328 |
|---|---|
| Main Authors | , , |
| Format | Journal Article |
| Language | English |
| Published | Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 2021 |
| Subjects | |
| ISSN | 2329-9290, 2329-9304 |
| DOI | 10.1109/TASLP.2021.3065201 |
| Summary: | Contextualized embeddings such as BERT and GPT have been shown to give significant improvements in NLP tasks. On the other hand, static embeddings such as skip-gram and GloVe still have desirable characteristics such as low computational cost, easy deployment, and freedom from severe contextualized variation in representation. There have been recent attempts to enhance the skip-gram model by adding syntactic information about the context using graph convolutional networks (GCNs). We instead investigate the use of BERT embeddings for a stronger context representation, which contains not only syntactic and surface features but also rich knowledge from large-scale pre-training. Results show that BERT-enhanced skip-gram embeddings outperform GCN-enhanced embeddings on a range of tasks. Such embeddings also outperform recent efforts to distill BERT embeddings into context-independent vectors. |
|---|---|
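
The abstract describes training static skip-gram-style word vectors against BERT-based context representations instead of a second static context table. The sketch below is only a rough illustration of that idea, not the paper's released code or exact objective: the model name, context window, embedding dimensionality, subword pooling, and negative-sampling scheme are all assumptions made for the example.

```python
# Illustrative sketch (assumed setup, not the paper's implementation):
# static target embeddings are trained to score highly against frozen BERT
# contextual vectors of neighboring words, with one random negative per pair.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

STATIC_DIM = 300   # assumed dimensionality of the static embeddings
WINDOW = 2         # assumed skip-gram context window

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased").eval()  # frozen context encoder

sentences = [["the", "cat", "sat", "on", "the", "mat"],
             ["dogs", "chase", "cats", "in", "the", "park"]]

# Word-level vocabulary for the trainable static vectors.
vocab = {w: i for i, w in enumerate(sorted({w for s in sentences for w in s}))}

target_emb = nn.Embedding(len(vocab), STATIC_DIM)            # static word vectors
project = nn.Linear(bert.config.hidden_size, STATIC_DIM)     # BERT space -> static space
optim = torch.optim.Adam(list(target_emb.parameters()) + list(project.parameters()), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

def bert_word_vectors(words):
    """Average BERT sub-token vectors back into one contextual vector per word."""
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = bert(**enc).last_hidden_state[0]             # (num_subtokens, hidden)
    vecs = []
    for wi in range(len(words)):
        idx = [i for i, w in enumerate(enc.word_ids()) if w == wi]
        vecs.append(hidden[idx].mean(dim=0))
    return torch.stack(vecs)                                   # (num_words, hidden)

for words in sentences:
    ctx = project(bert_word_vectors(words))                    # contextual context vectors
    losses = []
    for t, word in enumerate(words):
        for c in range(max(0, t - WINDOW), min(len(words), t + WINDOW + 1)):
            if c == t:
                continue
            tgt = target_emb(torch.tensor(vocab[word]))         # static vector of target word
            pos = (tgt * ctx[c]).sum()                          # score for the true context
            neg_id = torch.randint(len(vocab), (1,))            # one random negative word
            neg = (target_emb(neg_id)[0] * ctx[c]).sum()        # score for the negative
            losses.append(loss_fn(torch.stack([pos, neg]), torch.tensor([1.0, 0.0])))
    loss = torch.stack(losses).mean()
    optim.zero_grad()
    loss.backward()
    optim.step()
```

After training, `target_emb` would hold context-independent vectors that can be exported and used like ordinary skip-gram embeddings, while BERT is only needed at training time; this is one plausible reading of the approach, under the assumptions stated above.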