DefSent+: Improving sentence embeddings of language models by projecting definition sentences into a quasi-isotropic or isotropic vector space of unlimited dictionary entries

Bibliographic Details
Main Author: Liu, Xiaodong
Format: Journal Article
Language: English
Published: 25.05.2024
Online Access: Get full text
DOI: 10.48550/arxiv.2405.16153


Abstract This paper presents a significant improvement on the previous conference paper known as DefSent. The prior study seeks to improve sentence embeddings of language models by projecting definition sentences into the vector space of dictionary entries. We discover that this approach has not been fully explored, owing to the methodological limitation of using the word embeddings of language models to represent dictionary entries. This leads to two hindrances. First, dictionary entries are constrained to the single-word vocabulary and thus cannot be fully exploited. Second, the semantic representations of language models are known to be anisotropic, but pre-processing the word embeddings for DefSent is not possible because their weights are frozen during training and tied to the prediction layer. In this paper, we propose a novel method that progressively builds entry embeddings free of these limitations. As a result, definition sentences can be projected into a quasi-isotropic or isotropic vector space of unlimited dictionary entries, so that sentence embeddings of noticeably better quality are attainable. We abbreviate our approach as DefSent+ (a plus version of DefSent); it offers the following strengths: 1) task performance on measuring sentence similarities is significantly improved compared to DefSent; 2) when DefSent+ is used to further train data-augmented models such as SimCSE, SNCSE, and SynCSE, state-of-the-art performance on measuring sentence similarities can be achieved among approaches that use no manually labeled datasets; 3) DefSent+ is also competitive in feature-based transfer for NLP downstream tasks.
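The abstract mentions two technical ingredients: scoring a definition-sentence embedding against a table of dictionary-entry embeddings, and moving an anisotropic embedding space toward an isotropic one. As an illustrative sketch only (not the paper's actual method; all names, data, and the choice of ZCA whitening here are assumptions), these ideas can be toyed with in a few lines of NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)

def whiten(X):
    """ZCA-style whitening: map embeddings toward an isotropic distribution
    (zero mean, identity covariance), one common remedy for anisotropy."""
    mu = X.mean(axis=0, keepdims=True)
    Xc = X - mu
    cov = Xc.T @ Xc / len(Xc)
    U, S, _ = np.linalg.svd(cov)
    W = U @ np.diag(1.0 / np.sqrt(S + 1e-8)) @ U.T
    return Xc @ W, mu, W

def predict_entry(def_emb, entry_embs):
    """Rank dictionary entries by cosine similarity to a definition
    embedding and return the index of the best-matching entry."""
    a = def_emb / np.linalg.norm(def_emb)
    B = entry_embs / np.linalg.norm(entry_embs, axis=1, keepdims=True)
    return int(np.argmax(B @ a))

# Toy anisotropic embeddings: one exaggerated dominant direction plus noise.
base = rng.normal(size=(50, 8))
base[:, 0] *= 10.0
white, _, _ = whiten(base)

# After whitening, the empirical covariance is close to the identity.
cov_after = white.T @ white / len(white)
print(np.allclose(cov_after, np.eye(8), atol=1e-4))  # → True
```

In this toy setup the whitening matrix is computed from the embeddings themselves; the paper's contribution is precisely that its entry embeddings are built so that no such frozen-weight post-processing is required.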
Copyright http://arxiv.org/licenses/nonexclusive-distrib/1.0
Open Access: yes
Open Access Link: https://arxiv.org/abs/2405.16153
Peer Reviewed: no
Resource Type: preprint
Source: arXiv (Open Access Repository)
Subjects: Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Learning