PSDVec: a Toolbox for Incremental and Scalable Word Embedding
PSDVec is a Python/Perl toolbox that learns word embeddings, i.e. the mapping of words in a natural language to continuous vectors which encode the semantic/syntactic regularities between the words. PSDVec implements a word embedding learning method based on a weighted low-rank positive semidefinite...
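
To give a rough feel for what a weighted low-rank positive semidefinite approximation is, the sketch below fits a factor `V` so that `V Vᵀ` approximates a symmetric target matrix `G` under per-entry weights `W`. It is only an illustration under stated assumptions: it uses plain gradient descent rather than PSDVec's own solver, and the names `weighted_loss` and `fit_psd_factor`, the step size, and the initialization are hypothetical choices made for this example.

```python
import random


def weighted_loss(G, W, V):
    """Return sum over i, j of W[i][j] * (G[i][j] - <V[i], V[j]>)**2."""
    n = len(G)
    total = 0.0
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            total += W[i][j] * (G[i][j] - dot) ** 2
    return total


def fit_psd_factor(G, W, rank, steps=500, lr=0.01, seed=0):
    """Fit V (n x rank) so that V V^T approximates G under weights W.

    Assumes G and W are symmetric. V V^T is positive semidefinite and has
    rank at most `rank` by construction. Hypothetical helper, not PSDVec's API.
    """
    rng = random.Random(seed)
    n = len(G)
    # Small random start; the all-zero matrix is a stationary point.
    V = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n)]
    for _ in range(steps):
        # Residuals R[i][j] = G[i][j] - <V[i], V[j]> at the current V.
        R = [[G[i][j] - sum(a * b for a, b in zip(V[i], V[j]))
              for j in range(n)] for i in range(n)]
        new_V = []
        for i in range(n):
            # Gradient of the loss w.r.t. V[i] is -4 * sum_j W[i][j] R[i][j] V[j].
            grad = [0.0] * rank
            for j in range(n):
                coeff = -4.0 * W[i][j] * R[i][j]
                for d in range(rank):
                    grad[d] += coeff * V[j][d]
            new_V.append([V[i][d] - lr * grad[d] for d in range(rank)])
        V = new_V
    return V
```

Each row of `V` plays the role of one word's embedding, and `G` stands in for a matrix of pairwise association scores; down-weighting an entry of `W` tells the fit to care less about that word pair.
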
| Published in | arXiv.org |
|---|---|
| Main Authors | Li, Shaohua; Zhu, Jun; Miao, Chunyan |
| Format | Paper Journal Article |
| Language | English |
| Published | Ithaca: Cornell University Library, arXiv.org, 10.06.2016 |
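| Subjects | Algorithms; Computer Science - Computation and Language; Distance learning; Embedding; Machine learning; Mapping; Natural language; Natural language processing; Perl |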
| ISSN | 2331-8422 |
| DOI | 10.48550/arxiv.1606.03192 |
| Abstract | PSDVec is a Python/Perl toolbox that learns word embeddings, i.e. the mapping of words in a natural language to continuous vectors which encode the semantic/syntactic regularities between the words. PSDVec implements a word embedding learning method based on a weighted low-rank positive semidefinite approximation. To scale up the learning process, we implement a blockwise online learning algorithm to learn the embeddings incrementally. This strategy greatly reduces the learning time of word embeddings on a large vocabulary, and can learn the embeddings of new words without re-learning the whole vocabulary. On 9 word similarity/analogy benchmark sets and 2 Natural Language Processing (NLP) tasks, PSDVec produces embeddings that have the best average performance among popular word embedding tools. PSDVec provides a new option for NLP practitioners. |
|---|---|
| Author | Li, Shaohua; Zhu, Jun; Miao, Chunyan |
| Author_xml | 1. Li, Shaohua (given name: Shaohua; surname: Li); 2. Zhu, Jun (given name: Jun; surname: Zhu); 3. Miao, Chunyan (given name: Chunyan; surname: Miao) |
| BackLink | View paper in arXiv: https://doi.org/10.48550/arXiv.1606.03192; view published paper (access to full text may be restricted): https://doi.org/10.1016/j.neucom.2016.05.093 |
| ContentType | Paper Journal Article |
| Copyright | 2016. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. http://arxiv.org/licenses/nonexclusive-distrib/1.0 |
| DOI | 10.48550/arxiv.1606.03192 |
| DatabaseName | ProQuest SciTech Collection ProQuest Technology Collection Materials Science & Engineering Collection (subscription) ProQuest Central (Alumni) ProQuest Central UK/Ireland ProQuest Central Essentials ProQuest Central Technology Collection ProQuest One ProQuest Central SciTech Premium Collection ProQuest Engineering Collection Engineering Database (subscription) ProQuest Central Premium ProQuest One Academic Publicly Available Content Database ProQuest One Academic Middle East (New) ProQuest One Academic Eastern Edition (DO NOT USE) ProQuest One Applied & Life Sciences ProQuest One Academic ProQuest One Academic UKI Edition ProQuest Central China Engineering collection arXiv Computer Science arXiv.org |
| DatabaseTitle | Publicly Available Content Database Engineering Database Technology Collection ProQuest One Academic Middle East (New) ProQuest Central Essentials ProQuest One Academic Eastern Edition ProQuest Central (Alumni Edition) SciTech Premium Collection ProQuest One Community College ProQuest Technology Collection ProQuest SciTech Collection ProQuest Central China ProQuest Central ProQuest One Applied & Life Sciences ProQuest Engineering Collection ProQuest One Academic UKI Edition ProQuest Central Korea Materials Science & Engineering Collection ProQuest Central (New) ProQuest One Academic ProQuest One Academic (New) Engineering Collection |
| DatabaseTitleList | Publicly Available Content Database |
| Database_xml | – sequence: 1 dbid: GOX name: arXiv.org url: http://arxiv.org/find sourceTypes: Open Access Repository – sequence: 2 dbid: 8FG name: ProQuest Technology Collection url: https://search.proquest.com/technologycollection1 sourceTypes: Aggregation Database |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Physics |
| EISSN | 2331-8422 |
| ExternalDocumentID | 1606_03192 |
| Genre | Working Paper/Pre-Print |
| IngestDate | Wed Jul 23 00:24:04 EDT 2025 Mon Jun 30 09:36:34 EDT 2025 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English |
| LinkModel | DirectLink |
| Notes | SourceType-Working Papers-1 ObjectType-Working Paper/Pre-Print-1 content type line 50 |
| OpenAccessLink | https://www.proquest.com/docview/2079771873?pq-origsite=%requestingapplication%&accountid=15518 |
| PQID | 2079771873 |
| PQPubID | 2050157 |
| ParticipantIDs | arxiv_primary_1606_03192 proquest_journals_2079771873 |
| PublicationCentury | 2000 |
| PublicationDate | 20160610 2016-06-10 |
| PublicationDateYYYYMMDD | 2016-06-10 |
| PublicationDate_xml | – month: 06 year: 2016 text: 20160610 day: 10 |
| PublicationDecade | 2010 |
| PublicationPlace | Ithaca |
| PublicationPlace_xml | – name: Ithaca |
| PublicationTitle | arXiv.org |
| PublicationYear | 2016 |
| Publisher | Cornell University Library, arXiv.org |
| Publisher_xml | – name: Cornell University Library, arXiv.org |
| SSID | ssj0002672553 |
| Score | 1.596738 |
| SecondaryResourceType | preprint |
| SourceID | arxiv proquest |
| SourceType | Open Access Repository Aggregation Database |
| SubjectTerms | Algorithms; Computer Science - Computation and Language; Distance learning; Embedding; Machine learning; Mapping; Natural language; Natural language processing; Perl |
| Title | PSDVec: a Toolbox for Incremental and Scalable Word Embedding |
| URI | https://www.proquest.com/docview/2079771873 https://arxiv.org/abs/1606.03192 |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=PSDVec%3A+a+Toolbox+for+Incremental+and+Scalable+Word+Embedding&rft.jtitle=arXiv.org&rft.au=Li%2C+Shaohua&rft.au=Zhu%2C+Jun&rft.au=Miao%2C+Chunyan&rft.date=2016-06-10&rft.pub=Cornell+University+Library%2C+arXiv.org&rft.eissn=2331-8422&rft_id=info:doi/10.48550%2Farxiv.1606.03192 |
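
The abstract's claim that new words can be embedded without re-learning the whole vocabulary can be pictured with a small sketch: keep the existing embeddings fixed and solve a weighted, optionally ridge-regularized, least-squares problem for the new word's vector alone. This is an assumption-laden illustration rather than PSDVec's blockwise online algorithm; `solve_linear`, `embed_new_word`, and the `ridge` parameter are hypothetical names introduced here.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def embed_new_word(core, targets, weights, ridge=0.0):
    """Find x minimizing sum_j w_j * (g_j - <x, v_j>)**2 + ridge * |x|**2.

    `core` holds the fixed embeddings v_j of already-learned words,
    `targets` the desired inner products g_j with the new word, and
    `weights` the per-pair confidences w_j. The core vectors never change.
    """
    k = len(core[0])
    # Normal equations: (sum_j w_j v_j v_j^T + ridge * I) x = sum_j w_j g_j v_j.
    A = [[ridge if a == b else 0.0 for b in range(k)] for a in range(k)]
    rhs = [0.0] * k
    for v, g, w in zip(core, targets, weights):
        for a in range(k):
            rhs[a] += w * g * v[a]
            for b in range(k):
                A[a][b] += w * v[a] * v[b]
    return solve_linear(A, rhs)
```

Because only a small system of size equal to the embedding dimension is solved per new word, the cost does not depend on re-fitting the words that were already embedded, which is the intuition behind incremental learning in this sketch.
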