Clustering and Visualising Documents using Word Embeddings

This lesson uses word embeddings and clustering algorithms in Python to identify groups of similar documents in a corpus of approximately 9,000 academic abstracts. It will teach you the basics of dimensionality reduction for extracting structure from a large corpus and how to evaluate your results.

Bibliographic Details
Published in	The Programming Historian Vol. 12; no. 12
Main Authors	Reades, Jonathan; Williams, Jennie
Format	Journal Article
Language	English
Published	ProgHist Ltd 09.08.2023 Editorial Board of the Programming Historian
Subjects	Algorithms; Automation; Bilingualism; Clustering; Documents; Institutional repositories; Keywords; Natural language processing; Sparsity
ISSN	2397-2068
DOI	10.46430/phen0111
