Sparse Generative Topographic Mapping for Both Data Visualization and Clustering

Bibliographic Details
Published in Journal of Chemical Information and Modeling Vol. 58; no. 12; pp. 2528-2535
Main Author Kaneko, Hiromasa
Format Journal Article
Language English
Published United States: American Chemical Society (ACS), 24.12.2018
ISSN 1549-9596
1549-960X
DOI 10.1021/acs.jcim.8b00528

Summary: To achieve simultaneous data visualization and clustering, the method of sparse generative topographic mapping (SGTM) is developed by modifying the conventional GTM algorithm. While the weight of each grid point is constant in the original GTM, it becomes a variable in the proposed SGTM, enabling data points to be clustered on two-dimensional maps. The appropriate number of clusters is determined by optimization based on the Bayesian information criterion. Analysis of numerical simulation data sets along with quantitative structure–property relationship and quantitative structure–activity relationship data sets confirmed that the proposed SGTM provides the same degree of visualization performance as the original GTM and clusters data points appropriately. Python and MATLAB codes for the proposed algorithm are available at https://github.com/hkaneko1985/gtm-generativetopographicmapping.
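Note: The central change described in the summary, treating the weight of each grid point as a variable rather than a constant, can be illustrated with a minimal Python sketch. It is not taken from the author's repository. The function names `update_mixing_weights` and `bic`, the toy responsibility matrix, and the pruning threshold are assumptions for illustration, and the weight update shown is the standard mixture-model M-step, assumed here to be analogous to the SGTM update.

```python
import numpy as np


def update_mixing_weights(responsibilities):
    """Standard mixture-model M-step: the weight of each grid point is the
    mean responsibility it takes over all samples (assumed analogous to SGTM)."""
    return responsibilities.mean(axis=0)


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion; lower values are preferred."""
    return -2.0 * log_likelihood + n_params * np.log(n_samples)


# Toy responsibilities: 4 samples (rows) x 3 grid points (columns).
# These numbers are invented for illustration only.
R = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.1, 0.9, 0.0],
    [0.2, 0.8, 0.0],
])

# Original GTM: every grid point keeps the same constant weight.
constant_weights = np.full(R.shape[1], 1.0 / R.shape[1])

# Variable weights: grid points that explain no data shrink toward zero.
learned_weights = update_mixing_weights(R)

# Grid points whose weight stays above a (hypothetical) threshold act as clusters.
active = learned_weights > 1e-6

# Each sample is assigned to the grid point with the highest responsibility.
clusters = R.argmax(axis=1)
```

In this toy case the third grid point receives zero weight and drops out, leaving two clusters. Comparing `bic` values across fitted models with different numbers of clusters is one way to pick among them, in the spirit of the criterion named in the summary.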