Network-Based Clustering and Embedding for High-Dimensional Data Visualization

Bibliographic Details
Published in 2013 International Conference on Computer-Aided Design and Computer Graphics, pp. 290-297
Main Authors Hengyuan Zhang, Xiaowu Chen
Format Conference Proceeding
Language English
Published IEEE 01.11.2013
DOI 10.1109/CADGraphics.2013.45

Abstract We present a novel method for visualizing a high-dimensional dataset as a landscape. The goal is to provide a clear and compact representation that reveals the structure of the dataset, so that the size and distinctiveness of clusters can be easily discerned and the relationships among individual points are preserved. Our method is network-based and consists of two main steps: clustering and embedding. First, a similarity graph of the high-dimensional dataset is constructed from the Euclidean distances between data points. For clustering, we propose a new network community detection algorithm that calculates the membership degree of each vertex in each community. For embedding, we put forward a practical algorithm that produces an evenly distributed and regularly shaped layout of data points while preserving the original relationships among individual points. Finally, the landscape-like visualization is produced by assigning altitudes to data points according to their membership degrees and by inserting control points. In the resulting visualization, clusters form highlands, and border data points between clusters appear as valleys. The area and altitude of a highland indicate the size and distinctiveness of the corresponding data cluster, respectively.
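The first step named in the abstract, building a similarity graph from Euclidean distances, might be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say how edges are chosen or weighted, so the k-nearest-neighbour rule, the weight 1 / (1 + distance), the function name `knn_similarity_graph`, and the toy points are all hypothetical and are not taken from the paper.

```python
import math


def knn_similarity_graph(points, k=2):
    """Build an undirected k-nearest-neighbour graph from Euclidean distances.

    Returns a dict mapping each vertex index to {neighbour: weight}.
    ASSUMPTION: the kNN rule and the weight 1 / (1 + distance) are
    illustrative choices; the paper's actual construction is not given
    in the abstract.
    """
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        # Distances from point i to every other point, nearest first.
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            w = 1.0 / (1.0 + d)
            # Store the edge in both directions so the graph is undirected.
            graph[i][j] = w
            graph[j][i] = w
    return graph


# Toy data (hypothetical): two well-separated groups of 3-D points.
points = [
    (0, 0, 0), (0, 1, 0), (1, 0, 0),
    (10, 10, 10), (10, 11, 10), (11, 10, 10),
]
graph = knn_similarity_graph(points, k=2)
```

On this toy input each group is connected only internally, which is the kind of structure a community detection step could then pick up. The later steps (membership degrees, layout, altitudes) are not sketched here because the abstract gives no detail on them.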
Author Hengyuan Zhang, zhanghy@vrlab.buaa.edu.cn, Sch. of Comput. Sci. & Eng., Beihang Univ., Beijing, China
Author Xiaowu Chen, chen@vrlab.buaa.edu.cn, Sch. of Comput. Sci. & Eng., Beihang Univ., Beijing, China
EISBN 1479925764
9781479925766
SubjectTerms Clustering algorithms
Communities
Data visualization
Embedding
Fuzzy clustering
High-dimensional data
Network
Partitioning algorithms
Shape
Social network services
Springs
Visualization
URI https://ieeexplore.ieee.org/document/6815008