A Kernel-Learning Approach to Semi-supervised Clustering with Relative Distance Comparisons

We consider the problem of clustering a given dataset into k clusters subject to an additional set of constraints on relative distance comparisons between the data items. The additional constraints are meant to reflect side-information that is not directly expressed in the feature vectors. Relative...

Bibliographic Details
Published in Machine Learning and Knowledge Discovery in Databases, pp. 219-234
Main Authors Amid, Ehsan; Gionis, Aristides; Ukkonen, Antti
Format Book Chapter
Language English
Published Cham: Springer International Publishing, 2015
Series Lecture Notes in Computer Science
ISBN 3319235273
9783319235271
ISSN 0302-9743
1611-3349
DOI 10.1007/978-3-319-23528-8_14

Abstract We consider the problem of clustering a given dataset into k clusters subject to an additional set of constraints on relative distance comparisons between the data items. The additional constraints are meant to reflect side-information that is not directly expressed in the feature vectors. Relative comparisons can express structures at a finer level of detail than the must-link (ML) and cannot-link (CL) constraints that are commonly used for semi-supervised clustering. Relative comparisons are particularly useful in settings where giving an ML or a CL constraint is difficult because the granularity of the true clustering is unknown. Our main contribution is an efficient algorithm for learning a kernel matrix using the log determinant divergence (a variant of the Bregman divergence) subject to a set of relative distance constraints. Given the learned kernel matrix, a clustering can be obtained by any suitable algorithm, such as kernel k-means. We show empirically that kernels found by our algorithm yield clusterings of higher quality than existing approaches that either use ML/CL constraints or use a different means to implement the supervision using relative comparisons.
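The quantities named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it only evaluates the log determinant divergence in its standard form, D_ld(K, K0) = tr(K K0^{-1}) - log det(K K0^{-1}) - n, checks relative comparisons under the distance induced by a kernel matrix, and runs a plain kernel k-means on a given kernel. The triple convention (i, j, k), read as "item i is closer to item j than to item k", and all function names are assumptions made for illustration; the record does not state how the chapter encodes its constraints.

```python
import numpy as np


def logdet_divergence(K, K0):
    """Log determinant divergence between positive definite matrices K and K0."""
    n = K.shape[0]
    # K0^{-1} K has the same trace and determinant as K K0^{-1}.
    M = np.linalg.solve(K0, K)
    sign, logdet = np.linalg.slogdet(M)
    if sign <= 0:
        raise ValueError("K and K0 must both be positive definite")
    return np.trace(M) - logdet - n


def kernel_sq_dist(K, i, j):
    """Squared distance between items i and j in the feature space of K."""
    return K[i, i] + K[j, j] - 2.0 * K[i, j]


def violated_comparisons(K, triples):
    """Return the triples (i, j, k) for which d(i, j) < d(i, k) fails under K.

    The triple convention is an assumption of this sketch.
    """
    return [
        (i, j, k)
        for (i, j, k) in triples
        if kernel_sq_dist(K, i, j) >= kernel_sq_dist(K, i, k)
    ]


def kernel_kmeans(K, init_labels, n_iter=20):
    """Plain kernel k-means (Lloyd-style) starting from the given labels."""
    labels = np.asarray(init_labels).copy()
    n_clusters = int(labels.max()) + 1
    diag = np.diag(K)
    for _ in range(n_iter):
        dist = np.full((K.shape[0], n_clusters), np.inf)
        for c in range(n_clusters):
            mask = labels == c
            m = mask.sum()
            if m == 0:
                continue  # an empty cluster stays at infinite distance
            # ||phi(x_i) - mu_c||^2 expressed purely through kernel entries
            dist[:, c] = (
                diag
                - 2.0 * K[:, mask].sum(axis=1) / m
                + K[np.ix_(mask, mask)].sum() / m**2
            )
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```

In a pipeline of the kind the abstract describes, a learned kernel would take the place of the initial kernel before clustering; here `violated_comparisons` can be used to see which comparisons a candidate kernel still fails to satisfy, and `logdet_divergence` to see how far that kernel has moved from the initial one.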
Author Amid, Ehsan
Gionis, Aristides
Ukkonen, Antti
Author_xml – sequence: 1
  givenname: Ehsan
  surname: Amid
  fullname: Amid, Ehsan
  email: ehsan.amid@aalto.fi
– sequence: 2
  givenname: Aristides
  surname: Gionis
  fullname: Gionis, Aristides
– sequence: 3
  givenname: Antti
  surname: Ukkonen
  fullname: Ukkonen, Antti
ContentType Book Chapter
Copyright Springer International Publishing Switzerland 2015
Copyright_xml – notice: Springer International Publishing Switzerland 2015
DOI 10.1007/978-3-319-23528-8_14
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 9783319235288
3319235281
EISSN 1611-3349
Editor Santos Costa, Vítor
Appice, Annalisa
Rodrigues, Pedro Pereira
Soares, Carlos
Jorge, Alípio
Gama, João
Editor_xml – sequence: 1
  givenname: Annalisa
  surname: Appice
  fullname: Appice, Annalisa
  email: annalisa.appice@uniba.it
– sequence: 2
  givenname: Pedro Pereira
  surname: Rodrigues
  fullname: Rodrigues, Pedro Pereira
  email: pprodrigues@med.up.pt
– sequence: 3
  givenname: Vítor
  surname: Santos Costa
  fullname: Santos Costa, Vítor
  email: vsc@dcc.fc.up.pt
– sequence: 4
  givenname: Carlos
  surname: Soares
  fullname: Soares, Carlos
  email: csoares@fe.up.pt
– sequence: 5
  givenname: João
  surname: Gama
  fullname: Gama, João
  email: jgama@fep.up.pt
– sequence: 6
  givenname: Alípio
  surname: Jorge
  fullname: Jorge, Alípio
  email: amjorge@fc.up.pt
EndPage 234
ISBN 3319235273
9783319235271
ISSN 0302-9743
IngestDate Wed Sep 17 04:53:40 EDT 2025
IsPeerReviewed true
IsScholarly true
Language English
PageCount 16
ParticipantIDs springer_books_10_1007_978_3_319_23528_8_14
PublicationCentury 2000
PublicationDate 2015
PublicationDateYYYYMMDD 2015-01-01
PublicationDate_xml – year: 2015
  text: 2015
PublicationDecade 2010
PublicationPlace Cham
PublicationPlace_xml – name: Cham
PublicationSeriesSubtitle Lecture Notes in Artificial Intelligence
PublicationSeriesTitle Lecture Notes in Computer Science
PublicationSeriesTitleAlternate Lect.Notes Computer
PublicationSubtitle European Conference, ECML PKDD 2015, Porto, Portugal, September 7-11, 2015, Proceedings, Part I
PublicationTitle Machine Learning and Knowledge Discovery in Databases
PublicationYear 2015
Publisher Springer International Publishing
Publisher_xml – name: Springer International Publishing
RelatedPersons Tanaka, Yuzuru
Goebel, Randy
Wahlster, Wolfgang
RelatedPersons_xml – sequence: 1
  givenname: Randy
  surname: Goebel
  fullname: Goebel, Randy
– sequence: 2
  givenname: Yuzuru
  surname: Tanaka
  fullname: Tanaka, Yuzuru
– sequence: 3
  givenname: Wolfgang
  surname: Wahlster
  fullname: Wahlster, Wolfgang
SourceID springer
SourceType Publisher
StartPage 219
SubjectTerms Data Item
Kernel Matrix
Pairwise Constraint
Relative Comparison
Spectral Cluster
Title A Kernel-Learning Approach to Semi-supervised Clustering with Relative Distance Comparisons
URI http://link.springer.com/10.1007/978-3-319-23528-8_14