k-ATTRACTORS: A PARTITIONAL CLUSTERING ALGORITHM FOR NUMERIC DATA ANALYSIS
Clustering is a data analysis technique, particularly useful when there are many dimensions and little prior information about the data. Partitional clustering algorithms are efficient but suffer from sensitivity to the initial partition and noise. We propose here k-attractors, a partitional cluster...
| Published in | Applied artificial intelligence Vol. 25; no. 2; pp. 97 - 115 |
|---|---|
| Main Authors | Kanellopoulos, Y.; Antonellis, P.; Tjortjis, C.; Makris, C.; Tsirakis, N. |
| Format | Journal Article |
| Language | English |
| Published | Philadelphia: Taylor & Francis Group (Taylor & Francis Ltd), 28.02.2011 |
| ISSN | 0883-9514 1087-6545 |
| DOI | 10.1080/08839514.2011.534590 |
| Abstract | Clustering is a data analysis technique, particularly useful when there are many dimensions and little prior information about the data. Partitional clustering algorithms are efficient but suffer from sensitivity to the initial partition and to noise. We propose here k-attractors, a partitional clustering algorithm tailored to numeric data analysis. As a preprocessing (initialization) step, it uses maximal frequent item-set discovery and partitioning to define the number of clusters k and the initial cluster "attractors." During its main phase the algorithm uses a distance measure, which is adapted with high precision to the way initial attractors are determined. We applied k-attractors as well as the k-means, EM, and FarthestFirst clustering algorithms to several datasets and compared the results. The comparison favored k-attractors in terms of convergence speed and cluster formation quality in most cases: it outperformed these three algorithms except on datasets with very small cardinality that contain only a few frequent item sets. On the downside, its initialization phase adds an overhead that can be deemed acceptable only when it contributes significantly to the algorithm's accuracy. |
|---|---|
| Author | Tjortjis, C. Antonellis, P. Tsirakis, N. Kanellopoulos, Y. Makris, C. |
| Author_xml | 1. Kanellopoulos, Y. (Software Improvement Group; y.kanellopoulos@sig.eu); 2. Antonellis, P. (University of Patras); 3. Tjortjis, C. (University of Ioannina and University of Western Macedonia); 4. Makris, C. (University of Patras); 5. Tsirakis, N. (University of Patras) |
| ContentType | Journal Article |
| Copyright | Copyright Taylor & Francis Group, LLC 2011 Copyright Taylor & Francis Ltd. Feb 2011 |
| Discipline | Computer Science |
| EISSN | 1087-6545 |
| EndPage | 115 |
| ISSN | 0883-9514 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 2 |
| Language | English |
| PageCount | 19 |
| PublicationCentury | 2000 |
| PublicationDate | 2/28/2011 |
| PublicationDateYYYYMMDD | 2011-02-28 |
| PublicationDecade | 2010 |
| PublicationPlace | Philadelphia |
| PublicationTitle | Applied artificial intelligence |
| PublicationYear | 2011 |
| Publisher | Taylor & Francis Group Taylor & Francis Ltd |
| StartPage | 97 |
| SubjectTerms | Acceptability; Accuracy; Algorithms; Artificial intelligence; Cluster analysis; Clustering; Clusters; Convergence; Data analysis; Data processing; Expert systems; Mathematical models; Numerical analysis; Preprocessing |
| Title | k-ATTRACTORS: A PARTITIONAL CLUSTERING ALGORITHM FOR NUMERIC DATA ANALYSIS |
| URI | https://www.tandfonline.com/doi/abs/10.1080/08839514.2011.534590 https://www.proquest.com/docview/859354194 https://www.proquest.com/docview/869799409 |
| Volume | 25 |
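
The abstract's remark that partitional algorithms are sensitive to the initial partition can be illustrated with plain k-means, one of the baselines named above. The sketch below is not k-attractors and is not taken from the article; the function names `kmeans` and `sse`, the four sample points, and the two sets of starting centroids are all assumptions chosen for illustration. It runs a Lloyd-style k-means twice on the same data from different initial centroids and compares the resulting sums of squared errors.

```python
from math import dist


def kmeans(points, centroids, max_iter=100):
    """Lloyd-style k-means started from explicitly given centroids (illustrative helper)."""
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return clusters, centroids


def sse(clusters, centroids):
    """Sum of squared distances from each point to its cluster centroid."""
    return sum(dist(p, c) ** 2 for cluster, c in zip(clusters, centroids) for p in cluster)


# Four corners of a wide rectangle: the natural split is left pair vs. right pair.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

good = kmeans(points, [(0.0, 0.0), (4.0, 0.0)])  # start near the left and right sides
bad = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])   # start between the pairs, split by height

print("SSE from side-based start:  ", sse(*good))
print("SSE from height-based start:", sse(*bad))
```

With these assumed inputs the second run settles on a bottom/top split whose error is much larger than the left/right split found by the first run, even though both runs have converged. An initialization step that chooses k and the starting points from the data, as the abstract describes for k-attractors, is aimed at avoiding this kind of outcome.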