K-means clustering algorithm and Python implementation
K-means is a commonly used unsupervised learning algorithm in machine learning, regularly applied to data clustering. Only the number of clusters needs to be specified for it to automatically aggregate the data into multiple categories; the similarity between data in the...
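The description above amounts to two alternating steps: assign each point to its nearest cluster centre, then move each centre to the mean of its assigned points. The following is a minimal sketch in plain Python, not the paper's implementation (which this record does not show). The function name `kmeans`, the parameters `max_iter` and `seed`, the Euclidean distance, and the random choice of initial centroids are all assumptions made for illustration.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into `k` groups.

    Returns (centroids, labels). Illustrative sketch only.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct input points picked at random.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)

    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid
        # (smaller distance is treated as greater similarity).
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))

    return centroids, labels


# Usage: two visibly separate groups of 2-D points, clustered with k=2.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(points, k=2)
```

Only `k` is supplied by the caller; which group receives label 0 or 1 depends on the random initialisation, so downstream code should compare labels with each other rather than rely on their numeric values.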
| Published in | 2021 IEEE International Conference on Computer Science, Artificial Intelligence and Electronic Engineering (CSAIEE), pp. 55-59 |
|---|---|
| Main Author | Wu, BoKai |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 20.08.2021 |
| DOI | 10.1109/CSAIEE54046.2021.9543260 |
| Abstract | K-means is a commonly used unsupervised learning algorithm in machine learning, regularly applied to data clustering. Only the number of clusters needs to be specified; the algorithm then automatically groups the data into that many categories, so that the similarity between data in the same cluster is high and the similarity between data in different clusters is low. K-means is a typical distance-based clustering algorithm: it takes distance as the measure of similarity, so the closer two objects are, the greater their similarity. Clustering is widely used in practice, for example in market segmentation, social network analysis, the organization of computing clusters, and astronomical data analysis. This paper is my own attempt to build K-means code and an API, using Python and Java jointly in one project. Python is mainly used to write the framework of the core K-means algorithm, and Java to create the experimental data. In this research report, I describe the simple data model provided by K-means, as well as the design and implementation of K-means. |
| Author | Wu, BoKai | 
    
| Author affiliation | Wu, BoKai (204911@student.upm.edu.my), University Putra, Department of computer science and information technology, Malaysia |
    
| EISBN | 1665422041 9781665422048  | 
    
| EndPage | 59 | 
    
| Genre | orig-research | 
    
| IsPeerReviewed | false | 
    
| IsScholarly | false | 
    
| PageCount | 5 | 
    
| PublicationDate | 2021-Aug.-20 | 
    
| PublicationTitle | 2021 IEEE International Conference on Computer Science, Artificial Intelligence and Electronic Engineering (CSAIEE) | 
    
| PublicationTitleAbbrev | CSAIEE | 
    
| PublicationYear | 2021 | 
    
| Publisher | IEEE | 
    
| StartPage | 55 | 
    
| SubjectTerms | API; Clustering algorithms; Codes; Data analysis; Java; K-means algorithm; Machine learning; Machine learning algorithms; Social networking (online) |
    
| URI | https://ieeexplore.ieee.org/document/9543260 | 
    