K-means clustering algorithm and Python implementation

Bibliographic Details
Published in	2021 IEEE International Conference on Computer Science, Artificial Intelligence and Electronic Engineering (CSAIEE), pp. 55-59
Main Author	Wu, BoKai
Format	Conference Proceeding
Language	English
Published	IEEE 20.08.2021
Subjects	API; Clustering algorithms; Codes; Data analysis; Java; K-means algorithm; Machine learning; Machine learning algorithms; Social networking (online)
DOI	10.1109/CSAIEE54046.2021.9543260

Abstract	K-means is a commonly used unsupervised learning algorithm in machine learning, regularly applied to data clustering. Only the number of clusters needs to be specified; the algorithm then automatically groups the data into that many categories, so that the similarity between data in the same cluster is high and the similarity between data in different clusters is low. K-means is a typical distance-based clustering algorithm: it takes distance as the evaluation index of similarity, that is, the closer two objects are, the greater their similarity. Clustering is also widely used in practical applications such as market segmentation, social network analysis, organizing computing clusters, and astronomical data analysis. This paper is my own attempt to build K-means code and an API, using Python and Java together in one project. Python is mainly used to write the framework of the core K-means algorithm, and Java to create the experimental data. In this research report, I describe the simple data model provided by K-means, as well as the design and implementation of K-means.
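
The distance-based clustering summarized in the abstract can be illustrated with a minimal Python sketch. This is not the paper's code, which is not reproduced in this record. It assumes the standard Lloyd iteration (assign each point to its nearest centroid, then move each centroid to the mean of its members), squared Euclidean distance as the similarity index, and initial centroids drawn at random from the data. The names `kmeans` and `squared_distance` and the parameters `max_iter` and `seed` are illustrative choices, not taken from the paper.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster `points` into `k` groups; returns (centroids, labels).

    Assumption: Lloyd's iteration with random initial centroids.
    """
    rng = random.Random(seed)
    # Initialise the centroids with k distinct data points.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)

    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )

    return centroids, labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centers, assignment = kmeans(data, k=2)
    print(centers)
    print(assignment)
```

Only `k` has to be supplied by the caller, which matches the abstract's point that the number of clusters is the single required input. The result can depend on the random initialization, so practical implementations often run several restarts and keep the best one.
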
Author	Wu, BoKai
Author_xml	– sequence: 1 givenname: BoKai surname: Wu fullname: Wu, BoKai email: 204911@student.upm.edu.my organization: University Putra,Department of computer science and information technology,Malaysia
ContentType	Conference Proceeding
DBID	6IE 6IL CBEJK RIE RIL
DOI	10.1109/CSAIEE54046.2021.9543260
DatabaseName	IEEE Electronic Library (IEL) Conference Proceedings IEEE Xplore POP ALL IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml	– sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher
DeliveryMethod	fulltext_linktorsrc
EISBN	1665422041 9781665422048
EndPage	59
ExternalDocumentID	9543260
Genre	orig-research
GroupedDBID	6IE 6IL CBEJK RIE RIL
ID	FETCH-LOGICAL-i118t-9b58bbf47e11c9b86a067429e42208cbed2ebe09d71a17e6b926379f5dfeeabf3
IEDL.DBID	RIE
IngestDate	Thu Jun 29 18:37:34 EDT 2023
IsPeerReviewed	false
IsScholarly	false
Language	English
LinkModel	DirectLink
MergedId	FETCHMERGED-LOGICAL-i118t-9b58bbf47e11c9b86a067429e42208cbed2ebe09d71a17e6b926379f5dfeeabf3
PageCount	5
ParticipantIDs	ieee_primary_9543260
PublicationCentury	2000
PublicationDate	2021-Aug.-20
PublicationDateYYYYMMDD	2021-08-20
PublicationDate_xml	– month: 08 year: 2021 text: 2021-Aug.-20 day: 20
PublicationDecade	2020
PublicationTitle	2021 IEEE International Conference on Computer Science, Artificial Intelligence and Electronic Engineering (CSAIEE)
PublicationTitleAbbrev	CSAIEE
PublicationYear	2021
Publisher	IEEE
Publisher_xml	– name: IEEE
Score	1.7706246
SourceID	ieee
SourceType	Publisher
StartPage	55
URI	https://ieeexplore.ieee.org/document/9543260
hasFullText	1
inHoldings	1
isFullTextHit
isPrint