Efficient and Privacy-Preserving k-Means Clustering for Big Data Mining

Bibliographic Details
Published in 2016 IEEE Trustcom/BigDataSE/ISPA, pp. 791-798
Main Authors Gheid, Zakaria; Challal, Yacine
Format Conference Proceeding
Language English
Published IEEE 01.08.2016
ISSN 2324-9013
DOI 10.1109/TrustCom.2016.0140

Abstract Recent advances in sensing and storage technologies have led to the big data age, in which huge amounts of data are distributed across sites to be stored and analysed. Cluster analysis is one of the data mining tasks that aim to discover patterns and knowledge through algorithmic techniques such as k-means. However, running k-means over distributed big data stores raises serious privacy issues. Many proposed works have attempted to address this concern using cryptographic protocols, but these cryptographic solutions degrade the performance of analysis tasks and therefore do not meet the requirements of big data. In this work, we propose a novel privacy-preserving k-means algorithm based on a simple yet secure and efficient multiparty additive scheme that is cryptography-free. We designed our solution for horizontally partitioned data. Moreover, we demonstrate that our scheme resists adversaries in the passive model.
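
Illustrative note: the abstract does not spell out the authors' protocol, so the following is only a minimal sketch of the general idea, assuming a generic additive-sharing secure sum rather than the scheme in the paper. Each site holds its own rows (horizontal partitioning), computes per-cluster sums and counts locally, and only masked shares of those aggregates are exchanged. The names `MODULUS`, `SCALE`, `secure_sum`, and `kmeans_step` are hypothetical, and the single-process simulation stands in for real network messages.

```python
import random

MODULUS = 2**61 - 1   # assumed public modulus, far larger than any true sum
SCALE = 10**6         # assumed fixed-point scale for real-valued coordinates


def split_into_shares(value, n_parties, rng):
    """Split an integer into n additive shares that sum to value mod MODULUS."""
    shares = [rng.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_values, rng):
    """Simulate a multiparty sum in which only sums of shares are published."""
    n = len(private_values)
    # shares[i][j] is the random-looking share that party i sends to party j.
    shares = [split_into_shares(v % MODULUS, n, rng) for v in private_values]
    # Party j adds the shares it received and publishes only that partial sum.
    partials = [sum(shares[i][j] for i in range(n)) % MODULUS for j in range(n)]
    total = sum(partials) % MODULUS
    # Map the upper half of the range back to negative integers.
    return total - MODULUS if total > MODULUS // 2 else total


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
    )


def kmeans_step(sites, centroids, rng):
    """One k-means update over horizontally partitioned sites."""
    k, dim = len(centroids), len(centroids[0])
    local = []
    for points in sites:  # each site works on its own rows only
        sums = [[0.0] * dim for _ in range(k)]
        counts = [0] * k
        for p in points:
            c = nearest(p, centroids)
            counts[c] += 1
            for d in range(dim):
                sums[c][d] += p[d]
        local.append((sums, counts))

    new_centroids = []
    for c in range(k):
        count = secure_sum([cnt[c] for _, cnt in local], rng)
        if count == 0:  # keep the old centroid for an empty cluster
            new_centroids.append(list(centroids[c]))
            continue
        coords = [
            secure_sum([round(s[c][d] * SCALE) for s, _ in local], rng) / SCALE / count
            for d in range(dim)
        ]
        new_centroids.append(coords)
    return new_centroids


# Python's `random` is used for reproducibility only; it is not a secure
# source of randomness, and a real deployment would need one.
rng = random.Random(0)
sites = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 10.0)], [(10.0, 12.0), (0.0, 1.0)]]
updated = kmeans_step(sites, [(0.0, 0.0), (10.0, 10.0)], rng)
```

In this sketch no site reveals its raw points or its unmasked local sums; only the global per-cluster totals become known, which is the kind of leakage usually accepted under a passive adversary model.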
Author_xml – sequence: 1
  givenname: Zakaria
  surname: Gheid
  fullname: Gheid, Zakaria
  email: z_gheid@esi.dz
  organization: Ecole Nat. Supérieure d'Inf., Lab. des Méthodes de Conception des Syst., Algiers, Algeria
– sequence: 2
  givenname: Yacine
  surname: Challal
  fullname: Challal, Yacine
  email: y_challal@esi.dz
  organization: Ecole Nat. Supérieure d'Inf., Lab. des Méthodes de Conception des Syst., Algiers, Algeria
CODEN IEEPAD
ContentType Conference Proceeding
Discipline Computer Science
EISBN 9781509032051
1509032053
EISSN 2324-9013
EndPage 798
ExternalDocumentID oai:HAL:hal-01466904v1
7847023
Genre orig-research
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Language English
License other-oa
OpenAccessLink https://proxy.k.utb.cz/login?url=https://hal.archives-ouvertes.fr/hal-01466904
PageCount 8
PublicationDate 2016-Aug.
PublicationDateYYYYMMDD 2016-08-01
PublicationDate_xml – month: 08
  year: 2016
  text: 2016-Aug.
PublicationDecade 2010
PublicationTitle 2016 IEEE Trustcom/BigDataSE/ISPA
PublicationTitleAbbrev TrustCom
PublicationYear 2016
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 791
SubjectTerms Big data
Clustering algorithms
Data privacy
Distributed databases
efficiency
horizontally partitioned data
k-means clustering
privacy
Protocols
Security
Title Efficient and Privacy-Preserving k-Means Clustering for Big Data Mining
URI https://ieeexplore.ieee.org/document/7847023
https://hal.archives-ouvertes.fr/hal-01466904