Comparison of machine learning models applied on anonymized data with different techniques

Bibliographic Details
Published in	2023 IEEE International Conference on Cyber Security and Resilience (CSR) pp. 618 - 623
Main Authors	Diaz, Judith Sainz-Pardo; Garcia, Alvaro Lopez
Format	Conference Proceeding
Language	English
Published	IEEE 31.07.2023
Subjects	Data models Data privacy Decision making Information filtering Machine learning Privacy
DOI	10.1109/CSR57506.2023.10224917

Abstract	Anonymization techniques based on obfuscating the quasi-identifiers by means of value generalization hierarchies are widely used to achieve preset levels of privacy. To prevent different types of attacks against database privacy it is necessary to apply several anonymization techniques beyond the classical k-anonymity or ℓ-diversity. However, the application of these methods is directly connected to a reduction of their utility in prediction and decision-making tasks. In this work we study four classical machine learning methods currently used for classification purposes in order to analyze the results as a function of the anonymization techniques applied and the parameters selected for each of them. The performance of these models is studied when varying the value of k for k-anonymity, and additional tools such as ℓ-diversity, t-closeness and δ-disclosure privacy are also deployed on the well-known Adult dataset.
Author Affiliations	Diaz, Judith Sainz-Pardo (sainzpardo@ifca.unican.es): Instituto de Física de Cantabria (IFCA), CSIC-UC, Santander, Spain, 39005; Garcia, Alvaro Lopez (aloga@ifca.unican.es): Instituto de Física de Cantabria (IFCA), CSIC-UC, Santander, Spain, 39005
EISBN	9798350311709
OpenAccessLink	https://doi.org/10.1109/CSR57506.2023.10224917
PageCount	6
URI	https://ieeexplore.ieee.org/document/10224917