Data Privacy : foundations, new developments and the big data challenge

This book offers a broad, cohesive overview of the field of data privacy. It discusses, from a technological perspective, the problems and solutions of the three main communities working on data privacy: statistical disclosure control (those with a statistical background), privacy-preserving data mi...

Bibliographic Details
Main Author Torra, Vicenç
Format Electronic eBook
Language English
Published Cham : Springer International Publishing, 2017.
Series Studies in big data.
Online Access Full text
ISBN 9783319573588, 9783319573564
Physical Description 1 online resource (xiv, 269 pages) : illustrations

Table of Contents:
  • Preface; Organization; How to Use This book; Acknowledgements; Contents; 1 Introduction; 1.1 Motivations for Data Privacy; 1.2 Privacy and Society; 1.3 Terminology; 1.3.1 The Framework; 1.3.2 Anonymity and Unlinkability; 1.3.3 Disclosure; 1.3.4 Undetectability and Unobservability; 1.3.5 Pseudonyms and Identity; 1.3.6 Transparency; 1.4 Privacy and Disclosure; 1.5 Privacy by Design; 2 Machine and Statistical Learning; 2.1 Classification of Techniques; 2.2 Supervised Learning; 2.2.1 Classification; 2.2.2 Regression; 2.2.3 Validation of Results: k-Fold Cross-Validation; 2.3 Unsupervised Learning.
  • 2.3.1 Clustering; 2.3.2 Association Rules; 2.3.3 Expectation-Maximization Algorithm; 3 On the Classification of Protection Procedures; 3.1 Dimensions; 3.1.1 On Whose Privacy Is Being Sought; 3.1.2 On the Computations to be Done; 3.1.3 On the Number of Data Sources; 3.1.4 Knowledge Intensive Data Privacy; 3.1.5 Other Dimensions and Discussion; 3.1.6 Summary; 3.2 Respondent and Holder Privacy; 3.3 Data-Driven Methods; 3.4 Computation-Driven Methods; 3.4.1 Single Database: Differential Privacy; 3.4.2 Multiple Databases: Cryptographic Approaches; 3.4.3 Discussion.
  • 3.5 Result-Driven Approaches; 3.6 Tabular Data; 3.6.1 Cell Suppression; 3.6.2 Controlled Tabular Adjustment; 4 User's Privacy; 4.1 User Privacy in Communications; 4.1.1 Protecting the Identity of the User; 4.1.2 Protecting the Data of the User; 4.2 User Privacy in Information Retrieval; 4.2.1 Protecting the Identity of the User; 4.2.2 Protecting the Query of the User; 4.3 Private Information Retrieval; 4.3.1 Information-Theoretic PIR with k Databases; 4.3.2 Computational PIR; 4.3.3 Other Contexts; 5 Privacy Models and Disclosure Risk Measures; 5.1 Definition and Controversies.
  • 5.1.1 A Boolean or Measurable Condition; 5.2 Attribute Disclosure; 5.2.1 Attribute Disclosure for a Numerical Variable; 5.2.2 Attribute Disclosure for a Categorical Variable; 5.3 Identity Disclosure; 5.3.1 An Scenario for Identity Disclosure; 5.3.2 Measures for Identity Disclosure; 5.3.3 Uniqueness; 5.3.4 Reidentification; 5.3.5 The Worst-Case Scenario; 5.4 Matching and Integration: A Database Based Approach; 5.4.1 Heterogenous Distributed Databases; 5.4.2 Data Integration; 5.4.3 Schema Matching; 5.4.4 Data Matching; 5.4.5 Preprocessing; 5.4.6 Indexing and Blocking.
  • 5.4.7 Record Pair Comparison: Distances and Similarities; 5.4.8 Classification of Record Pairs; 5.5 Probabilistic Record Linkage; 5.5.1 Alternative Expressions for Decision Rules; 5.5.2 Computation of Rp(a, b); 5.5.3 Estimation of the Probabilities; 5.5.4 Extensions for Computing Probabilities; 5.5.5 Final Notes; 5.6 Distance-Based Record Linkage; 5.6.1 Weighted Distances; 5.6.2 Distance and Normalization; 5.6.3 Parameter Determination for Record Linkage; 5.7 Record Linkage Without Common Variables; 5.8 k-Anonymity and Other Boolean Conditions for Identity Disclosure.