short-paper

Securing Bloom Filters for Privacy-preserving Record Linkage

Published: 19 October 2020

Abstract

Privacy-preserving record linkage (PPRL) facilitates the matching of records that correspond to the same real-world entities across different databases while preserving the privacy of the individuals in these databases. A Bloom filter (BF) is a space-efficient probabilistic data structure that is becoming popular in PPRL as an efficient privacy technique for encoding sensitive information in records while still enabling approximate similarity computations between attribute values. However, BF encoding is susceptible to privacy attacks that can re-identify the encoded values. In this paper, we propose two novel techniques that can be applied to BF encoding to improve privacy against attacks. Our techniques use neighbouring bits in a BF to generate new bit values. An empirical study on large real databases shows that our techniques provide high security against privacy attacks, and achieve better similarity computation accuracy and linkage quality than other privacy improvements that can be applied to BF encoding.
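To make the ideas in the abstract concrete, the following is a minimal Python sketch of BF encoding for approximate string matching, together with one illustrative way of deriving new bit values from neighbouring bits. It is not the paper's method: the abstract does not specify the two proposed techniques, so `xor_neighbours` is a hypothetical example only. The function names, the q-gram length, the filter length, the number of hash functions, and the choice of SHA-256 and BLAKE2b for double hashing are all assumptions made for illustration.

```python
import hashlib


def qgrams(value, q=2):
    """Split a string into its set of overlapping substrings of length q."""
    value = value.lower()
    return {value[i:i + q] for i in range(len(value) - q + 1)}


def bloom_encode(value, num_bits=64, num_hashes=4, q=2):
    """Encode the q-grams of a value into a list of 0/1 bits.

    Assumed parameters: 64 bits, 4 hash positions per q-gram, bigrams.
    """
    bits = [0] * num_bits
    for gram in qgrams(value, q):
        data = gram.encode("utf-8")
        # Double hashing: two digests are combined to derive each position.
        h1 = int(hashlib.sha256(data).hexdigest(), 16)
        h2 = int(hashlib.blake2b(data).hexdigest(), 16)
        for i in range(num_hashes):
            bits[(h1 + i * h2) % num_bits] = 1
    return bits


def dice_similarity(a, b):
    """Dice coefficient of two equal-length bit lists (0.0 if both are empty)."""
    common = sum(x & y for x, y in zip(a, b))
    total = sum(a) + sum(b)
    return 2.0 * common / total if total else 0.0


def xor_neighbours(bits):
    """Hypothetical hardening step, not taken from the paper.

    Each output bit is the XOR of a bit and its right-hand neighbour,
    wrapping around at the end of the filter.
    """
    n = len(bits)
    return [bits[i] ^ bits[(i + 1) % n] for i in range(n)]


if __name__ == "__main__":
    bf_a = bloom_encode("smith")
    bf_b = bloom_encode("smyth")
    print("plain BF similarity:   ", dice_similarity(bf_a, bf_b))
    print("hardened BF similarity:",
          dice_similarity(xor_neighbours(bf_a), xor_neighbours(bf_b)))
```

Running the script prints the Dice similarity of two similar names before and after the illustrative transformation, which gives a rough feel for the trade-off the abstract describes: modifying the bit patterns to hinder re-identification while keeping similarity computation usable.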

Supplementary Material

MP4 File (3340531.3412105.mp4)
In this presentation video, we describe our two novel hardening techniques that can be applied to Bloom filter encoding in privacy-preserving record linkage (PPRL) to improve security against privacy attacks. We first provide an overview of PPRL and Bloom filter encoding. We then describe the vulnerabilities in Bloom filter encoding and why hardening techniques are required to address them. Next, we describe our proposed hardening techniques, which use neighbouring bits in a Bloom filter to generate new bit values. We then describe our empirical study on large real databases and show experimental results in terms of scalability, linkage quality, and privacy. We show that our hardening techniques provide high security against privacy attacks and achieve better similarity computation accuracy and linkage quality than other hardening techniques that can be applied to Bloom filter encoding. Finally, we conclude with directions for future work.


Published In

CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge Management
October 2020
3619 pages
ISBN:9781450368599
DOI:10.1145/3340531
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. hardening
  2. perturbation
  3. random sampling
  4. sliding window
  5. xor

Qualifiers

  • Short-paper

Funding Sources

  • The Australian Research Council

Conference

CIKM '20

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
