Bytewise Approximate Matching: The Good, The Bad, and The Unknown

Bibliographic Details
Published in: The journal of digital forensics, security and law, Vol. 11, no. 2, p. 59
Main Authors: Harichandran, Vikram; Breitinger, Frank; Baggili, Ibrahim
Format: Journal Article
Language: English
Published: Farmville: Association of Digital Forensics, Security and Law, 01.01.2016
ISSN: 1558-7215
EISSN: 1558-7223
DOI: 10.15394/jdfsl.2016.1379

Abstract: Hash functions are well established in digital forensics, where they are commonly used for proving integrity and for file identification (i.e., hashing all files on a seized device and comparing the fingerprints against a reference database). With respect to the latter operation, however, an active adversary can easily defeat this approach because traditional hashes are designed to be sensitive to any change in the input: the output changes significantly if a single bit is flipped. Researchers therefore developed approximate matching, a rather new and less prominent area that was conceived as a more robust counterpart to traditional hashing. Since its conception, the community has constructed numerous algorithms, extensions, and additional applications for this technology, and is still working on novel concepts to improve the status quo. In this survey article, we conduct a high-level review of the existing literature from a non-technical perspective and summarize the existing body of knowledge in approximate matching, with special focus on bytewise algorithms. Our contribution gives researchers and practitioners an overview of the state of the art of approximate matching so that they may understand the capabilities and challenges of the field. In short, we present the terminology, use cases, classification, requirements, testing methods, algorithms, applications, and a list of primary and secondary literature.
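The bit-flip sensitivity described in the abstract can be illustrated with a minimal sketch. It assumes SHA-256 as a stand-in for a "traditional" hash and uses a toy byte n-gram Jaccard score as a stand-in for bytewise similarity; the toy score is not one of the algorithms the survey reviews, and the helper names (`flip_bit`, `differing_bits`, `ngram_set`, `jaccard`) and the sample input are hypothetical, chosen only for illustration.

```python
import hashlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    out = bytearray(data)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def ngram_set(data: bytes, n: int = 8) -> set:
    """Collect every distinct run of n consecutive bytes (toy feature set)."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two feature sets: |A & B| / |A | B|."""
    return len(a & b) / len(a | b)


# Hypothetical sample input: a file and a copy with a single bit changed.
original = b"evidence file contents " * 64
modified = flip_bit(original, 0)

h1 = hashlib.sha256(original).digest()
h2 = hashlib.sha256(modified).digest()

input_diff = differing_bits(original, modified)   # 1 by construction
digest_diff = differing_bits(h1, h2)              # out of 256 digest bits
similarity = jaccard(ngram_set(original), ngram_set(modified))

print(f"input bits changed:  {input_diff}")
print(f"digest bits changed: {digest_diff} of {len(h1) * 8}")
print(f"digests equal:       {h1 == h2}")
print(f"toy n-gram similarity: {similarity:.3f}")
```

The two digests no longer match after a one-bit change, so a database lookup on the exact hash would miss the modified file, while the toy byte-level score still rates the two inputs as nearly identical. That gap is the motivation the abstract gives for approximate matching.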
Copyright: Association of Digital Forensics, Security and Law, 2016
License: CC BY-NC
Subjects: Algorithms; Automation; Computer forensics; Criminal investigations; Forensic sciences; Plagiarism; Researchers; Semantics