A Cryptographic Perspective on Mitigation vs. Detection in Machine Learning

Bibliographic Details
Main Authors: Gluch, Greg; Goldwasser, Shafi
Format: Journal Article
Language: English
Published: 10.07.2025
Subjects: Computer Science - Artificial Intelligence; Computer Science - Cryptography and Security; Computer Science - Learning
Online Access: https://arxiv.org/abs/2504.20310
DOI: 10.48550/arxiv.2504.20310
License: CC BY 4.0 (http://creativecommons.org/licenses/by/4.0)

Abstract: In this paper, we initiate a cryptographically inspired theoretical study of detection versus mitigation of adversarial inputs produced by attackers on Machine Learning algorithms during inference time. We formally define defense by detection (DbD) and defense by mitigation (DbM). Our definitions come in the form of a 3-round protocol between two resource-bounded parties: a trainer/defender and an attacker. The attacker aims to produce inference-time inputs that fool the training algorithm. We define correctness, completeness, and soundness properties to capture successful defense at inference time while not degrading (too much) the performance of the algorithm on inputs from the training distribution. We first show that achieving DbD and achieving DbM are equivalent for ML classification tasks. Surprisingly, this is not the case for ML generative learning tasks, where there are many possible correct outputs for each input. We show a separation between DbD and DbM by exhibiting two generative learning tasks for which it is possible to defend by mitigation but it is provably impossible to defend by detection. The mitigation phase uses significantly less computational resources than the initial training algorithm. In the first learning task we consider sample complexity as the resource and in the second the time complexity. The first result holds under the assumption that the Identity-Based Fully Homomorphic Encryption (IB-FHE), publicly-verifiable zero-knowledge Succinct Non-Interactive Arguments of Knowledge (zk-SNARK), and Strongly Unforgeable Signatures exist. The second result assumes the existence of Non-Parallelizing Languages with Average-Case Hardness (NPL) and Incrementally-Verifiable Computation (IVC) and IB-FHE.
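
The definitional object in the abstract is the 3-round protocol between a trainer/defender and an attacker. The Python sketch below illustrates one way to read that game; the round contents, the win conditions, and every name in it (Defender, Attacker, defense_game, and so on) are assumptions made for illustration, not the paper's formal definitions, which also bound the resources of both parties.

    from dataclasses import dataclass
    from typing import Any, Callable, List, Tuple

    # A sketch of the 3-round defense game described in the abstract. All
    # names, round contents, and win conditions here are illustrative
    # assumptions; the paper defines correctness, completeness, and
    # soundness formally.

    Model = Callable[[Any], Any]

    @dataclass
    class Defender:
        train: Callable[[List[Tuple[Any, Any]]], Model]  # learns a model from samples
        detect: Callable[[Model, Any], bool]             # DbD: flag suspected adversarial inputs
        mitigate: Callable[[Model, Any], Any]            # DbM: produce a correct output anyway

    @dataclass
    class Attacker:
        attack: Callable[[Model], Any]                   # crafts an inference-time input

    def defense_game(defender: Defender, attacker: Attacker,
                     samples: List[Tuple[Any, Any]], truth: Callable[[Any], Any],
                     by_detection: bool) -> bool:
        """True iff the defender survives this run (hypothetical win condition)."""
        model = defender.train(samples)   # Round 1: defender trains; model is published
        x = attacker.attack(model)        # Round 2: attacker submits an inference-time input
        if by_detection:                  # Round 3a: DbD wins if x is flagged or answered correctly
            return defender.detect(model, x) or model(x) == truth(x)
        return defender.mitigate(model, x) == truth(x)  # Round 3b: DbM must answer correctly

    # Toy instantiation: a sign classifier on numbers; the attacker probes the boundary.
    d = Defender(train=lambda s: (lambda x: x >= 0),
                 detect=lambda m, x: abs(x) < 1,   # flag boundary-adjacent inputs
                 mitigate=lambda m, x: x >= 0)
    a = Attacker(attack=lambda m: 0.5)
    print(defense_game(d, a, [], lambda x: x >= 0, by_detection=True))   # True
    print(defense_game(d, a, [], lambda x: x >= 0, by_detection=False))  # True

In this reading, DbD succeeds by flagging the adversarial input, while DbM must return a correct output without flagging anything; the separation stated in the abstract says that for some generative tasks the latter is achievable while the former provably is not.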