Evolutionary Approaches for Adversarial Attacks on Neural Source Code Classifiers

As the prevalence and sophistication of cyber threats continue to increase, the development of robust vulnerability detection techniques becomes paramount in ensuring the security of computer systems. Neural models have demonstrated significant potential in identifying vulnerabilities; however, they...

Full description

Saved in:
Bibliographic Details
Published inAlgorithms Vol. 16; no. 10; p. 478
Main Authors Mercuri, Valeria, Saletta, Martina, Ferretti, Claudio
Format Journal Article
LanguageEnglish
Published Basel MDPI AG 01.10.2023
Subjects
Online AccessGet full text
ISSN1999-4893
1999-4893
DOI10.3390/a16100478

Cover

More Information
Summary:As the prevalence and sophistication of cyber threats continue to increase, the development of robust vulnerability detection techniques becomes paramount in ensuring the security of computer systems. Neural models have demonstrated significant potential in identifying vulnerabilities; however, they are not immune to adversarial attacks. This paper presents a set of evolutionary techniques for generating adversarial instances to enhance the resilience of neural models used for vulnerability detection. The proposed approaches leverage an evolution strategy (ES) algorithm that utilizes as the fitness function the output of the neural network to deceive. By starting from existing instances, the algorithm evolves individuals, represented by source code snippets, by applying semantic-preserving transformations, while utilizing the fitness to invert their original classification. This iterative process facilitates the generation of adversarial instances that can mislead the vulnerability detection models while maintaining the original behavior of the source code. The significance of this research lies in its contribution to the field of cybersecurity by addressing the need for enhanced resilience against adversarial attacks in vulnerability detection models. The evolutionary approach provides a systematic framework for generating adversarial instances, allowing for the identification and mitigation of weaknesses in AI classifiers.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:1999-4893
1999-4893
DOI:10.3390/a16100478