Interpretable Machine Learning for Mitigating Feature-Driven Attacks

Bibliographic Details
Published in	IEEE Transactions on Technology and Society, Vol. 6, no. 2, pp. 220-230
Main Authors	Hartman, Corey M.; Rimal, Bhaskar P.
Format	Journal Article
Language	English
Published	New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.06.2025
Subjects	Algorithms; Analytical models; Bias; Codes; Context modeling; Data models; Datasets; Documents; Explainable AI; Feature extraction; Heuristic algorithms; Interpretable machine learning; Machine learning; Macros; Malicious documents; Malware; Mathematical models; Predictions; Predictive models; Robot learning; SHAP; VBA; XAI; XGBoost
ISSN	2637-6415
DOI	10.1109/TTS.2025.3531780

Summary:	Recent studies have found that 43% of malware infections begin as malicious Microsoft Office documents in the form of Word or Excel files. Many proposed techniques detect malicious documents effectively using machine learning (ML) algorithms, but two problems remain: bias in the datasets, and a lack of insight into why a document was flagged as malicious. In particular, an ML model may rely solely on one key feature for its prediction. This work uses the SHAP (SHapley Additive exPlanations) algorithm to split features into groups by their SHAP magnitude: the features that dominate a model's decision-making are placed in their own feature set and used to train a separate ML model. Combining the resulting models in a voting classifier reduces this bias and the reliance on a single feature or a select few features. The result is a more robust ML model for predicting malicious Office documents, one that gives more insight into why a prediction was made and can tell the user when not enough data is present to predict with confidence. Using this technique, an ensemble soft voting classifier was created that obtained 90.1% accuracy on a balanced dataset of 250 malicious and 250 benign randomly selected Office documents. The classifier presents the user with a simple natural language statement that indicates the classification of a document and why it was given that label.
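
The feature-splitting and soft-voting steps described in the Summary can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the mean absolute SHAP value of each feature has already been computed, and it takes the per-model probabilities as given instead of training a model per feature group. The feature names, the dominance_ratio rule, the min_confidence threshold, and the wording of the explanation are all hypothetical.

```python
from statistics import mean


def split_by_shap_magnitude(mean_abs_shap, dominance_ratio=0.5):
    """Split feature names into (dominant, remaining) groups.

    mean_abs_shap maps a feature name to its mean |SHAP value|.
    Assumption: a feature is "dominant" when its share of the total
    SHAP magnitude is at least dominance_ratio (a hypothetical rule).
    """
    total = sum(mean_abs_shap.values())
    dominant = [
        name for name, value in mean_abs_shap.items()
        if total > 0 and value / total >= dominance_ratio
    ]
    remaining = [name for name in mean_abs_shap if name not in dominant]
    return dominant, remaining


def soft_vote(probabilities, min_confidence=0.6):
    """Average per-model P(malicious) values into one label.

    Returns "uncertain" when neither class reaches min_confidence
    (a hypothetical threshold), so the caller can report that there
    is not enough evidence for a confident prediction.
    """
    p_malicious = mean(probabilities)
    if max(p_malicious, 1 - p_malicious) < min_confidence:
        return "uncertain", p_malicious
    label = "malicious" if p_malicious >= 0.5 else "benign"
    return label, p_malicious


def explain(label, p_malicious, top_features):
    """Build a short natural language statement for the user."""
    if label == "uncertain":
        return (f"Not enough evidence to classify this document with "
                f"confidence (P(malicious) = {p_malicious:.2f}).")
    features = ", ".join(top_features)
    return (f"Document classified as {label} "
            f"(P(malicious) = {p_malicious:.2f}); "
            f"most influential features: {features}.")


# Hypothetical mean |SHAP| values for four macro-related features.
shap_magnitudes = {
    "autoopen_macro": 0.42,
    "shell_call": 0.08,
    "obfuscated_strings": 0.06,
    "document_size": 0.04,
}
dominant, remaining = split_by_shap_magnitude(shap_magnitudes)

# Stand-ins for the outputs of two separately trained models, one per
# feature group; in practice each would come from predict_proba.
label, p = soft_vote([0.9, 0.7])
print(explain(label, p, dominant + remaining[:1]))
```

Keeping the dominant features in their own group means a second model has to reach its decision from the remaining features alone, which is the mechanism the Summary credits with reducing reliance on a single feature.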