Probing machine-learning classifiers using noise, bubbles, and reverse correlation


Bibliographic Details
Published in: Journal of Neuroscience Methods, Vol. 362, no. 109297, p. 109297
Main Authors: Thoret, Etienne; Andrillon, Thomas; Léger, Damien; Pressnitzer, Daniel
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.10.2021
ISSN: 0165-0270
eISSN: 1872-678X
DOI: 10.1016/j.jneumeth.2021.109297


More Information
Summary: Many scientific fields now use machine-learning tools to assist with complex classification tasks. In neuroscience, automatic classifiers may be useful to diagnose medical images, monitor electrophysiological signals, or decode perceptual and cognitive states from neural signals. However, such tools often remain black boxes: they lack interpretability. A lack of interpretability has obvious ethical implications for clinical applications, but it also limits the usefulness of these tools for formulating new theoretical hypotheses. We propose a simple and versatile method to help characterize the information used by a classifier to perform its task. Specifically, noisy versions of training samples or, when the training set is unavailable, custom-generated noisy samples, are fed to the classifier. Multiplicative noise, so-called "bubbles", or additive noise is applied to the input representation. Reverse correlation techniques are then adapted to extract either the discriminative information, defined as the parts of the input dataset that have the most weight in the classification decision, or the represented information, which corresponds to the input features most representative of each category. The method is illustrated for the classification of written numbers by a convolutional deep neural network; for the classification of speech versus music by a support vector machine; and for the classification of sleep stages from neurophysiological recordings by a random forest classifier. In all cases, the features extracted are readily interpretable. Quantitative comparisons show that the present method can match state-of-the-art interpretation methods for convolutional neural networks. Moreover, our method uses an intuitive and well-established framework in neuroscience, reverse correlation. It is also generic: it can be applied to any kind of classifier and any kind of input data.
We suggest that the method could provide an intuitive and versatile interface between neuroscientists and machine-learning tools.

Highlights:
• The heuristics of black-box classifiers can be probed with noisy inputs.
• The relevant features can be visualised in the input representation space.
• The method applies to any kind of data, such as 2D images or 1D time series.
• It applies to any classifier, such as deep neural networks, support vector machines, or random forests.
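As a rough illustration of the probing procedure described in the summary, here is a minimal sketch, not the authors' implementation: a hypothetical toy classifier stands in for the black box (the paper uses a CNN, an SVM, and a random forest), multiplicative "bubble" masks degrade a sample, and reverse correlation on the recorded decisions yields a map of the discriminative input regions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy black-box classifier (illustrative only): calls "class 1" when the
# centre patch of a 10x10 image is mostly bright.
def classify(x):
    return 1 if x[3:7, 3:7].mean() > 0.5 else 0

# One sample: a 10x10 image with a bright centre patch.
sample = np.zeros((10, 10))
sample[3:7, 3:7] = 1.0

# Probe the classifier with random multiplicative "bubble" masks and
# record its decision on each degraded input.
n_trials = 5000
masks = (rng.random((n_trials, 10, 10)) < 0.5).astype(float)
labels = np.array([classify(sample * m) for m in masks])

# Reverse correlation: the difference between the average mask that
# preserved the decision and the average mask that flipped it highlights
# the input regions carrying the discriminative information.
disc_map = masks[labels == 1].mean(axis=0) - masks[labels == 0].mean(axis=0)
```

In this sketch `disc_map` is large over the centre patch (the region the toy classifier actually relies on) and near zero elsewhere; the same logic extends to any classifier exposing only a decision function.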