Probing machine-learning classifiers using noise, bubbles, and reverse correlation


Bibliographic Details
Published in: Journal of Neuroscience Methods, Vol. 362, no. 109297, p. 109297
Main Authors: Thoret, Etienne; Andrillon, Thomas; Léger, Damien; Pressnitzer, Daniel
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.10.2021
ISSN: 0165-0270
eISSN: 1872-678X
DOI: 10.1016/j.jneumeth.2021.109297


More Information
Summary: Many scientific fields now use machine-learning tools to assist with complex classification tasks. In neuroscience, automatic classifiers may be useful to diagnose medical images, monitor electrophysiological signals, or decode perceptual and cognitive states from neural signals. However, such tools often remain black boxes: they lack interpretability. A lack of interpretability has obvious ethical implications for clinical applications, but it also limits the usefulness of these tools for formulating new theoretical hypotheses. We propose a simple and versatile method to help characterize the information used by a classifier to perform its task. Specifically, noisy versions of training samples or, when the training set is unavailable, custom-generated noisy samples, are fed to the classifier. Multiplicative noise, so-called "bubbles", or additive noise is applied to the input representation. Reverse correlation techniques are then adapted to extract either the discriminative information, defined as the parts of the input dataset that have the most weight in the classification decision, or the represented information, which corresponds to the input features most representative of each category. The method is illustrated for the classification of written numbers by a convolutional deep neural network; for the classification of speech versus music by a support vector machine; and for the classification of sleep stages from neurophysiological recordings by a random forest classifier. In all cases, the features extracted are readily interpretable. Quantitative comparisons show that the present method can match state-of-the-art interpretation methods for convolutional neural networks. Moreover, our method uses an intuitive and well-established framework in neuroscience, reverse correlation. It is also generic: it can be applied to any kind of classifier and any kind of input data.
We suggest that the method could provide an intuitive and versatile interface between neuroscientists and machine-learning tools.

Highlights:
• The heuristics of black-box classifiers can be probed with noisy inputs.
• The relevant features can be visualised in the input representation space.
• The method applies to any kind of data, such as 2D images or 1D time series.
• It applies to any classifier, such as deep neural networks, support vector machines, or random forests.
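As a rough illustration of the probing procedure described in the summary, here is a minimal sketch, not the authors' implementation: a hypothetical toy classifier stands in for the black box (the paper uses a CNN, an SVM, and a random forest), multiplicative "bubble" masks degrade a sample, and reverse correlation on the recorded decisions yields a map of the discriminative input regions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy black-box classifier (illustrative only): calls "class 1" when the
# centre patch of a 10x10 image is mostly bright.
def classify(x):
    return 1 if x[3:7, 3:7].mean() > 0.5 else 0

# One sample: a 10x10 image with a bright centre patch.
sample = np.zeros((10, 10))
sample[3:7, 3:7] = 1.0

# Probe the classifier with random multiplicative "bubble" masks and
# record its decision on each degraded input.
n_trials = 5000
masks = (rng.random((n_trials, 10, 10)) < 0.5).astype(float)
labels = np.array([classify(sample * m) for m in masks])

# Reverse correlation: the difference between the average mask that
# preserved the decision and the average mask that flipped it highlights
# the input regions carrying the discriminative information.
disc_map = masks[labels == 1].mean(axis=0) - masks[labels == 0].mean(axis=0)
```

In this sketch `disc_map` is large over the centre patch (the region the toy classifier actually relies on) and near zero elsewhere; the same logic extends to any classifier exposing only a decision function.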