Coding schemes in neural networks learning classification tasks
Neural networks posses the crucial ability to generate meaningful representations of task-dependent features. Indeed, with appropriate scaling, supervised learning in neural networks can result in strong, task-dependent feature learning. However, the nature of the emergent representations is still u...
Saved in:
Published in | Nature communications Vol. 16; no. 1; pp. 3354 - 12 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published |
London
Nature Publishing Group UK
09.04.2025
Nature Publishing Group Nature Portfolio |
Subjects | |
Online Access | Get full text |
ISSN | 2041-1723 2041-1723 |
DOI | 10.1038/s41467-025-58276-6 |
Cover
Summary: | Neural networks posses the crucial ability to generate meaningful representations of task-dependent features. Indeed, with appropriate scaling, supervised learning in neural networks can result in strong, task-dependent feature learning. However, the nature of the emergent representations is still unclear. To understand the effect of learning on representations, we investigate fully-connected, wide neural networks learning classification tasks using the Bayesian framework where learning shapes the posterior distribution of the network weights. Consistent with previous findings, our analysis of the feature learning regime (also known as ‘non-lazy’ regime) shows that the networks acquire strong, data-dependent features, denoted as coding schemes, where neuronal responses to each input are dominated by its class membership. Surprisingly, the nature of the coding schemes depends crucially on the neuronal nonlinearity. In linear networks, an analog coding scheme of the task emerges; in nonlinear networks, strong spontaneous symmetry breaking leads to either redundant or sparse coding schemes. Our findings highlight how network properties such as scaling of weights and neuronal nonlinearity can profoundly influence the emergent representations.
Neural networks discover meaningful representations of the data through the process of learning. Here, the authors explore how these representations are affected by scaling the network output or modifying the activation functions. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 content type line 23 |
ISSN: | 2041-1723 2041-1723 |
DOI: | 10.1038/s41467-025-58276-6 |