Bigger is not always better: The importance of human-scale language modeling for psycholinguistics
| Published in | Journal of Memory and Language, Vol. 144, p. 104650 |
|---|---|
| Main Authors | |
| Format | Journal Article |
| Language | English |
| Published | Elsevier Inc., 01.10.2025 |
| Subjects | |
| ISSN | 0749-596X, 1096-0821 |
| DOI | 10.1016/j.jml.2025.104650 |
| Summary: | When trained to place high probability on a training corpus, neural network language models can learn a surprising amount about language. Recent work has demonstrated that large performance improvements can arise from simply increasing, i.e., scaling, the size of the corpora they are trained on and the number of parameters in those models. Accordingly, many contemporary systems are trained on trillions of words. While largely beneficial to performance on language applications, scaling has several downsides for both computational psycholinguistics and natural language processing research. We discuss the scientific challenges presented by the scaling paradigm, as well as the benefits that would result from language models that can learn from human-scale data. In the second half of this paper, we report on findings from a recent effort to bring about human-scale language model pretraining: the first iteration of the BabyLM Challenge, a shared task organized by the authors that invited participants to train a language model on 100 million words or less. The challenge produced several concrete best practices for practitioners interested in small-scale language modeling. For cognitive scientists, the challenge demonstrated that robust linguistic generalizations can be learned by models trained on a human-scale dataset, though this is not yet achieved through cognitively plausible mechanisms. Furthermore, it established a population of “BabyLMs” that are all effective at data-efficient language learning. Studying such models can help us identify hypotheses for the computational mechanisms that underlie human language acquisition. |
|---|---|
| Highlights: | • Psycholinguistics benefits from computational models trained at human data scale. • We report on the BabyLM Challenge, an effort to train models at human scale. • BabyLM models achieve close to human-level performance on some tasks. • High language modeling performance is attainable with academic computational resources. • We identify actionable insights for human-scale language modeling. |