Identification of movie encoding neurons enables movie recognition AI


Bibliographic Details
Published in: Proceedings of the National Academy of Sciences - PNAS, Vol. 121, no. 48, p. e2412260121
Main Authors: Hiramoto, Masaki; Cline, Hollis T.
Format: Journal Article
Language: English
Published: United States, National Academy of Sciences, 26.11.2024
ISSN: 0027-8424, 1091-6490
DOI: 10.1073/pnas.2412260121

Summary:
Significance: Understanding how the brain recognizes visual stimuli has contributed to computer science–based static image recognition technologies, suggesting that information about brain-based strategies for encoding spatiotemporal transformations or “movies” may facilitate AI capabilities for recognizing natural movie scenes. We found fundamental principles of how movies are encoded in the brain, including how image sequences and time are encoded, how visual experience-induced plasticity modifies movie detection features, and the underlying circuit motifs. We developed efficient movie recognition algorithms, MovieNet, based on the brain-based movie encoders that reduce data size and processing time. This study demonstrates that brain-based movie processing principles enable efficient machine learning.

Natural visual scenes are dominated by spatiotemporal image dynamics, but how the visual system integrates “movie” information over time is unclear. We characterized optic tectal neuronal receptive fields using sparse noise stimuli and reverse correlation analysis. Neurons recognized movies of ~200 to 600 ms duration with defined start and stop stimuli. Movie durations from start to stop responses were tuned by sensory experience through a hierarchical algorithm. Neurons encoded families of image sequences following trigonometric functions. Spike sequence and information flow suggest that repetitive circuit motifs underlie movie detection. Principles of frog topographic retinotectal plasticity and cortical simple cells are employed in machine learning networks for static image recognition, suggesting that discoveries of principles of movie encoding in the brain, such as how image sequences and duration are encoded, may benefit movie recognition technology. We built and trained a machine learning network that mimicked neural principles of visual system movie encoders. The network, named MovieNet, outperformed current machine learning image recognition networks in classifying natural movie scenes, while reducing data size and steps to complete the classification task. This study reveals how movie sequences and time are encoded in the brain and demonstrates that brain-based movie processing principles enable efficient machine learning.
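
Note on method: the summary mentions sparse noise stimuli and reverse correlation analysis. As a rough illustration of that general technique only, the sketch below computes a textbook spike-triggered average for a simulated toy neuron. It is not the authors' analysis pipeline and is unrelated to MovieNet's implementation. Every name and parameter in it (`sparse_noise`, `simulate_spikes`, `spike_triggered_average`, the 16-pixel stimulus, the 3-frame lag, the spontaneous rate) is a hypothetical chosen for the example.

```python
import random


def sparse_noise(n_frames, n_pixels, rng):
    """Sparse noise: each frame lights one random pixel as +1 (bright) or -1 (dark)."""
    frames = []
    for _ in range(n_frames):
        frame = [0] * n_pixels
        frame[rng.randrange(n_pixels)] = rng.choice((1, -1))
        frames.append(frame)
    return frames


def simulate_spikes(frames, pixel, lag, rng, p_spont=0.02):
    """Toy neuron (an assumption, not a model from the paper): it fires `lag`
    frames after `pixel` turns bright, plus rare spontaneous spikes."""
    spikes = []
    for t in range(len(frames)):
        driven = t >= lag and frames[t - lag][pixel] == 1
        spikes.append(1 if driven or rng.random() < p_spont else 0)
    return spikes


def spike_triggered_average(frames, spikes, n_lags):
    """Reverse correlation: average the stimulus frames that preceded each spike.

    Returns a list indexed as sta[lag][pixel]."""
    n_pixels = len(frames[0])
    sta = [[0.0] * n_pixels for _ in range(n_lags)]
    n_spikes = 0
    for t, spiked in enumerate(spikes):
        # Skip spikes too early to have a full stimulus history.
        if not spiked or t < n_lags - 1:
            continue
        n_spikes += 1
        for k in range(n_lags):
            frame = frames[t - k]
            for i in range(n_pixels):
                sta[k][i] += frame[i]
    if n_spikes == 0:
        return sta
    return [[value / n_spikes for value in row] for row in sta]


rng = random.Random(0)  # fixed seed so the run is repeatable
frames = sparse_noise(5000, 16, rng)
spikes = simulate_spikes(frames, pixel=5, lag=3, rng=rng)
sta = spike_triggered_average(frames, spikes, n_lags=6)

# The largest entry of the average marks the lag and pixel driving the neuron.
best_lag, best_pixel = max(
    ((k, i) for k in range(6) for i in range(16)),
    key=lambda ki: sta[ki[0]][ki[1]],
)
print("estimated lag:", best_lag, "estimated pixel:", best_pixel)
```

In this toy setting the average recovers a spatiotemporal receptive field in its simplest form: one pixel at one delay. Characterizing sequences of images over time, as the summary describes, would require looking at structure across several lags rather than a single peak.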
Contributed by Hollis T. Cline; received June 19, 2024; accepted September 12, 2024; reviewed by Colin J. Akerman and Cristopher M. Niell