Identification of movie encoding neurons enables movie recognition AI
| Published in | Proceedings of the National Academy of Sciences (PNAS), Vol. 121, no. 48, p. e2412260121 |
|---|---|
| Main Authors | , | 
| Format | Journal Article | 
| Language | English | 
| Published | United States: National Academy of Sciences, 26.11.2024 |
| Subjects | |
| ISSN | 0027-8424, 1091-6490 |
| DOI | 10.1073/pnas.2412260121 | 
| Summary: | Significance: Understanding how the brain recognizes visual stimuli has contributed to computer science–based static image recognition technologies, suggesting that information about brain-based strategies for encoding spatiotemporal transformations or “movies” may facilitate AI capabilities for recognizing natural movie scenes. We found fundamental principles of how movies are encoded in the brain, including how image sequences and time are encoded, how visual experience-induced plasticity modifies movie detection features, and the underlying circuit motifs. We developed an efficient movie recognition algorithm, MovieNet, based on the brain-based movie encoders, that reduces data size and processing time. This study demonstrates that brain-based movie processing principles enable efficient machine learning. Abstract: Natural visual scenes are dominated by spatiotemporal image dynamics, but how the visual system integrates “movie” information over time is unclear. We characterized optic tectal neuronal receptive fields using sparse noise stimuli and reverse correlation analysis. Neurons recognized movies of ~200 to 600 ms duration with defined start and stop stimuli. Movie durations from start to stop responses were tuned by sensory experience through a hierarchical algorithm. Neurons encoded families of image sequences following trigonometric functions. Spike sequence and information flow suggest that repetitive circuit motifs underlie movie detection. Principles of frog topographic retinotectal plasticity and cortical simple cells are employed in machine learning networks for static image recognition, suggesting that discoveries of principles of movie encoding in the brain, such as how image sequences and duration are encoded, may benefit movie recognition technology. We built and trained a machine learning network that mimicked neural principles of visual system movie encoders. The network, named MovieNet, outperformed current machine learning image recognition networks in classifying natural movie scenes, while reducing data size and steps to complete the classification task. This study reveals how movie sequences and time are encoded in the brain and demonstrates that brain-based movie processing principles enable efficient machine learning. |
|---|---|
| Bibliography: | Contributed by Hollis T. Cline; received June 19, 2024; accepted September 12, 2024; reviewed by Colin J. Akerman and Cristopher M. Niell |
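
The summary mentions that receptive fields were characterized with sparse noise stimuli and reverse correlation analysis. As a rough illustration of that general technique only, and not the authors' actual pipeline, the sketch below computes a spike-triggered average over several time lags for a simulated neuron. The stimulus generator, the toy neuron, and all function and parameter names (`make_sparse_noise`, `toy_neuron`, `spike_triggered_average`, `n_lags`) are hypothetical and chosen here for illustration.

```python
import random


def make_sparse_noise(n_frames, n_pixels, rng):
    """Assumed stimulus: each frame has one random pixel set to +1 or -1, the rest 0."""
    frames = []
    for _ in range(n_frames):
        frame = [0] * n_pixels
        frame[rng.randrange(n_pixels)] = rng.choice((-1, 1))
        frames.append(frame)
    return frames


def toy_neuron(stimulus, pixel, delay):
    """Hypothetical model cell: one spike `delay` frames after `pixel` turns bright (+1)."""
    spikes = [0] * len(stimulus)
    for t in range(delay, len(stimulus)):
        if stimulus[t - delay][pixel] == 1:
            spikes[t] = 1
    return spikes


def spike_triggered_average(stimulus, spikes, n_lags):
    """Reverse correlation: average the frames that preceded each spike.

    Returns sta[lag][pixel], the spike-weighted mean stimulus value
    `lag` frames before a spike. Spikes in the first n_lags - 1 frames
    are skipped because their full stimulus history is not available.
    """
    n_pixels = len(stimulus[0])
    sta = [[0.0] * n_pixels for _ in range(n_lags)]
    total = 0
    for t in range(n_lags - 1, len(stimulus)):
        count = spikes[t]
        if count == 0:
            continue
        total += count
        for lag in range(n_lags):
            frame = stimulus[t - lag]
            for p in range(n_pixels):
                sta[lag][p] += count * frame[p]
    if total == 0:
        return sta  # no usable spikes: return all zeros
    return [[value / total for value in row] for row in sta]


rng = random.Random(0)  # fixed seed so the run is repeatable
stim = make_sparse_noise(5000, 16, rng)
spikes = toy_neuron(stim, pixel=5, delay=2)
sta = spike_triggered_average(stim, spikes, n_lags=4)

# Locate the (lag, pixel) entry with the largest average stimulus value.
peak_lag, peak_pixel = max(
    ((lag, p) for lag in range(4) for p in range(16)),
    key=lambda lp: sta[lp[0]][lp[1]],
)
print("peak lag:", peak_lag, "peak pixel:", peak_pixel)
```

In this toy setting the lag axis plays the role of time before the spike, so a real analysis along these lines would recover a sequence of images (a short "movie") rather than a single static receptive field. The frame duration, number of lags, and response model used in the study are not stated in this record and are not assumed here.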