Natural scenes reveal diverse representations of 2D and 3D body pose in the human brain
| Published in | Proceedings of the National Academy of Sciences - PNAS Vol. 121; no. 24; p. e2317707121 |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | United States: National Academy of Sciences, 11.06.2024 |
| ISSN | 0027-8424, 1091-6490 |
| DOI | 10.1073/pnas.2317707121 |
| Summary: | Significance: The visual processing of human bodies is important for social and cognitive functions. While previous studies have identified brain regions involved in detecting bodies and body parts, understanding how the brain processes three-dimensional (3D) spatial arrangements of body parts seen in everyday life has remained a challenge. To address this challenge, we used 3D reconstruction algorithms to extract 3D body pose from a large set of natural images and analyzed human brain responses to these images. We found a distributed cortical network encoding body pose, with two-dimensional pose better represented in the lateral-occipital-temporal cortex (LOTC) and 3D pose in both LOTC and posterior superior temporal sulcus (pSTS). We highlight the importance of considering different pose representations for different visual tasks.
Human pose, defined as the spatial relationships between body parts, carries instrumental information supporting the understanding of motion and action of a person. A substantial body of previous work has identified cortical areas responsive to images of bodies and different body parts. However, the neural basis underlying the visual perception of body part relationships has received less attention. To broaden our understanding of body perception, we analyzed high-resolution fMRI responses to a wide range of poses from over 4,000 complex natural scenes. Using ground-truth annotations and an application of three-dimensional (3D) pose reconstruction algorithms, we compared similarity patterns of cortical activity with similarity patterns built from human pose models with different levels of depth availability and viewpoint dependency. Targeting the challenge of explaining variance in complex natural image responses with interpretable models, we achieved statistically significant correlations between pose models and cortical activity patterns (though performance levels are substantially lower than the noise ceiling). We found that the 3D view-independent pose model, compared with two-dimensional models, better captures the activation from distinct cortical areas, including the right posterior superior temporal sulcus (pSTS). These areas, together with other pose-selective regions in the LOTC, form a broader, distributed cortical network with greater view-tolerance in more anterior patches. We interpret these findings in light of the computational complexity of natural body images, the wide range of visual tasks supported by pose structures, and possible shared principles for view-invariant processing between articulated objects and ordinary, rigid objects. |
|---|---|
| Bibliography: | Edited by Doris Tsao, University of California, Berkeley, CA; received October 19, 2023; accepted April 25, 2024. H.Z. and Y.G. contributed equally to this work. |
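The summary describes comparing similarity patterns of cortical activity with similarity patterns built from pose models. A minimal sketch of that kind of representational similarity comparison is given below, under loudly stated assumptions: the data are synthetic, the function names (`rdm`, `rsa_score`), the 17-joint skeleton, the correlation-distance metric, and the rank-correlation score are all illustrative choices and are not taken from the article. The 2D model here is simply the 3D pose with the depth coordinate dropped.

```python
import numpy as np

def rdm(features):
    """Dissimilarity matrix: 1 - Pearson correlation between rows (one row per image)."""
    return 1.0 - np.corrcoef(features)

def upper_triangle(matrix):
    """Return the off-diagonal upper-triangle entries as a flat vector."""
    i, j = np.triu_indices(matrix.shape[0], k=1)
    return matrix[i, j]

def rank(values):
    """Ordinal ranks (ties are not averaged; acceptable for continuous data)."""
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks

def rsa_score(model_features, brain_patterns):
    """Rank correlation between the model dissimilarities and the brain dissimilarities."""
    a = rank(upper_triangle(rdm(model_features)))
    b = rank(upper_triangle(rdm(brain_patterns)))
    return float(np.corrcoef(a, b)[0, 1])

# Synthetic stand-ins (assumptions, not the article's data).
rng = np.random.default_rng(0)
n_images, n_joints, n_voxels = 40, 17, 200
pose_3d = rng.normal(size=(n_images, n_joints, 3))   # hypothetical 3D joint coordinates
pose_2d = pose_3d[:, :, :2]                          # same poses without depth

# Toy "voxel" responses: a random linear readout of 3D pose plus noise.
weights = rng.normal(size=(n_joints * 3, n_voxels))
brain = pose_3d.reshape(n_images, -1) @ weights + rng.normal(size=(n_images, n_voxels))

score_3d = rsa_score(pose_3d.reshape(n_images, -1), brain)
score_2d = rsa_score(pose_2d.reshape(n_images, -1), brain)
print(f"3D model: {score_3d:.3f}  2D model: {score_2d:.3f}")
```

In a real analysis the scores would be computed per cortical region and compared against a noise ceiling, which this sketch does not attempt.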