Sensorimotor features of self-awareness in multimodal large language models

Bibliographic Details
Main Authors: Varela, Iñaki Dellibarda; Romero-Sorozabal, Pablo; Torricelli, Diego; Delgado-Oleas, Gabriel; Serrano, Jose Ignacio; Sobrino, Maria Dolores del Castillo; Rocon, Eduardo; Cebrian, Manuel
Format: Journal Article
Language: English
Published: 25.05.2025
DOI: 10.48550/arxiv.2505.19237

Summary: Self-awareness - the ability to distinguish oneself from the surrounding environment - underpins intelligent, autonomous behavior. Recent advances in AI, particularly in large language models, achieve human-like performance on tasks that integrate multimodal information, raising interest in the embodiment capabilities of AI agents on nonhuman platforms such as robots. Here, we explore whether multimodal LLMs can develop self-awareness solely through sensorimotor experience. By integrating a multimodal LLM into an autonomous mobile robot, we test its ability to achieve this capacity. We find that the system exhibits robust environmental awareness, self-recognition, and predictive awareness, allowing it to infer its robotic nature and motion characteristics. Structural equation modeling reveals how sensory integration influences distinct dimensions of self-awareness and its coordination with past-present memory, as well as the hierarchical internal associations that drive self-identification. Ablation tests of sensory inputs identify critical modalities for each dimension, demonstrate compensatory interactions among sensors, and confirm the essential role of structured and episodic memory in coherent reasoning. These findings demonstrate that, given appropriate sensory information about the world and itself, multimodal LLMs exhibit emergent self-awareness, opening the door to artificial embodied cognitive systems.