MixANT: Observation-dependent Memory Propagation for Stochastic Dense Action Anticipation

Bibliographic Details
Main Authors	Wasim, Syed Talal; Suleman, Hamid; Zatsarynna, Olga; Naseer, Muzammal; Gall, Juergen
Format	Journal Article
Language	English
Published	14.09.2025
Subjects	Computer Science - Computer Vision and Pattern Recognition
DOI	10.48550/arxiv.2509.11394

Summary:	We present MixANT, a novel architecture for stochastic long-term dense anticipation of human activities. While recent State Space Models (SSMs) like Mamba have shown promise through input-dependent selectivity on three key parameters, the critical forget-gate ($\textbf{A}$ matrix) controlling temporal memory remains static. We address this limitation by introducing a mixture of experts approach that dynamically selects contextually relevant $\textbf{A}$ matrices based on input features, enhancing representational capacity without sacrificing computational efficiency. Extensive experiments on the 50Salads, Breakfast, and Assembly101 datasets demonstrate that MixANT consistently outperforms state-of-the-art methods across all evaluation settings. Our results highlight the importance of input-dependent forget-gate mechanisms for reliable prediction of human behavior in diverse real-world scenarios.
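The summary's central idea, choosing the forget-gate $\textbf{A}$ matrix from a set of experts based on the input, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the record does not describe the routing design, so the soft (softmax) routing, the diagonal form of $\textbf{A}$, the fixed step size `delta`, and all names (`mixture_a_scan`, `A_experts`, `W_router`) are assumptions made for illustration only.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def mixture_a_scan(x, A_experts, W_router, B, C, delta=0.1):
    """Toy diagonal SSM whose decay (forget gate) depends on the input.

    Hypothetical shapes (assumptions, not taken from the paper):
      x         (T, D)  input features per time step
      A_experts (K, N)  K candidate diagonal A vectors, all entries negative
      W_router  (D, K)  linear router producing expert scores
      B         (D, N)  input projection
      C         (N,)    output projection
    """
    T, _ = x.shape
    _, N = A_experts.shape

    # Route each time step to a soft mixture of experts (assumed soft routing).
    gates = softmax(x @ W_router)      # (T, K), rows sum to 1
    A_t = gates @ A_experts            # (T, N), input-dependent diagonal A

    # Discretize: negative A gives a per-step decay strictly between 0 and 1.
    decay = np.exp(delta * A_t)        # (T, N)

    h = np.zeros(N)
    ys = np.empty(T)
    for t in range(T):
        # Memory propagation: how much past state survives is set by decay[t].
        h = decay[t] * h + delta * (x[t] @ B)
        ys[t] = h @ C
    return ys, gates, decay


rng = np.random.default_rng(0)
T, D, K, N = 6, 4, 3, 8
x = rng.normal(size=(T, D))
A_experts = -np.exp(rng.normal(size=(K, N)))   # keep A negative for stability
W_router = rng.normal(size=(D, K))
B = rng.normal(size=(D, N))
C = rng.normal(size=N)

ys, gates, decay = mixture_a_scan(x, A_experts, W_router, B, C)
```

With a single expert (`K = 1`) the gate is always 1 and the decay is identical at every time step, which corresponds to the static $\textbf{A}$ that the summary identifies as the limitation. With several experts the decay varies with the input at each step.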