Abstract
Combining Convolutional Neural Networks (CNNs) or Vision Transformers (ViTs) with Recurrent Neural Networks (RNNs) for spatiotemporal forecasting has yielded unparalleled results in predicting temporal and spatial dynamics. However, modeling extensive global information remains a formidable challenge; CNNs are limited by their narrow receptive fields, and ViTs struggle with the intensive computational demands of their attention mechanisms. The emergence of recent Mamba-based architectures has been met with enthusiasm for their exceptional long-sequence modeling capabilities, surpassing established vision models in efficiency and accuracy, which motivates us to develop an innovative architecture tailored for spatiotemporal forecasting. In this paper, we propose the VMRNN cell, a new recurrent unit that integrates the strengths of Vision Mamba blocks with LSTM. We construct a network centered on VMRNN cells to tackle spatiotemporal prediction tasks effectively. Our extensive evaluations show that our proposed approach secures competitive results on a variety of tasks while maintaining a smaller model size. Our code is available at https://github.com/yyyujintang/VMRNN-PyTorch.
Original language | English |
---|---|
Title of host publication | Proceedings - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024 |
Publisher | IEEE |
Pages | 5663-5673 |
Number of pages | 11 |
ISBN (Electronic) | 9798350365474 |
ISBN (Print) | 9798350365481 |
Publication status | Published - Jun 2024 |
Event | 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024, Seattle Convention Center, Seattle, United States. Duration: 16 Jun 2024 → 22 Jun 2024. Links: https://ieeexplore.ieee.org/xpl/conhome/10677511/proceeding (conference proceeding); https://cvpr.thecvf.com/virtual/2024/index.html (conference website); https://cvpr.thecvf.com/virtual/2024/calendar (conference schedule) |
Publication series
Name | IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops |
---|---|
ISSN (Print) | 2160-7508 |
ISSN (Electronic) | 2160-7516 |
Conference
Conference | 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024 |
---|---|
Country/Territory | United States |
City | Seattle |
Period | 16/06/24 → 22/06/24 |
Scopus Subject Areas
- Computer Vision and Pattern Recognition
- Electrical and Electronic Engineering