Understanding Global Structure Relation via Reversible Visual State Space Model for Robust Cross-View Geo-Localization

  • Peiyuan Ma
  • Yimin Fu*
  • Jialin Lyu
  • Zhunga Liu
  • *Corresponding author for this work

Research output: Chapter in book/report/conference proceeding; Conference proceeding; peer-review

Abstract

Cross-view geo-localization aims to match images of the same geographic region captured from different views. Existing methods typically determine spatial correlations between cross-view images according to the similarity of representations extracted from salient areas. However, such local appearance representations fail to capture the underlying structural relationships among the corresponding regions, which severely undermines the reliability of localization results in complex scenes. To address this problem, we propose a reversible visual state-space model to enhance the understanding of global structural relations inherent in images captured from different views. Specifically, we design a progressive spatial analysis approach, which incrementally integrates geometric dependencies exploited at different levels to improve the understanding of the global structure. Moreover, we introduce a reversible rotational scanning mechanism based on the 2D-selective-scan (SS2D) module to facilitate the exploitation of geometric dependencies between cross-view images. Finally, we adopt a cross-dimension interaction strategy to enrich the informativeness of representations in the common space, thereby reinforcing the discriminability of cross-view representations between different regions. Extensive experiments on the University-1652 and University160k-WX datasets demonstrate that the proposed method achieves state-of-the-art performance while maintaining robustness under complex environmental conditions.
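
The rotational scanning idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation and it does not reproduce the SS2D module; it only shows, under our own assumptions, how a 2D grid of patch features might be unrolled into sequences along four rotated scan orders and then mapped back without loss, which is one plausible reading of "reversible". The function names `rotational_scan` and `inverse_rotational_scan` are hypothetical.

```python
import numpy as np


def rotational_scan(grid: np.ndarray, k: int) -> np.ndarray:
    """Rotate an (H, W, C) patch grid by k quarter turns and flatten it
    row-major into an (H*W, C) sequence. Hypothetical helper."""
    rotated = np.rot90(grid, k=k, axes=(0, 1))
    return rotated.reshape(-1, grid.shape[-1])


def inverse_rotational_scan(seq: np.ndarray, k: int, height: int, width: int) -> np.ndarray:
    """Undo rotational_scan and restore the original (H, W, C) grid."""
    # After an odd number of quarter turns the rotated grid is (W, H, C).
    h, w = (width, height) if k % 2 else (height, width)
    rotated = seq.reshape(h, w, seq.shape[-1])
    return np.rot90(rotated, k=-k, axes=(0, 1))


# Toy 2 x 3 grid with one feature channel (assumed sizes, for illustration only).
grid = np.arange(6).reshape(2, 3, 1)

# One sequence per rotation; a sequence model would process each of these.
sequences = [rotational_scan(grid, k) for k in range(4)]

# Mapping each sequence back recovers the original spatial layout.
restored = [inverse_rotational_scan(s, k, 2, 3) for k, s in enumerate(sequences)]
assert all(np.array_equal(r, grid) for r in restored)
```

In this sketch each rotation is a permutation of the patch positions, so no information is lost and the outputs of the four scans can be realigned on the original grid before being merged.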
Original language: English
Title of host publication: UAVM 2025 - Proceedings of the 3rd International Workshop on UAVs in Multimedia
Subtitle of host publication: Capturing the World from a New Perspective, Co-located with MM 2025
Place of Publication: New York, NY, USA
Publisher: Association for Computing Machinery (ACM)
Pages: 42–46
Number of pages: 5
ISBN (Electronic): 9798400718397
Publication status: Published - 31 Oct 2025

Publication series

Name: MM: International Multimedia Conference
Publisher: Association for Computing Machinery

User-Defined Keywords

  • Cross-view Geo-localization
  • Structure Relation
  • State Space Model
