OVST: online video stabilization with two-stage training transformer

Xing Wu*, Yimin Zhu, Han Zhang, Jun Song, Junfeng Yao, Dong Zhu, Quan Qian, Yike Guo

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

Abstract

Video stabilization aims to mitigate or eliminate the shake present in video frames. Existing online video stabilization methods rely on information from future frames, which can introduce lag during real-time stabilization. To overcome this limitation, an online video stabilization model called OVST is proposed, which relies solely on historical video frames and is enhanced by an attention mechanism. To reduce training complexity and improve robustness, a two-stage training strategy decouples the fitting of real camera poses from the stabilization of virtual poses. In addition, a hybrid stabilization loss with interframe soft constraints regulates changes in camera pose between adjacent frames through interframe displacement, angular distortion, and cropping rate, suppressing the distortion caused by excessive pose smoothing while balancing stability against cropping. Experiments demonstrate that OVST outperforms existing state-of-the-art techniques, achieving a stability metric of 0.8878 and a distortion metric of 0.9870.
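The abstract names three interframe soft-constraint terms in the hybrid stabilization loss but gives no formulas. The following is a minimal illustrative sketch of how such a loss might combine them; the pose parametrization, term definitions, and weights are assumptions for illustration, not the paper's actual definitions.

```python
import math

def hybrid_stabilization_loss(poses, weights=(1.0, 1.0, 0.5)):
    """Hedged sketch of a hybrid stabilization loss with interframe
    soft constraints (hypothetical forms, not the paper's formulas).

    poses: list of per-frame camera poses as (tx, ty, angle, scale),
           an assumed similarity-transform parametrization.
    weights: illustrative weights for the displacement, angular,
             and cropping terms.
    """
    w_disp, w_ang, w_crop = weights
    total = 0.0
    # Penalize changes between adjacent frames, as the abstract describes.
    for (tx0, ty0, a0, s0), (tx1, ty1, a1, s1) in zip(poses, poses[1:]):
        disp = math.hypot(tx1 - tx0, ty1 - ty0)  # interframe displacement
        ang = abs(a1 - a0)                       # angular distortion proxy
        crop = max(0.0, 1.0 - min(s0, s1))       # cropping-rate penalty: scale < 1 crops the frame
        total += w_disp * disp + w_ang * ang + w_crop * crop
    return total / max(1, len(poses) - 1)
```

A stationary pose sequence incurs zero loss, while large frame-to-frame translations, rotations, or shrinking scales are penalized, which matches the abstract's stated goal of regulating pose changes between adjacent frames.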

Original language: English
Pages (from-to): 21909-21929
Number of pages: 21
Journal: Neural Computing and Applications
Volume: 37
Issue number: 26
Early online date: 2 Aug 2025
Publication status: Published - Sept 2025

User-Defined Keywords

  • Attention mechanism
  • Image processing
  • Training strategy
  • Video stabilization
