Semi-Supervised and Long-Tailed Object Detection with CascadeMatch

Yuhang Zang, Kaiyang Zhou, Chen Huang, Chen Change Loy*

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

7 Citations (Scopus)

Abstract

This paper focuses on long-tailed object detection in the semi-supervised learning setting, which poses realistic challenges but has rarely been studied in the literature. We propose a novel pseudo-labeling-based detector called CascadeMatch. Our detector features a cascade network architecture with multi-stage detection heads and progressive confidence thresholds. To avoid manually tuning the thresholds, we design a new adaptive pseudo-label mining mechanism that automatically identifies suitable values from data. To mitigate confirmation bias, where a model is negatively reinforced by the incorrect pseudo-labels it produces, each detection head is trained with the ensemble pseudo-labels of all detection heads. Experiments on two long-tailed datasets, LVIS and COCO-LT, demonstrate that CascadeMatch surpasses existing state-of-the-art semi-supervised approaches in handling long-tailed object detection across a wide range of detection architectures. For instance, CascadeMatch outperforms Unbiased Teacher by 1.9 AP^Fix on LVIS when using a ResNet50-based Cascade R-CNN structure, and by 1.7 AP^Fix when using Sparse R-CNN with a Transformer encoder. We also show that CascadeMatch can even handle the challenging sparsely annotated object detection problem. Code: https://github.com/yuhangzang/CascadeMatch.
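
As a rough illustration of the two ideas named in the abstract (ensemble pseudo-labels shared by all heads, and confidence thresholds that tighten from stage to stage), the following is a minimal sketch and not the authors' implementation. It assumes the ensemble is a plain average of per-class scores across heads, which the abstract does not specify. The function names and the fixed values in `STAGE_THRESHOLDS` are hypothetical; the paper's adaptive mining mechanism, which derives thresholds from data, is not reproduced here.

```python
from typing import List, Optional, Tuple

# Hypothetical, hand-picked thresholds that grow stricter per cascade stage.
# The paper derives its thresholds adaptively from data; that is not shown.
STAGE_THRESHOLDS = [0.5, 0.6, 0.7]


def ensemble_scores(head_scores: List[List[float]]) -> List[float]:
    """Average the per-class scores that each detection head gives one box.

    Averaging is an assumption made for this sketch.
    """
    num_heads = len(head_scores)
    num_classes = len(head_scores[0])
    return [
        sum(scores[c] for scores in head_scores) / num_heads
        for c in range(num_classes)
    ]


def pseudo_label(
    head_scores: List[List[float]], threshold: float
) -> Optional[Tuple[int, float]]:
    """Return (class_index, confidence) if the ensemble is confident enough."""
    avg = ensemble_scores(head_scores)
    best = max(range(len(avg)), key=avg.__getitem__)
    if avg[best] >= threshold:
        return best, avg[best]
    return None  # box is not used as a pseudo-label at this stage


def labels_per_stage(
    head_scores: List[List[float]],
    thresholds: List[float] = STAGE_THRESHOLDS,
) -> List[Optional[Tuple[int, float]]]:
    """Apply each stage's threshold to the same ensemble prediction."""
    return [pseudo_label(head_scores, t) for t in thresholds]


# One unlabeled box scored by three heads over two classes.
box_scores = [[0.70, 0.30], [0.60, 0.40], [0.65, 0.35]]
stage_labels = labels_per_stage(box_scores)
```

In this toy input the ensemble confidence for class 0 is about 0.65, so the box passes the first two hypothetical thresholds and is rejected by the third, which shows how later stages train on fewer but more confident pseudo-labels.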

Original language: English
Pages (from-to): 987-1001
Number of pages: 15
Journal: International Journal of Computer Vision
Volume: 131
Early online date: 6 Jan 2023
Publication status: Published - Apr 2023

Scopus Subject Areas

  • Software
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence

User-Defined Keywords

  • Object detection
  • Long-tailed learning
  • Semi-supervised learning
