A key step in intelligently analyzing and understanding video content is to accurately perceive the motion of the objects of interest. Object tracking, which aims to determine the position and status of an object of interest across consecutive video frames, is therefore an important task and has received great research interest in the last decade. Although numerous algorithms have been proposed for object tracking in RGB videos, most of them may fail when the information from the RGB video is unreliable (e.g., in dim environments or under large illumination changes). To address this issue, and motivated by the popularity of dual-camera systems that capture both RGB and infrared videos, this paper presents a feature representation and fusion model that combines the feature representations of the object in the RGB and infrared modalities for object tracking. Specifically, the proposed model is able to (1) represent the features of objects in each modality by exploiting the robustness of sparse representation, and (2) combine these representations by exploiting the correlation between modalities. Extensive experiments demonstrate the effectiveness of the proposed method.
Scopus Subject Areas
- Signal Processing
- Computer Vision and Pattern Recognition
- Artificial Intelligence