Despite the great success of the existing work in fine-grained visual categorization (FGVC), there are still several unsolved challenges, e.g., poor interpretation and vagueness contribution. To circumvent this drawback, motivated by the hypersphere embedding method, we propose a discriminative suprasphere embedding (DSE) framework, which can provide intuitive geometric interpretation and effectively extract discriminative features. Specifically, DSE consists of three modules. The first module is a suprasphere embedding (SE) block, which learns discriminative information by emphasizing weight and phase. The second module is a phase activation map (PAM) used to analyze the contribution of local descriptors to the suprasphere feature representation, which uniformly highlights the object region and exhibits remarkable object localization capability. The last module is a class contribution map (CCM), which quantitatively analyzes the network classification decision and provides insight into the domain knowledge about classified objects. Comprehensive experiments on three benchmark datasets demonstrate the effectiveness of our proposed method in comparison with state-of-the-art methods.
|Number of pages
|IEEE Transactions on Neural Networks and Learning Systems
|E-pub ahead of print - 15 Sept 2022
- Deep hypersphere embedding
- discriminative localization
- fine-grained visual categorization (FGVC)
- weakly supervised learning