TY - JOUR
T1 - Frequency Feature Pyramid Network With Global-Local Consistency Loss for Crowd-and-Vehicle Counting in Congested Scenes
AU - Yu, Xiaoyuan
AU - Liang, Yanyan
AU - Lin, Xuxin
AU - Wan, Jun
AU - Wang, Tian
AU - Dai, Hong Ning
N1 - Funding information:
This work was supported in part by the National Key Research and Development Plan under Grant 2021YFE0205700; in part by the External Cooperation Key Project of Chinese Academy of Sciences under Grant 173211KYSB20200002; in part by the Chinese National Natural Science Foundation Project under Grant 61876179 and Grant 61961160704; in part by the Science and Technology Development Fund of Macau under Grant 0008/2019/A1, Grant 0010/2019/AFJ, Grant 0025/2019/AKP, Grant 0004/2020/A1, and Grant 0070/2021/AMJ; in part by the Guangdong Provincial Key Research and Development Programme under Grant 2019B010148001; in part by the National Natural Science Foundation of China (NSFC) under Grant 62172046; and in part by the Special Project of Guangdong Provincial Department of Education in Key Fields of Colleges and Universities under Grant 2021ZDZX1063.
Publisher Copyright:
© 2022 IEEE.
PY - 2022/7
Y1 - 2022/7
N2 - Context prediction plays a crucial role in implementing autonomous driving applications. As one of the important context-prediction tasks, crowd-and-vehicle counting is critical for achieving real-time traffic and crowd analysis, consequently facilitating decision-making processes for autonomous vehicles. However, crowd-and-vehicle counting also faces challenges, such as large-scale variations, imbalanced data distribution, and insufficient local patterns. To tackle these challenges, we put forth a novel frequency feature pyramid network (FFPNet) in this paper. Our proposed FFPNet extracts multi-scale information with a frequency feature pyramid module, which can tackle the issue of large-scale variations. Meanwhile, the frequency feature pyramid module uses different frequency branches to obtain information at different scales. We also adopt an attention mechanism to strengthen the extraction of information at different scales. Moreover, we devise a novel loss function, namely global-local consistency loss, to address the existing problems of imbalanced data distribution and insufficient local patterns. Furthermore, we conduct extensive experiments on six datasets to evaluate our proposed FFPNet. It is worth mentioning that we also construct a novel crowd-and-vehicle dataset (CROVEH), which is the only dataset that contains both crowd and vehicle annotations. The experimental results show that FFPNet achieves the best performance on different backbones, e.g., 52.69 mean absolute error (MAE) on P2PNet with the FFP module. The code is available at: https://github.com/MUST-AI-Lab/FFPNet.
AB - Context prediction plays a crucial role in implementing autonomous driving applications. As one of the important context-prediction tasks, crowd-and-vehicle counting is critical for achieving real-time traffic and crowd analysis, consequently facilitating decision-making processes for autonomous vehicles. However, crowd-and-vehicle counting also faces challenges, such as large-scale variations, imbalanced data distribution, and insufficient local patterns. To tackle these challenges, we put forth a novel frequency feature pyramid network (FFPNet) in this paper. Our proposed FFPNet extracts multi-scale information with a frequency feature pyramid module, which can tackle the issue of large-scale variations. Meanwhile, the frequency feature pyramid module uses different frequency branches to obtain information at different scales. We also adopt an attention mechanism to strengthen the extraction of information at different scales. Moreover, we devise a novel loss function, namely global-local consistency loss, to address the existing problems of imbalanced data distribution and insufficient local patterns. Furthermore, we conduct extensive experiments on six datasets to evaluate our proposed FFPNet. It is worth mentioning that we also construct a novel crowd-and-vehicle dataset (CROVEH), which is the only dataset that contains both crowd and vehicle annotations. The experimental results show that FFPNet achieves the best performance on different backbones, e.g., 52.69 mean absolute error (MAE) on P2PNet with the FFP module. The code is available at: https://github.com/MUST-AI-Lab/FFPNet.
KW - Context prediction
KW - discrete cosine transformation
KW - frequency feature pyramid
KW - global-local consistency loss
UR - http://www.scopus.com/inward/record.url?scp=85132765567&partnerID=8YFLogxK
U2 - 10.1109/TITS.2022.3178848
DO - 10.1109/TITS.2022.3178848
M3 - Journal article
SN - 1524-9050
VL - 23
SP - 9654
EP - 9664
JO - IEEE Transactions on Intelligent Transportation Systems
JF - IEEE Transactions on Intelligent Transportation Systems
IS - 7
ER -