TY - GEN
T1 - Improved Spatio-Temporal Convolutional Neural Networks for Traffic Police Gestures Recognition
AU - Wu, Zhixuan
AU - Ma, Nan
AU - Cheung, Yiu Ming
AU - Li, Jiahong
AU - He, Qin
AU - Yao, Yongqiang
AU - Zhang, Guoping
N1 - Funding Information:
We sincerely thank the anonymous reviewers for their constructive suggestions. This work was supported in part by grants from the National Natural Science Foundation of China (No. 61871038, No. 61931012), the Beijing Natural Science Foundation (No. 4182022), the Premium Funding Project for Academic Human Resources Development in Beijing Union University (No. BPHR2020AZ02), and the Key Projects of the National Social Science Fund (No. 19AGL025).
PY - 2020/11
Y1 - 2020/11
N2 - In the era of artificial intelligence, human action recognition is a hot topic in vision research and makes interaction between humans and machines possible. Many intelligent applications benefit from human action recognition. Traditional traffic police gesture recognition methods often ignore spatial and temporal information, which limits their timeliness in human-computer interaction. We propose a Spatio-Temporal Convolutional Neural Network (ST-CNN) that detects and identifies traffic police gestures by exploiting the correlation between spatial and temporal features. Specifically, we use a convolutional neural network for feature extraction, taking into account both the spatial and temporal characteristics of human actions. After the spatial and temporal features are extracted, an improved LSTM network fuses, classifies, and recognizes them to achieve human action recognition. The method makes full use of the spatial and temporal information in the video and selects effective features to reduce the computational load of the network. Extensive experiments on a Chinese traffic police gesture dataset demonstrate the superiority of our method.
AB - In the era of artificial intelligence, human action recognition is a hot topic in vision research and makes interaction between humans and machines possible. Many intelligent applications benefit from human action recognition. Traditional traffic police gesture recognition methods often ignore spatial and temporal information, which limits their timeliness in human-computer interaction. We propose a Spatio-Temporal Convolutional Neural Network (ST-CNN) that detects and identifies traffic police gestures by exploiting the correlation between spatial and temporal features. Specifically, we use a convolutional neural network for feature extraction, taking into account both the spatial and temporal characteristics of human actions. After the spatial and temporal features are extracted, an improved LSTM network fuses, classifies, and recognizes them to achieve human action recognition. The method makes full use of the spatial and temporal information in the video and selects effective features to reduce the computational load of the network. Extensive experiments on a Chinese traffic police gesture dataset demonstrate the superiority of our method.
KW - Artificial intelligence
KW - Human action recognition
KW - Improved LSTM network
KW - Spatio-Temporal feature
KW - Traffic police gesture
UR - http://www.scopus.com/inward/record.url?scp=85105345235&partnerID=8YFLogxK
U2 - 10.1109/CIS52066.2020.00032
DO - 10.1109/CIS52066.2020.00032
M3 - Conference proceeding
AN - SCOPUS:85105345235
T3 - Proceedings - 2020 16th International Conference on Computational Intelligence and Security, CIS 2020
SP - 109
EP - 115
BT - Proceedings - 2020 16th International Conference on Computational Intelligence and Security, CIS 2020
PB - IEEE
T2 - 16th International Conference on Computational Intelligence and Security, CIS 2020
Y2 - 27 November 2020 through 30 November 2020
ER -