A vision-based human action recognition system for moving cameras through deep learning

Ming Jen Chang, Jih Tang Hsieh*, Chiung Yao Fang, Sei Wang Chen

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Citations (Scopus)

Abstract

This study presents a vision-based human action recognition system that uses deep learning. The system recognizes human actions while the robot's camera is moving toward the target person from various directions, which makes the proposed method suitable for the vision systems of indoor mobile robots. The system uses three types of information to recognize human actions: color videos, optical flow videos, and depth videos. First, a Kinect 2.0 captures color and depth videos simultaneously using its RGB camera and depth sensor. Second, histogram of oriented gradients (HOG) features are extracted from the color videos, and a support vector machine is used to detect the human region. Based on the detected human region, the frames of the color video are cropped and the corresponding frames of the optical flow video are obtained using the Farnebäck method (https://docs.opencv.org/3.4/d4/dee/tutorial-optical-flow.html). The number of frames in these videos is then unified using a frame sampling technique. Subsequently, the three types of videos are input separately into three modified 3D convolutional neural networks (3D CNNs), which extract the spatiotemporal features of human actions and recognize them. Finally, the three recognition results are integrated to produce the final recognition result. The proposed system can recognize 13 types of human actions: drink (sit), drink (stand), eat (sit), eat (stand), read, sit down, stand up, use a computer, walk (horizontal), walk (straight), play with a phone/tablet, walk away from each other, and walk toward each other. The average recognition rate on 369 test human action videos was 96.4%, indicating that the proposed system is robust and efficient.
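
The abstract does not say which frame sampling technique or which rule for integrating the three recognition results the authors used. As a rough illustration only, the sketch below assumes uniform temporal sampling and simple averaging of per-class scores across the color, optical flow, and depth streams. The function names `sample_frame_indices` and `fuse_scores` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def sample_frame_indices(num_frames: int, target: int) -> List[int]:
    """Pick `target` evenly spaced frame indices from a clip of `num_frames` frames.

    Assumption: uniform sampling. Clips shorter than `target` repeat frames,
    longer clips skip frames, so every clip ends up with the same length.
    """
    if num_frames <= 0 or target <= 0:
        raise ValueError("num_frames and target must be positive")
    # Take the frame at the centre of each of `target` equal-length segments.
    return [
        min(num_frames - 1, int((i + 0.5) * num_frames / target))
        for i in range(target)
    ]


def fuse_scores(stream_scores: Sequence[Sequence[float]]) -> int:
    """Average per-class scores over all streams and return the best class index.

    Assumption: late fusion by unweighted averaging; the paper may use a
    different integration rule.
    """
    if not stream_scores:
        raise ValueError("at least one stream is required")
    num_classes = len(stream_scores[0])
    if any(len(scores) != num_classes for scores in stream_scores):
        raise ValueError("all streams must score the same number of classes")
    averaged = [
        sum(scores[c] for scores in stream_scores) / len(stream_scores)
        for c in range(num_classes)
    ]
    return max(range(num_classes), key=averaged.__getitem__)
```

In this sketch, each of the three streams would be resampled to the same number of frames before being passed to its network, and `fuse_scores` would then be called with one score vector per stream (13 entries each for the 13 action classes).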

Original language: English
Title of host publication: Proceedings of 2019 2nd International Conference on Signal Processing and Machine Learning, SPML 2019
Publisher: Association for Computing Machinery
Pages: 85-91
Number of pages: 7
ISBN (Electronic): 9781450372213
DOIs
Publication status: Published - 27 Nov 2019
Event: 2nd International Conference on Signal Processing and Machine Learning, SPML 2019 - Hangzhou, China
Duration: 27 Nov 2019 to 29 Nov 2019

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 2nd International Conference on Signal Processing and Machine Learning, SPML 2019
Country/Territory: China
City: Hangzhou
Period: 27 Nov 2019 to 29 Nov 2019

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
