The development of autonomous mobile robots (AMRs) has brought with it new requirements for intelligence and safety. Human action recognition (HAR) for AMRs has become increasingly important because it provides interactive cognition between humans and the AMR. This study presents a full architecture for edge-artificial-intelligence HAR (Edge-AI HAR) that allows an AMR to detect human actions in real time. The architecture consists of three parts: a human detection and tracking network, a key frame extraction function, and a HAR network. The HAR network is a cascade of a DenseNet121 and a double-layer bidirectional long short-term memory (DLBiLSTM) network: the pretrained DenseNet121 extracts spatial features from action key frames, and the DLBiLSTM performs deep, two-directional LSTM inference to classify complicated time-series human actions. Edge-AI HAR undergoes two optimizations, ROS distributed computation and TensorRT structure optimization, which yield a small model structure and high computational efficiency. Edge-AI HAR is evaluated in two experiments using an AMR and achieves an average precision of 97.58% for single action recognition and around 86% for continuous action recognition.