TY - JOUR
T1 - Development of a Mimic Robot-Learning from Demonstration Incorporating Object Detection and Multiaction Recognition
AU - Hwang, Pin Jui
AU - Hsu, Chen Chien
AU - Wang, Wei Yen
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2020/5
Y1 - 2020/5
AB - With the rise of the DIY movement and the maker economy, demand for low-volume automation applications is expected to grow. Various approaches have been proposed that move from low-level coding toward more intuitive methods of instructing robots to perform tasks. In this article, a vision-based learning from demonstration (LfD) system is proposed, in which a mimic robot learns from human demonstrations to operate a coffee maker. Using two cameras as sensing devices, the proposed LfD system integrates object detection and multiaction recognition to build an action base. Object detection is performed by the You Only Look Once (YOLO) deep learning architecture, while multiaction recognition is carried out by an Inflated 3D ConvNet (I3D) deep learning network followed by a proposed statistically fragmented approach that enhances the action recognition results. Based on the sequential order of actions in the action base, tasks demonstrated by humans can be reproduced by the mimic robot. To validate the proposed approach, operation of a coffee maker is adopted as an example; the proposed system is capable of reproducing the tasks demonstrated by a human under different circumstances, without requiring knowledge of computer programming or robotics.
UR - http://www.scopus.com/inward/record.url?scp=85083273323&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85083273323&partnerID=8YFLogxK
U2 - 10.1109/MCE.2019.2956202
DO - 10.1109/MCE.2019.2956202
M3 - Article
AN - SCOPUS:85083273323
SN - 2162-2248
VL - 9
SP - 79
EP - 87
JO - IEEE Consumer Electronics Magazine
JF - IEEE Consumer Electronics Magazine
ER -