A deep reinforcement learning algorithm to control a two-wheeled scooter with a humanoid robot

Jacky Baltes, Guilherme Christmann, Saeed Saeedvand*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)


Balancing a two-wheeled scooter is considered a challenging task for robots, as it is a non-linear control problem in a highly dynamic environment. The rapid pace of development of deep reinforcement learning has enabled robots to perform complex control tasks. In this paper, a deep reinforcement learning algorithm is proposed to learn the steering control of the scooter for balancing and path tracking using an unmodified humanoid robot. Two control strategies are developed, analyzed, and compared: a classical Proportional–Integral–Derivative (PID) controller and a Deep Reinforcement Learning (DRL) controller based on the Proximal Policy Optimization (PPO) algorithm. The ability of the robot to balance the scooter using both approaches is extensively evaluated. Challenging control scenarios are tested at low scooter speeds of 2.5, 5, and 10 km/h, with steering velocities of 10, 20, and 40 rad/s. The evaluations include upright balance without disturbances, upright balance under disturbances, tracking a sinusoidal path, and path tracking. A 3D model of the humanoid robot and scooter system is developed and simulated in a state-of-the-art GPU-based simulation environment (NVIDIA's Isaac Gym), which serves as a training and test bed. Although the PID controller successfully balances the robot, the proposed DRL controller achieves better final results. The results indicate a 52% average improvement across the tested speeds, along with better path tracking performance. Evaluating the controller commands on the real robot and scooter indicates that the robot is fully capable of realizing the required steering velocities.
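
As an illustration of the classical baseline named in the abstract, the following is a minimal sketch of a discrete-time PID controller whose output is clamped to a maximum steering velocity. It is not the authors' implementation: the class name, the gains, the choice of error signal (for example, the scooter's roll angle relative to upright), and the time step are all assumptions made for illustration. Only the idea of a PID controller and the steering velocity limits (10, 20, and 40 rad/s) come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Hypothetical discrete-time PID controller with output saturation."""

    kp: float          # proportional gain (assumed value, not from the paper)
    ki: float          # integral gain (assumed value, not from the paper)
    kd: float          # derivative gain (assumed value, not from the paper)
    out_limit: float   # maximum |steering velocity| in rad/s
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        """Return a steering velocity command for the current error."""
        # Accumulate the error over time for the integral term.
        self.integral += error * dt
        # Finite-difference estimate of the error's rate of change.
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        u = self.kp * error + self.ki * self.integral + self.kd * derivative
        # Clamp the command to the allowed steering velocity range.
        return max(-self.out_limit, min(self.out_limit, u))


# Example: a proportional-only controller limited to 10 rad/s.
pid = PID(kp=2.0, ki=0.0, kd=0.0, out_limit=10.0)
command = pid.step(error=1.0, dt=0.01)
```

A learned PPO policy would replace the fixed `step` rule with a neural network mapping observations to steering commands; the saturation limit would still apply to whatever command is produced.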

Original language: English
Article number: 106941
Journal: Engineering Applications of Artificial Intelligence
Publication status: Published - 2023 Nov


  • Deep reinforcement learning
  • Humanoid robotics
  • PID control
  • Proximal policy optimization (PPO)
  • Two-wheeled vehicles

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Artificial Intelligence
  • Electrical and Electronic Engineering

