Teaching Bittle how to walk with Reinforcement Learning (RL)
Can a quadruped robot learn to walk without being explicitly programmed how to?
In previous articles, we talked about the quadruped robot Bittle and the many interesting areas where it has been used for research and exploration. Now, let’s examine how Bittle can be taught to perform tricks… in this particular example, to walk.
How does Bittle move?
Bittle is a robotic dog with 9 degrees of freedom: 8 leg servos (2 per leg) and one servo on the neck. Teaching Bittle how to walk means programming the movements of these servos in a highly synchronized manner. The movements are programmed in Bittle’s firmware, and a coordinated sequence of movements that accomplishes one specific task is known as a gait. As an example, the code that allows Bittle to perform different movements, such as walking, can be seen here.
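To make the idea of a gait concrete, here is a minimal sketch of how one can be represented in software: a repeating table of target angles for the eight leg servos. The joint ordering, angle values, and the `set_servo_angles` helper below are illustrative placeholders, not Bittle’s actual firmware tables.

```python
import time

# Order assumed here: [front-left shoulder, front-left knee,
#                      front-right shoulder, front-right knee,
#                      back-left shoulder,  back-left knee,
#                      back-right shoulder, back-right knee]
# Angles are in degrees and purely illustrative.
WALK_GAIT_FRAMES = [
    [30, -10, 10,  20, 10,  20, 30, -10],
    [20,   0, 20,  10, 20,  10, 20,   0],
    [10,  20, 30, -10, 30, -10, 10,  20],
    [20,  10, 20,   0, 20,   0, 20,  10],
]

def play_gait(set_servo_angles, frames, cycles=5):
    """Step through the gait table, sending one frame of joint targets at a time.

    `set_servo_angles` stands in for whatever function actually drives the
    servos (e.g. commands sent over serial to Bittle's firmware).
    """
    for _ in range(cycles):
        for frame in frames:
            set_servo_angles(frame)   # push eight target angles to the servos
            time.sleep(0.1)           # hold the pose briefly before the next frame
```

Even this toy version shows why hand-crafting gaits is tedious: every frame has to be tuned by hand, and every new task needs a new table.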
However, programming gaits by hand only takes us so far. It takes a lot of trial and error to program a gait. For example, if we wanted to teach Bittle how to climb stairs, we would need to think through how each servo should move and then test the code on actual stairs. It would be much easier if we could develop gaits in a computer simulation and then transfer them to Bittle.
How does Reinforcement Learning Help?
Here is where Reinforcement Learning (RL) comes into the picture. RL attempts to automate the process of reaching a goal by having an agent take actions that change the state of a system so as to maximize the likelihood of reaching that goal. Closeness to the goal is captured by a reward function: it measures how close the current state of the system is to the desired goal, so maximizing the reward means we are approaching the goal. The agent keeps acting on the system to maximize the reward, often stumbling along the way, until the reward stabilizes and, hopefully, the goal is achieved.
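The loop described above (state, action, reward, new state) can be written down in just a few lines. The sketch below uses the standard Gymnasium interface with a stand-in walking environment and a purely random agent; a real Bittle setup would use a physics simulation of the robot and a learned policy instead of random actions.

```python
import gymnasium as gym

# Stand-in environment for illustration; requires gymnasium[box2d].
env = gym.make("BipedalWalker-v3")
observation, info = env.reset(seed=0)

total_reward = 0.0
for step in range(1000):
    # The agent picks an action based on the current state; here we just
    # sample randomly, whereas a trained policy would choose actions that
    # maximize the expected reward.
    action = env.action_space.sample()

    # The environment applies the action and returns the new state plus a
    # reward that measures progress toward the goal (e.g. walking forward).
    observation, reward, terminated, truncated, info = env.step(action)
    total_reward += reward

    if terminated or truncated:          # the robot fell or the episode ended
        observation, info = env.reset()

env.close()
```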
Now, let us examine how RL can be used to train Bittle. There is an excellent YouTube video, with linked code, describing the process, but I found the videos quite technical, and my goal here is mainly to simplify the explanation of what we are trying to do. For readers who are inclined to experiment and learn, I recommend starting with the simulation example here, which demonstrates how a bipedal robot can be taught to walk with the help of Reinforcement Learning, and then going through the videos linked above.
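To give a rough idea of what such an experiment looks like, here is a short sketch that trains a walking policy with an off-the-shelf RL library (stable-baselines3 and the PPO algorithm). This is not the exact setup from the linked videos; it only illustrates how little code is needed to get started, and the training budget below is a placeholder.

```python
import gymnasium as gym
from stable_baselines3 import PPO

env = gym.make("BipedalWalker-v3")

# PPO learns a policy that maps observations (joint angles, velocities, ...)
# to actions (motor torques) by repeatedly maximizing the walking reward.
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=200_000)     # illustrative budget; more is better

# Run the trained policy to see how well the robot walks.
obs, info = env.reset()
for _ in range(1000):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()

env.close()
```

The same pattern carries over to a quadruped: swap in a simulated Bittle environment, define a reward for moving forward without falling, and let the algorithm discover the gait instead of hand-coding it.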