In each demonstration, the sequence of tasks and their skill parameters is extracted from the human demonstration. Every robot shares this task sequence and reproduces it identically, regardless of the embodiment. Every video of robot execution is played at three times speed.
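To make the shared task-sequence idea concrete, here is a minimal sketch of how such a sequence might be represented and replayed on any robot. The skill names, parameters, and the `robot.run_skill` interface are illustrative assumptions, not the actual API used in this work.

```python
# Hypothetical representation of a demonstrated task sequence: an ordered
# list of (skill name, skill parameters) pairs extracted from the human
# demonstration. All names and values below are illustrative.
task_sequence = [
    ("approach_handle", {"target_pose": [0.42, 0.10, 0.35]}),
    ("grasp_handle",    {"grip_width": 0.03}),
    ("open_drawer",     {"est_open_dir": [1.0, 0.0, 0.0], "distance": 0.20}),
]

def execute(robot, task_sequence):
    """Replay the same demonstrated sequence on any robot;
    only the low-level controller differs per embodiment."""
    for skill_name, params in task_sequence:
        robot.run_skill(skill_name, **params)
```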
The skills use a state representation that suffers less from the sim-to-real gap. For example, the force direction, rather than the force itself, is used as the state, because the force magnitude changes easily with the physical properties of the drawer, such as its mass and friction coefficient. As a result, the robot can successfully open both an empty drawer and a loaded one.
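A minimal sketch of this state design, assuming the contact force is read as a 3D vector from a wrist force/torque sensor: only the unit direction enters the state, so the magnitude, which depends on the drawer's mass and friction, is discarded.

```python
import numpy as np

def force_direction_state(force: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Map a measured 3D contact force to its unit direction.

    The force magnitude depends on physical properties of the drawer
    (mass, friction coefficient) that differ between simulation and
    reality, while the direction transfers much more reliably.
    """
    norm = np.linalg.norm(force)
    if norm < eps:
        # Negligible contact force: no informative direction to report.
        return np.zeros_like(force)
    return force / norm
```

For instance, `force_direction_state(np.array([3.0, 0.0, 4.0]))` yields `[0.6, 0.0, 0.8]`, and the same direction is returned whether the drawer is empty or heavily loaded.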
The RL method works robustly even when the estimated opening direction is inaccurate. With the manually programmed motion, the robot pulls the drawer along the estimated opening direction without force feedback. With the RL agent, the robot modifies the pulling direction following the policy's output. Even when the estimate is highly inaccurate, the RL agent adjusts the motion and successfully opens the drawer.
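The contrast between the two control schemes can be sketched as below. The `robot` and `policy` interfaces (`move_ee`, `read_wrist_force`, `act`) are hypothetical, and the per-step correction rule is a simplified stand-in for the actual RL policy.

```python
import numpy as np

def open_drawer_programmed(robot, est_dir, step=0.01, n_steps=50):
    """Manually programmed baseline: pull strictly along the estimated
    opening direction with no force feedback, so an inaccurate estimate
    is never corrected."""
    est_dir = est_dir / np.linalg.norm(est_dir)
    for _ in range(n_steps):
        robot.move_ee(step * est_dir)

def open_drawer_rl(robot, policy, est_dir, step=0.01, n_steps=50):
    """RL control: at each step the policy observes the current pulling
    direction and the measured force direction, and outputs a correction
    to the pulling direction, so the motion adapts online."""
    direction = est_dir / np.linalg.norm(est_dir)
    for _ in range(n_steps):
        force = robot.read_wrist_force()  # 3D force from the F/T sensor
        f_dir = force / max(np.linalg.norm(force), 1e-6)
        correction = policy.act(np.concatenate([direction, f_dir]))
        direction = direction + correction
        direction /= np.linalg.norm(direction)  # keep a unit vector
        robot.move_ee(step * direction)
```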