Learning-from-Observation (LfO)



What is LfO?

From demonstration to robot execution

For each demonstration, a sequence of tasks with skill parameters is extracted from the human demonstration. Every robot shares this task sequence and reproduces it regardless of the hardware. Every video of robot execution is played at three times normal speed.
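A minimal sketch of how such a robot-agnostic task sequence might be represented and dispatched to per-robot skill libraries. The task names, parameters, and skill-library interface below are illustrative assumptions, not the project's actual API:

    # Hypothetical sketch: a demonstration yields a robot-agnostic task
    # sequence; each robot provides its own skill library that maps the
    # same task names onto its hardware.
    task_sequence = [
        {"task": "grasp",   "params": {"target": "drawer_handle"}},
        {"task": "pull",    "params": {"direction": [1.0, 0.0, 0.0], "distance_m": 0.15}},
        {"task": "release", "params": {}},
    ]

    class FetchParallelSkills:
        """Skill library for one robot; Nextage-Shadow would supply its
        own class implementing the same task names."""
        def grasp(self, target):
            print(f"[Fetch-Parallel] grasp {target}")
        def pull(self, direction, distance_m):
            print(f"[Fetch-Parallel] pull along {direction} for {distance_m} m")
        def release(self):
            print("[Fetch-Parallel] release")

    def execute(sequence, skills):
        # The same sequence runs on any robot whose library implements the tasks.
        for step in sequence:
            getattr(skills, step["task"])(**step["params"])

    execute(task_sequence, FetchParallelSkills())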


[Videos: four demonstrations, each followed by the corresponding robot executions on Nextage-Shadow, Fetch-Shadow, and Fetch-Parallel; the fourth demonstration is reproduced on Nextage-Shadow and Fetch-Shadow.]

Other implemented skill libraries


[Videos: two demonstrations, each reproduced on Fetch-Parallel.]

Robust skill operation based on sensor feedback

The skills use state representations that suffer less from the sim-to-real gap. For example, the force direction, rather than the force itself, is used as the state, because the force magnitude varies with the physical properties of the drawer, such as its mass and friction coefficient. As a result, the robot can successfully open both an empty drawer and a drawer that is not empty.
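A sketch of the idea, assuming a simple wrench-sensor reading: the state is the unit force direction, so drawers whose mass or friction merely scale the force magnitude still produce the same observation. The sensor values and function name are illustrative:

    import numpy as np

    def force_direction_state(force_xyz, eps=1e-6):
        """Return the unit force direction as the observation; the
        magnitude, which depends on the drawer's mass and friction,
        is deliberately discarded to narrow the sim-to-real gap."""
        f = np.asarray(force_xyz, dtype=float)
        norm = np.linalg.norm(f)
        if norm < eps:          # no measurable contact force
            return np.zeros(3)
        return f / norm

    # The same pull on a light and a heavy drawer gives different force
    # magnitudes but (ideally) the same direction, hence the same state.
    print(force_direction_state([2.0, 0.5, 0.0]))    # light, empty drawer
    print(force_direction_state([20.0, 5.0, 0.0]))   # heavy, loaded drawer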


[Videos: opening an empty drawer; opening a drawer that is not empty.]

Comparison between manually programmed motion and reinforcement learning (RL) agent

The RL agent works robustly even when the estimated opening direction is inaccurate. In the manually programmed motion, the robot pulls the drawer along the estimated opening direction without force feedback. With the RL agent, the robot modifies the pulling direction according to the agent's output. Even when the estimate is highly inaccurate, the RL agent adjusts the motion and successfully opens the drawer.
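A toy sketch of the contrast, with stand-in stubs for the sensor and the learned policy (all names and numbers here are assumptions for illustration): the baseline pulls along a fixed, possibly wrong estimate, while the RL-style loop corrects the pulling direction from the sensed force at every step:

    import numpy as np

    rng = np.random.default_rng(0)

    def read_force_sensor():
        # Stand-in for the wrist force sensor: a noisy reaction force
        # roughly opposing the drawer's true opening direction (+x).
        return np.array([-5.0, 0.0, 0.0]) + rng.normal(0.0, 0.2, 3)

    def policy(force_dir, current_dir):
        # Stand-in for the learned policy: nudge the pulling direction
        # toward the direction opposite the sensed reaction force.
        target = -force_dir
        return current_dir + 0.3 * (target - current_dir)

    def open_drawer(estimated_direction, use_rl=False, steps=5):
        """Manually programmed motion keeps the (possibly inaccurate)
        estimated direction; the RL loop corrects it from force feedback."""
        d = np.asarray(estimated_direction, dtype=float)
        d /= np.linalg.norm(d)
        for _ in range(steps):
            if use_rl:
                f = read_force_sensor()
                d = policy(f / np.linalg.norm(f), d)
                d /= np.linalg.norm(d)
            print("pull step along", np.round(d, 2))  # stands in for a Cartesian move

    open_drawer([0.8, 0.6, 0.0], use_rl=False)  # keeps the inaccurate estimate
    open_drawer([0.8, 0.6, 0.0], use_rl=True)   # converges toward +x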



[Videos: manually programmed motion; RL agent.]