In each demonstration, the sequence of tasks and their skill parameters is extracted from the human demonstration. Every robot shares this task sequence and reproduces it identically, regardless of the embodiment. Every video of robot execution is played at three times speed.
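To make the shared task-sequence idea concrete, here is a minimal sketch of how such a sequence might be represented and replayed on any robot. The skill names, parameters, and the `robot.run_skill` interface are illustrative assumptions, not the actual API used in this work.

```python
# Hypothetical representation of a demonstrated task sequence: an ordered
# list of (skill name, skill parameters) pairs extracted from the human
# demonstration. All names and values below are illustrative.
task_sequence = [
    ("approach_handle", {"target_pose": [0.42, 0.10, 0.35]}),
    ("grasp_handle",    {"grip_width": 0.03}),
    ("open_drawer",     {"est_open_dir": [1.0, 0.0, 0.0], "distance": 0.20}),
]

def execute(robot, task_sequence):
    """Replay the same demonstrated sequence on any robot;
    only the low-level controller differs per embodiment."""
    for skill_name, params in task_sequence:
        robot.run_skill(skill_name, **params)
```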
The skills use a state representation that suffers less from the sim-to-real gap. For example, the force direction, rather than the force itself, is used as the state, because the force magnitude changes easily with the physical properties of the drawer, such as its mass and friction coefficient. As a result, the robot can successfully open both an empty drawer and a loaded one.
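A minimal sketch of this state design, assuming the contact force is read as a 3D vector from a wrist force/torque sensor: only the unit direction enters the state, so the magnitude, which depends on the drawer's mass and friction, is discarded.

```python
import numpy as np

def force_direction_state(force: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Map a measured 3D contact force to its unit direction.

    The force magnitude depends on physical properties of the drawer
    (mass, friction coefficient) that differ between simulation and
    reality, while the direction transfers much more reliably.
    """
    norm = np.linalg.norm(force)
    if norm < eps:
        # Negligible contact force: no informative direction to report.
        return np.zeros_like(force)
    return force / norm
```

For instance, `force_direction_state(np.array([3.0, 0.0, 4.0]))` yields `[0.6, 0.0, 0.8]`, and the same direction is returned whether the drawer is empty or heavily loaded.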
The RL method works robustly even when the estimated opening direction is inaccurate. With the manually programmed motion, the robot pulls the drawer along the estimated opening direction without force feedback. With the RL agent, the robot modifies the pulling direction following the policy's output. Even when the estimate is highly inaccurate, the RL agent adjusts the motion and successfully opens the drawer.
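The contrast between the two control schemes can be sketched as below. The `robot` and `policy` interfaces (`move_ee`, `read_wrist_force`, `act`) are hypothetical, and the per-step correction rule is a simplified stand-in for the actual RL policy.

```python
import numpy as np

def open_drawer_programmed(robot, est_dir, step=0.01, n_steps=50):
    """Manually programmed baseline: pull strictly along the estimated
    opening direction with no force feedback, so an inaccurate estimate
    is never corrected."""
    est_dir = est_dir / np.linalg.norm(est_dir)
    for _ in range(n_steps):
        robot.move_ee(step * est_dir)

def open_drawer_rl(robot, policy, est_dir, step=0.01, n_steps=50):
    """RL control: at each step the policy observes the current pulling
    direction and the measured force direction, and outputs a correction
    to the pulling direction, so the motion adapts online."""
    direction = est_dir / np.linalg.norm(est_dir)
    for _ in range(n_steps):
        force = robot.read_wrist_force()  # 3D force from the F/T sensor
        f_dir = force / max(np.linalg.norm(force), 1e-6)
        correction = policy.act(np.concatenate([direction, f_dir]))
        direction = direction + correction
        direction /= np.linalg.norm(direction)  # keep a unit vector
        robot.move_ee(step * direction)
```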