Publication:
Reward learning from very few demonstrations

Abstract

This article introduces a novel skill learning framework that learns rewards from very few demonstrations and uses them in policy search (PS) to improve the skill. The demonstrations are used to learn a parameterized policy for executing the skill and a goal model, in the form of a hidden Markov model (HMM), for monitoring executions. Rewards are learned from the HMM structure and its monitoring capability: the HMM is converted into a finite-horizon Markov reward process (MRP), whose state values are estimated with a Monte Carlo approach. The HMM and these values are then merged into a partially observable MRP to obtain execution returns, which are used with PS to improve the policy. In addition to reward learning, a black-box PS method with an adaptive exploration strategy is adopted. The resulting framework is evaluated with five PS approaches and two skills in simulation. The results show that the learned dense rewards lead to better performance than sparse monitoring signals, and that adaptive exploration leads to faster convergence with higher success rates and lower variance. The efficacy of the framework is validated in a real-robot setting by improving three skills from complete failure to complete success using the learned rewards, where sparse rewards failed entirely.
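
As a rough illustration of the value-estimation step described in the abstract (not the authors' code), the sketch below runs Monte Carlo rollouts on a small finite-horizon Markov reward process. It assumes the learned HMM supplies a row-stochastic transition matrix P and a per-state reward vector r; the function name mc_state_values, the toy three-state chain, and the chosen horizon are hypothetical.

import numpy as np

def mc_state_values(P, r, horizon, n_rollouts=500, gamma=1.0, rng=None):
    """Estimate the value of each state of a finite-horizon MRP by Monte Carlo rollouts.

    P       : (S, S) row-stochastic transition matrix (assumed to come from the HMM)
    r       : (S,) reward assigned to each hidden state
    horizon : rollout length T
    """
    rng = np.random.default_rng() if rng is None else rng
    n_states = P.shape[0]
    values = np.zeros(n_states)
    for s0 in range(n_states):
        returns = []
        for _ in range(n_rollouts):
            s, ret, disc = s0, 0.0, 1.0
            for _ in range(horizon):
                ret += disc * r[s]          # accumulate (discounted) reward of the visited state
                disc *= gamma
                s = rng.choice(n_states, p=P[s])  # sample the next hidden state
            returns.append(ret)
        values[s0] = float(np.mean(returns))
    return values

# Toy usage: a three-state chain where the last state is the goal state.
P = np.array([[0.8, 0.2, 0.0],
              [0.0, 0.7, 0.3],
              [0.0, 0.0, 1.0]])
r = np.array([0.0, 0.0, 1.0])
print(mc_state_values(P, r, horizon=20))

In the article's pipeline these state values are then combined with the HMM's belief over hidden states to score observed executions; the sketch only covers the value-estimation step under the stated assumptions.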

Publisher

IEEE - Institute of Electrical and Electronics Engineers Inc.

Subject

Robotics

Source

IEEE Transactions on Robotics

DOI

10.1109/TRO.2020.3038698
