Show simple item record

dc.contributor.author: Turchetta, Matteo
dc.contributor.author: Krause, Andreas
dc.contributor.author: Trimpe, Sebastian
dc.date.accessioned: 2020-10-29T10:26:05Z
dc.date.available: 2020-10-24T06:45:51Z
dc.date.available: 2020-10-29T10:26:05Z
dc.date.issued: 2020
dc.identifier.isbn: 978-1-7281-7395-5 [en_US]
dc.identifier.isbn: 978-1-7281-7394-8 [en_US]
dc.identifier.isbn: 978-1-7281-7396-2 [en_US]
dc.identifier.other: 10.1109/ICRA40945.2020.9197000 [en_US]
dc.identifier.uri: http://hdl.handle.net/20.500.11850/447641
dc.description.abstract: In reinforcement learning (RL), an autonomous agent learns to perform complex tasks by maximizing an exogenous reward signal while interacting with its environment. In real-world applications, test conditions may differ substantially from the training scenario, so focusing on pure reward maximization during training can lead to poor results at test time. In these cases, it is important to trade off performance against robustness while learning a policy. While several results exist for robust, model-based RL, the model-free case has not been widely investigated. In this paper, we cast the robust, model-free RL problem as a multi-objective optimization problem. To quantify the robustness of a policy, we use the delay margin and gain margin, two robustness indicators that are common in control theory, and we show how these metrics can be estimated from data in the model-free setting. We use multi-objective Bayesian optimization (MOBO) to efficiently solve this expensive-to-evaluate multi-objective optimization problem. We show the benefits of our robust formulation in both sim-to-real and pure hardware experiments on balancing a Furuta pendulum. © 2020 IEEE. [en_US]
dc.language.iso: en [en_US]
dc.publisher: IEEE [en_US]
dc.title: Robust Model-free Reinforcement Learning with Multi-objective Bayesian Optimization [en_US]
dc.type: Conference Paper
dc.date.published: 2020-09-15
ethz.book.title: 2020 IEEE International Conference on Robotics and Automation (ICRA) [en_US]
ethz.pages.start: 10702 [en_US]
ethz.pages.end: 10708 [en_US]
ethz.event: IEEE International Conference on Robotics and Automation (ICRA 2020) [en_US]
ethz.event.location: Online [en_US]
ethz.event.date: May 31 - August 31, 2020 [en_US]
ethz.notes: Due to the coronavirus (COVID-19) pandemic, the conference was conducted virtually. [en_US]
ethz.identifier.scopus:
ethz.publication.place: Piscataway, NJ [en_US]
ethz.publication.status: published [en_US]
ethz.leitzahl: ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02661 - Institut für Maschinelles Lernen / Institute for Machine Learning::03908 - Krause, Andreas / Krause, Andreas
ethz.leitzahl.certified: ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02661 - Institut für Maschinelles Lernen / Institute for Machine Learning::03908 - Krause, Andreas / Krause, Andreas
ethz.date.deposited: 2020-10-24T06:46:22Z
ethz.source: SCOPUS
ethz.eth: yes [en_US]
ethz.availability: Metadata only [en_US]
ethz.rosetta.installDate: 2020-10-29T10:26:17Z
ethz.rosetta.lastUpdated: 2021-02-15T19:27:04Z
ethz.rosetta.versionExported: true
ethz.COinS: ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Robust%20Model-free%20Reinforcement%20Learning%20with%20Multi-objective%20Bayesian%20Optimization&rft.date=2020&rft.spage=10702&rft.epage=10708&rft.au=Turchetta,%20Matteo&Krause,%20Andreas&Trimpe,%20Sebastian&rft.isbn=978-1-7281-7395-5&978-1-7281-7394-8&978-1-7281-7396-2&rft.genre=proceeding&rft_id=info:doi/10.1109/ICRA40945.2020.9197000&rft.btitle=2020%20IEEE%20International%20Conference%20on%20Robotics%20and%20Automation%20(ICRA)

Files in this item


There are no files associated with this item.
