Multiagent cooperation and competition with deep reinforcement learning
dc.contributor.author
Tampuu, Ardi
dc.contributor.author
Matiisen, Tambet
dc.contributor.author
Kodelja, Dorian
dc.contributor.author
Kuzovkin, Ilya
dc.contributor.author
Korjus, Kristjan
dc.contributor.author
Aru, Juhan
dc.contributor.author
Aru, Jaan
dc.contributor.author
Vicente, Raul
dc.date.accessioned
2018-08-02T12:15:10Z
dc.date.available
2017-06-12T20:46:42Z
dc.date.available
2018-08-02T12:15:10Z
dc.date.issued
2017-04-05
dc.identifier.issn
1932-6203
dc.identifier.other
10.1371/journal.pone.0172395
en_US
dc.identifier.uri
http://hdl.handle.net/20.500.11850/130290
dc.identifier.doi
10.3929/ethz-b-000130290
dc.description.abstract
Evolution of cooperation and competition can appear when multiple adaptive agents share a biological, social, or technological niche. In the present work, we study how cooperation and competition emerge between autonomous agents that learn by reinforcement while using only their raw visual input as the state representation. In particular, we extend the Deep Q-Learning framework to multiagent environments to investigate the interaction between two learning agents in the well-known video game Pong. By manipulating the classical rewarding scheme of Pong, we show how competitive and collaborative behaviors emerge. We also describe the progression from competitive to collaborative behavior when the incentive to cooperate is increased. Finally, we show how learning by playing against another adaptive agent, instead of against a hard-wired algorithm, results in more robust strategies. The present work shows that Deep Q-Networks can become a useful tool for studying decentralized learning of multiagent systems coping with high-dimensional environments.
en_US
dc.format
application/pdf
en_US
dc.language.iso
en
en_US
dc.publisher
PLOS
dc.rights.uri
http://creativecommons.org/licenses/by/4.0/
dc.title
Multiagent cooperation and competition with deep reinforcement learning
en_US
dc.type
Journal Article
dc.rights.license
Creative Commons Attribution 4.0 International
ethz.journal.title
PLoS ONE
ethz.journal.volume
12
en_US
ethz.journal.issue
4
en_US
ethz.journal.abbreviated
PLoS ONE
ethz.pages.start
e0172395
en_US
ethz.size
15 p.
en_US
ethz.version.deposit
publishedVersion
en_US
ethz.identifier.wos
ethz.identifier.scopus
ethz.publication.place
San Francisco, CA
ethz.publication.status
published
en_US
ethz.leitzahl
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02000 - Dep. Mathematik / Dep. of Mathematics::02003 - Mathematik Selbständige Professuren::09453 - Werner, Wendelin (ehemalig) / Werner, Wendelin (former)
en_US
ethz.leitzahl.certified
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02000 - Dep. Mathematik / Dep. of Mathematics::02003 - Mathematik Selbständige Professuren::09453 - Werner, Wendelin (ehemalig) / Werner, Wendelin (former)
ethz.relation.isNewVersionOf
handle/20.500.11850/126905
ethz.date.deposited
2017-06-12T20:47:14Z
ethz.source
ECIT
ethz.identifier.importid
imp593655657a51d17026
ethz.ecitpid
pub:193296
ethz.eth
yes
en_US
ethz.availability
Open access
en_US
ethz.rosetta.installDate
2017-07-17T08:00:27Z
ethz.rosetta.lastUpdated
2024-02-02T05:26:32Z
ethz.rosetta.versionExported
true
ethz.COinS
ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Multiagent%20cooperation%20and%20competition%20with%20deep%20reinforcement%20learning&rft.jtitle=PLoS%20ONE&rft.date=2017-04-05&rft.volume=12&rft.issue=4&rft.spage=e0172395&rft.issn=1932-6203&rft.au=Tampuu,%20Ardi&Matiisen,%20Tambet&Kodelja,%20Dorian&Kuzovkin,%20Ilya&Korjus,%20Kristjan&rft.genre=article&rft_id=info:doi/10.1371/journal.pone.0172395&