dc.contributor.author: Jordan, Philip
dc.contributor.supervisor: Barakat, Anas
dc.contributor.supervisor: He, Niao
dc.date.accessioned: 2023-12-12T15:02:45Z
dc.date.available: 2023-12-12T07:37:32Z
dc.date.available: 2023-12-12T15:02:45Z
dc.date.issued: 2023
dc.identifier.uri: http://hdl.handle.net/20.500.11850/646915
dc.identifier.doi: 10.3929/ethz-b-000646915
dc.description.abstract: Markov games offer a formal mathematical framework for modeling multi-agent reinforcement learning problems. Markov Potential Games (MPGs) are a subclass of mixed cooperative-competitive Markov games for which finding Nash equilibria (NE) is tractable. The recently introduced class of constrained Markov Potential Games (CMPGs) generalizes MPGs to settings where the agents’ reward maximization is subject to global constraints. Existing methods for learning NE in MPGs can be categorized into centralized and independent learning algorithms. Notably, for learning ε-approximate NE, the best-known sample complexity achieved by centralized algorithms is significantly lower than that of independent learning (O(ε⁻³) vs. O(ε⁻⁵)). Moreover, no convergent independent learning algorithm is known for CMPGs. Whether these gaps are inherent, however, remains open: no provable separation between centralized and independent learning has been shown. Continuing this line of work, our contributions are twofold. (a) We propose a new playerwise policy gradient (PG) algorithm that, while requiring coordination among players, improves the iteration and sample complexity in their dependence on the number of players m; in particular, it improves on the m-dependence of previously known centralized algorithms. (b) In the constrained case, we make progress toward closing the gap between centralized and independent learning by providing an independent policy gradient algorithm for learning approximate constrained Nash equilibria in CMPGs. Inspired by contemporary optimization literature, our algorithm performs proximal-point-like updates over a regularized constraint set, where each proximal step is solved inexactly using a stochastic switching gradient algorithm. Under technical constraint qualification conditions, we establish convergence guarantees towards constrained approximate NE. We perform simulations to illustrate our results in two real-world applications of NE-learning in CMPGs. (en_US)
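
For intuition only, the following is a minimal single-player Python sketch of the kind of update the abstract describes: a proximal-point-like outer loop whose subproblems are solved inexactly by a switching gradient rule that follows the objective gradient while a tightened constraint estimate is satisfied and otherwise descends the constraint function. All function names (project_simplex, switching_step, independent_pg_cmpg), parameters, and the toy instance are hypothetical illustrations under stated assumptions, not the thesis's actual algorithm; the thesis operates on multi-state, multi-player policies with stochastic gradient estimates.

    import numpy as np

    def project_simplex(v):
        # Euclidean projection of a vector onto the probability simplex
        # (Duchi et al., 2008); keeps each iterate a valid policy.
        u = np.sort(v)[::-1]
        css = np.cumsum(u) - 1.0
        idx = np.arange(1, len(v) + 1)
        rho = np.nonzero(u * idx > css)[0][-1]
        return np.maximum(v - css[rho] / (rho + 1.0), 0.0)

    def switching_step(policy, grad_obj, grad_cost, cost_val, tighten, eta):
        # Switching gradient rule: ascend the (proximally regularized)
        # objective while the tightened constraint estimate holds,
        # otherwise descend the constraint function.
        if cost_val <= -tighten:
            direction = grad_obj(policy)
        else:
            direction = -grad_cost(policy)
        return project_simplex(policy + eta * direction)

    def independent_pg_cmpg(policy, grad_reward, grad_cost, cost_val,
                            outer_iters=100, inner_iters=20,
                            eta=0.05, prox_weight=1.0, tighten=0.05):
        # Proximal-point-like outer loop: each round inexactly solves a
        # subproblem regularized toward the previous iterate (anchor).
        for _ in range(outer_iters):
            anchor = policy.copy()
            for _ in range(inner_iters):
                prox_grad = lambda p: grad_reward(p) - prox_weight * (p - anchor)
                policy = switching_step(policy, prox_grad, grad_cost,
                                        cost_val(policy), tighten, eta)
        return policy

    # Toy single-state instance (hypothetical): maximize p @ r s.t. p @ c <= b.
    r = np.array([1.0, 0.5, 0.2])    # per-action rewards
    c = np.array([1.0, 0.1, 0.0])    # per-action costs
    b = 0.4                          # constraint budget
    p = independent_pg_cmpg(np.ones(3) / 3,
                            grad_reward=lambda p: r,
                            grad_cost=lambda p: c,
                            cost_val=lambda p: p @ c - b)
    print(p, p @ r, p @ c)

The point the sketch tries to convey is the switch itself: progress on the player's own objective is made only while an estimate of the constraint, tightened by a small margin, remains satisfied, while the proximal anchor stabilizes the updates; in the independent setting each player would run such a loop on its own policy.
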
dc.format: application/pdf (en_US)
dc.language.iso: en (en_US)
dc.publisher: ETH Zurich (en_US)
dc.rights.uri: http://rightsstatements.org/page/InC-NC/1.0/
dc.title: Independent Learning in Markov Potential Games: New Insights and Constraints (en_US)
dc.type: Master Thesis
dc.rights.license: In Copyright - Non-Commercial Use Permitted
ethz.size: 59 p. (en_US)
ethz.publication.place: Zurich (en_US)
ethz.publication.status: published (en_US)
ethz.leitzahl: ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02661 - Institut für Maschinelles Lernen / Institute for Machine Learning::09729 - He, Niao / He, Niao (en_US)
ethz.leitzahl.certified: ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02661 - Institut für Maschinelles Lernen / Institute for Machine Learning::09729 - He, Niao / He, Niao (en_US)
ethz.date.deposited: 2023-12-12T07:37:33Z
ethz.source: FORM
ethz.eth: yes (en_US)
ethz.availability: Open access (en_US)
ethz.rosetta.installDate: 2023-12-12T15:02:47Z
ethz.rosetta.lastUpdated: 2024-02-03T07:59:50Z
ethz.rosetta.versionExported: true