dc.contributor.author: Jordan, Philip
dc.contributor.supervisor: Barakat, Anas
dc.contributor.supervisor: He, Niao
dc.date.accessioned: 2023-12-12T15:02:45Z
dc.date.available: 2023-12-12T07:37:32Z
dc.date.available: 2023-12-12T15:02:45Z
dc.date.issued: 2023
dc.identifier.uri: http://hdl.handle.net/20.500.11850/646915
dc.identifier.doi: 10.3929/ethz-b-000646915
dc.description.abstract: Markov games offer a formal mathematical framework for modeling multi-agent reinforcement learning problems. Markov Potential Games (MPGs) are a subclass of mixed cooperative-competitive Markov games for which finding Nash equilibria (NE) is tractable. The recently introduced class of constrained Markov Potential Games (CMPGs) generalizes MPGs to settings where the agents’ reward maximization is subject to global constraints. Existing methods for learning NE in MPGs can be categorized into centralized and independent learning algorithms. Notably, for learning ε-approximate NE, the best-known sample complexity achieved by centralized algorithms is significantly lower than that of independent learning (O(ε⁻³) vs. O(ε⁻⁵)). Moreover, no convergent independent learning algorithm is known for CMPGs. Whether these gaps are inherent, however, remains open: no provable separation between centralized and independent learning has been shown. Continuing this line of work, our contributions are twofold. (a) We propose a new playerwise policy gradient (PG) algorithm that, while requiring coordination among players, improves the iteration and sample complexity in their dependence on the number of players m; in particular, it improves on the m-dependence of previously known centralized algorithms. (b) In the constrained case, we make progress toward closing the gap between centralized and independent learning by providing an independent policy gradient algorithm for learning approximate constrained Nash equilibria in CMPGs. Inspired by contemporary optimization literature, our algorithm performs proximal-point-like updates over a regularized constraint set, where each proximal step is solved inexactly using a stochastic switching gradient algorithm. Under technical constraint qualification conditions, we establish convergence guarantees towards constrained approximate NE. We perform simulations to illustrate our results in two real-world applications of NE-learning in CMPGs. (en_US)
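
For intuition only, the following is a minimal single-player Python sketch of the kind of update the abstract describes: a proximal-point-like outer loop whose subproblems are solved inexactly by a switching gradient rule that follows the objective gradient while a tightened constraint estimate is satisfied and otherwise descends the constraint function. All function names (project_simplex, switching_step, independent_pg_cmpg), parameters, and the toy instance are hypothetical illustrations under stated assumptions, not the thesis's actual algorithm; the thesis operates on multi-state, multi-player policies with stochastic gradient estimates.

    import numpy as np

    def project_simplex(v):
        # Euclidean projection of a vector onto the probability simplex
        # (Duchi et al., 2008); keeps each iterate a valid policy.
        u = np.sort(v)[::-1]
        css = np.cumsum(u) - 1.0
        idx = np.arange(1, len(v) + 1)
        rho = np.nonzero(u * idx > css)[0][-1]
        return np.maximum(v - css[rho] / (rho + 1.0), 0.0)

    def switching_step(policy, grad_obj, grad_cost, cost_val, tighten, eta):
        # Switching gradient rule: ascend the (proximally regularized)
        # objective while the tightened constraint estimate holds,
        # otherwise descend the constraint function.
        if cost_val <= -tighten:
            direction = grad_obj(policy)
        else:
            direction = -grad_cost(policy)
        return project_simplex(policy + eta * direction)

    def independent_pg_cmpg(policy, grad_reward, grad_cost, cost_val,
                            outer_iters=100, inner_iters=20,
                            eta=0.05, prox_weight=1.0, tighten=0.05):
        # Proximal-point-like outer loop: each round inexactly solves a
        # subproblem regularized toward the previous iterate (anchor).
        for _ in range(outer_iters):
            anchor = policy.copy()
            for _ in range(inner_iters):
                prox_grad = lambda p: grad_reward(p) - prox_weight * (p - anchor)
                policy = switching_step(policy, prox_grad, grad_cost,
                                        cost_val(policy), tighten, eta)
        return policy

    # Toy single-state instance (hypothetical): maximize p @ r s.t. p @ c <= b.
    r = np.array([1.0, 0.5, 0.2])    # per-action rewards
    c = np.array([1.0, 0.1, 0.0])    # per-action costs
    b = 0.4                          # constraint budget
    p = independent_pg_cmpg(np.ones(3) / 3,
                            grad_reward=lambda p: r,
                            grad_cost=lambda p: c,
                            cost_val=lambda p: p @ c - b)
    print(p, p @ r, p @ c)

The point the sketch tries to convey is the switch itself: progress on the player's own objective is made only while an estimate of the constraint, tightened by a small margin, remains satisfied, while the proximal anchor stabilizes the updates; in the independent setting each player would run such a loop on its own policy.
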
dc.format: application/pdf (en_US)
dc.language.iso: en (en_US)
dc.publisher: ETH Zurich (en_US)
dc.rights.uri: http://rightsstatements.org/page/InC-NC/1.0/
dc.title: Independent Learning in Markov Potential Games: New Insights and Constraints (en_US)
dc.type: Master Thesis
dc.rights.license: In Copyright - Non-Commercial Use Permitted
ethz.size: 59 p. (en_US)
ethz.publication.place: Zurich (en_US)
ethz.publication.status: published (en_US)
ethz.leitzahl: ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02661 - Institut für Maschinelles Lernen / Institute for Machine Learning::09729 - He, Niao / He, Niao (en_US)
ethz.leitzahl.certified: ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02661 - Institut für Maschinelles Lernen / Institute for Machine Learning::09729 - He, Niao / He, Niao (en_US)
ethz.date.deposited: 2023-12-12T07:37:33Z
ethz.source: FORM
ethz.eth: yes (en_US)
ethz.availability: Open access (en_US)
ethz.rosetta.installDate: 2023-12-12T15:02:47Z
ethz.rosetta.lastUpdated: 2024-02-03T07:59:50Z
ethz.rosetta.versionExported: true