Show simple item record

dc.contributor.author: Gebhardt, Christoph
dc.contributor.author: Oulasvirta, Antti
dc.contributor.author: Hilliges, Otmar
dc.date.accessioned: 2021-08-12T11:34:23Z
dc.date.available: 2021-01-29T13:12:31Z
dc.date.available: 2021-01-29T14:04:01Z
dc.date.available: 2021-06-30T07:24:15Z
dc.date.available: 2021-08-11T15:14:45Z
dc.date.available: 2021-08-12T11:34:23Z
dc.date.issued: 2021-09
dc.identifier.issn: 2522-087X
dc.identifier.issn: 2522-0861
dc.identifier.other: 10.1007/s42113-020-00093-9 (en_US)
dc.identifier.uri: http://hdl.handle.net/20.500.11850/466707
dc.identifier.doi: 10.3929/ethz-b-000466707
dc.description.abstract (en_US): How do people decide how long to continue in a task, when to switch, and to which other task? It is known that task interleaving adapts situationally, showing sensitivity to changes in expected rewards, costs, and task boundaries. However, the mechanisms that underpin the decision to stay in a task versus switch away are not thoroughly understood. Previous work has explained task interleaving by greedy heuristics and a policy that maximizes the marginal rate of return. However, it is unclear how such a strategy would allow for adaptation to environments that offer multiple tasks with complex switch costs and delayed rewards. Here, we develop a hierarchical model of supervisory control driven by reinforcement learning (RL). The core assumption is that the supervisory level learns to switch using task-specific approximate utility estimates, which are computed on the lower level. We show that a hierarchically optimal value function decomposition can be learned from experience, even in conditions with multiple tasks and arbitrary and uncertain reward and cost structures. The model also reproduces well-known key phenomena of task interleaving, such as the sensitivity to costs of resumption and immediate as well as delayed in-task rewards. In a demanding task interleaving study with 211 human participants and realistic tasks (reading, mathematics, question-answering, recognition), the model yielded better predictions of individual-level data than a flat (non-hierarchical) RL model and an omniscient-myopic baseline. Corroborating emerging evidence from cognitive neuroscience, our results suggest hierarchical RL as a plausible model of supervisory control in task interleaving.
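
The abstract describes a two-level control architecture: a lower, in-task level that learns approximate utilities of continuing each task, and a supervisory level that uses those utilities to decide whether to stay or switch. The Python sketch below is only a minimal illustration of that idea under simplifying assumptions, not the authors' model: the toy task dynamics, the tabular value updates, and all numeric settings (STEP_REWARD, FINAL_REWARD, RESUME_COST, GAMMA, ALPHA, EPS) are hypothetical placeholders, whereas the published model learns a hierarchically optimal value-function decomposition.

import random

# Toy constants; all values are hypothetical placeholders.
N_STEPS = 5           # each toy task is a short chain of subtask states
STEP_REWARD = 1.0     # immediate in-task reward per completed subtask
FINAL_REWARD = 10.0   # delayed reward for finishing a task
RESUME_COST = 2.0     # cost of resuming a task after having switched away
GAMMA, ALPHA, EPS = 0.95, 0.2, 0.1   # discount, learning rate, exploration


class Task:
    """Lower level: a tabular estimate of the return from continuing the task."""

    def __init__(self, name):
        self.name = name
        self.state = 0
        self.value = [0.0] * (N_STEPS + 1)   # value[s] ~ return of continuing from state s

    def done(self):
        return self.state >= N_STEPS

    def reset(self):
        self.state = 0                       # restart progress, keep learned values

    def step(self):
        """Advance one subtask step and TD-update the in-task value estimate."""
        s = self.state
        self.state += 1
        reward = FINAL_REWARD if self.done() else STEP_REWARD
        bootstrap = 0.0 if self.done() else self.value[self.state]
        self.value[s] += ALPHA * (reward + GAMMA * bootstrap - self.value[s])
        return reward


def supervisor_choice(tasks, active):
    """Upper level: engage the task whose lower-level utility estimate,
    minus the resumption cost if it is not the active task, is highest."""
    open_tasks = [t for t in tasks if not t.done()]
    if random.random() < EPS:                # occasional exploration
        return random.choice(open_tasks)

    def score(t):
        penalty = 0.0 if t is active else RESUME_COST
        return t.value[t.state] - penalty

    return max(open_tasks, key=score)


def run_training(episodes=200):
    tasks = [Task("reading"), Task("math")]  # task names borrowed from the study's domains
    returns = []
    for _ in range(episodes):
        for t in tasks:
            t.reset()
        active, total = None, 0.0
        while any(not t.done() for t in tasks):
            chosen = supervisor_choice(tasks, active)
            if active is not None and chosen is not active:
                total -= RESUME_COST         # resumption cost charged on each switch
            active = chosen
            total += active.step()
        returns.append(total)
    return returns


if __name__ == "__main__":
    random.seed(0)
    returns = run_training()
    print("mean return over 200 episodes:", sum(returns) / len(returns))

With a nonzero resumption cost, this sketch's supervisor tends to finish the engaged task before switching, which is the kind of cost sensitivity the abstract refers to; it is intended only to make the stay-or-switch decision structure concrete.
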
dc.format: application/pdf (en_US)
dc.language.iso: en (en_US)
dc.publisher: Springer (en_US)
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject: Computational modeling (en_US)
dc.subject: Task interleaving (en_US)
dc.subject: Hierarchical reinforcement learning (en_US)
dc.subject: Bayesian inference (en_US)
dc.subject: Hierarchical reinforcement learning model for task interleaving (en_US)
dc.title: Hierarchical Reinforcement Learning Explains Task Interleaving Behavior (en_US)
dc.type: Journal Article
dc.rights.license: Creative Commons Attribution 4.0 International
dc.date.published: 2020-11-05
ethz.journal.title: Computational Brain & Behavior
ethz.journal.volume: 4 (en_US)
ethz.journal.issue: 3 (en_US)
ethz.journal.abbreviated: Comput Brain Behav
ethz.pages.start: 284 (en_US)
ethz.pages.end: 304 (en_US)
ethz.version.deposit: publishedVersion (en_US)
ethz.grant: UFO: Semi-Autonomous Aerial Vehicles for Augmented Reality, Human-Computer Interaction and Remote Collaboration (en_US)
ethz.identifier.scopus:
ethz.publication.place: New York, NY (en_US)
ethz.publication.status: published (en_US)
ethz.leitzahl: ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02658 - Inst. Intelligente interaktive Systeme / Inst. Intelligent Interactive Systems::03979 - Hilliges, Otmar / Hilliges, Otmar (en_US)
ethz.leitzahl.certified: ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02658 - Inst. Intelligente interaktive Systeme / Inst. Intelligent Interactive Systems::03979 - Hilliges, Otmar / Hilliges, Otmar (en_US)
ethz.grant.agreementno: 153644
ethz.grant.fundername: SNF
ethz.grant.funderDoi: 10.13039/501100001711
ethz.grant.program: Projekte MINT
ethz.date.deposited: 2021-01-29T13:12:38Z
ethz.source: FORM
ethz.eth: yes (en_US)
ethz.availability: Open access (en_US)
ethz.rosetta.installDate: 2021-08-11T15:14:53Z
ethz.rosetta.lastUpdated: 2022-03-29T11:02:19Z
ethz.rosetta.versionExported: true