dc.contributor.author: Adolphs, Leonard
dc.contributor.supervisor: Hofmann, Thomas
dc.contributor.supervisor: Sachan, Mrinmaya
dc.contributor.supervisor: Ciaramita, Massimiliano
dc.contributor.supervisor: Weston, Jason
dc.date.accessioned: 2023-05-25T12:45:10Z
dc.date.available: 2023-05-25T08:53:55Z
dc.date.available: 2023-05-25T12:45:10Z
dc.date.issued: 2023
dc.identifier.uri: http://hdl.handle.net/20.500.11850/613568
dc.identifier.doi: 10.3929/ethz-b-000613568
dc.description.abstract (en_US):

Deep neural network architectures have led to remarkable achievements in natural language processing (NLP) in recent years. Scaling up model size and self-supervised pre-training on the vast amount of textual data available on the internet have unlocked generalization and complex reasoning capabilities, even when models are provided with only a small number of task-specific examples. However, most progress in NLP has been made under a static learning paradigm: models are trained once on a fixed dataset to learn a specific skill and remain fixed thereafter. In this thesis, we turn our attention to interactive agents for NLP, i.e., language-based models that engage with a dynamic environment or user. Across three application areas, (i) text-based games, (ii) query reformulation, and (iii) conversation, we investigate and develop agents that interact with different forms of adaptive environments. The thesis is structured into three parts, reflecting the three application areas.

In the first part, we develop a deep reinforcement learning (RL) agent for text-based games that generalizes across families of games that are similar in structure but contain new objects and instructions.

The second part focuses on query reformulation, which we approach from two angles. First, we consider the learning-to-search problem, where an agent is trained to interact with an information retrieval (IR) system using natural language. Observing the IR component's results, it adapts the initial user query and collects an improved set of evidence documents. Within this setting, we develop two agents that learn successful interactive search strategies: one trained by pure reinforcement learning, the other through (self-)supervised learning. In the subsequent chapter, we turn our attention to neural retrieval models and develop agents for interactive query suggestions. To this end, we train a query decoder model that, given a point in the shared paragraph-query embedding space, generates the corresponding query in textual form. We employ this decoder to generate a synthetic dataset of directional query refinements, which we use to train a powerful reformulation model.

In the last part of the thesis, we propose different approaches to developing conversational agents. We suggest modularizing the architecture of dialogue models to output intermediate text sequences on which subsequent modules are conditioned. First, we show that generating the knowledge output as an intermediate step before the dialogue response can increase knowledge utilization and factual correctness in open-domain dialogue. Next, we develop a single model that sequentially generates (i) a search engine query, (ii) a knowledge output, and (iii) a final response. We show that it outperforms previous state-of-the-art dialogue models on knowledge-grounded conversation and, applied to topical prompt completions, improves upon models with a vastly larger number of parameters. Finally, we explore improving dialogue models after deployment and propose an objective that allows iteratively training a language model on binary-labeled examples of its own generations.
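The modular generation scheme described above, in which a single model sequentially produces a search query, a knowledge output, and a final response, each stage conditioned on the previous stage's text, can be sketched as follows. This is an illustrative sketch only: `generate` is a hypothetical stand-in for a sequence-to-sequence model call, and the stage-prefix strings are invented for illustration, not the thesis's actual control tokens.

```python
def generate(model, prompt: str) -> str:
    """Hypothetical stand-in for one call to a seq2seq dialogue model."""
    return model(prompt)


def modular_response(model, dialogue_history: str) -> dict:
    """Run the three-stage pipeline: query -> knowledge -> response.

    Each stage is a separate decoding pass of the same model, and its
    text output is appended to the conditioning context of the next stage.
    """
    # Stage (i): produce a search-engine query from the dialogue history.
    query = generate(model, f"__search_query__ {dialogue_history}")
    # Stage (ii): produce a knowledge output conditioned on history + query.
    knowledge = generate(model, f"__knowledge__ {dialogue_history} {query}")
    # Stage (iii): produce the final response conditioned on the knowledge.
    response = generate(model, f"__response__ {dialogue_history} {knowledge}")
    return {"query": query, "knowledge": knowledge, "response": response}
```

The key design point is that the intermediate outputs are plain text, so each module can be inspected, corrected, or swapped independently of the others.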
dc.format: application/pdf (en_US)
dc.language.iso: en (en_US)
dc.publisher: ETH Zurich (en_US)
dc.rights.uri: http://rightsstatements.org/page/InC-NC/1.0/
dc.subject: Natural Language Processing (en_US)
dc.title: Interactive Language-Based Agents (en_US)
dc.type: Doctoral Thesis
dc.rights.license: In Copyright - Non-Commercial Use Permitted
dc.date.published: 2023-05-25
ethz.size: 216 p. (en_US)
ethz.code.ddc: DDC - DDC::0 - Computer science, information & general works::004 - Data processing, computer science (en_US)
ethz.identifier.diss: 29105 (en_US)
ethz.publication.place: Zurich (en_US)
ethz.publication.status: published (en_US)
ethz.leitzahl: ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02661 - Institut für Maschinelles Lernen / Institute for Machine Learning::09462 - Hofmann, Thomas / Hofmann, Thomas (en_US)
ethz.date.deposited: 2023-05-25T08:53:56Z
ethz.source: FORM
ethz.eth: yes (en_US)
ethz.availability: Open access (en_US)
ethz.rosetta.installDate: 2024-02-02T23:44:58Z
ethz.rosetta.lastUpdated: 2024-02-02T23:44:58Z
ethz.rosetta.versionExported: true
ethz.COinS: ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Interactive%20Language-Based%20Agents&rft.date=2023&rft.au=Adolphs,%20Leonard&rft.genre=unknown&rft.btitle=Interactive%20Language-Based%20Agents