dc.contributor.author: Adolphs, Leonard
dc.contributor.supervisor: Hofmann, Thomas
dc.contributor.supervisor: Sachan, Mrinmaya
dc.contributor.supervisor: Ciaramita, Massimiliano
dc.contributor.supervisor: Weston, Jason
dc.date.accessioned: 2023-05-25T12:45:10Z
dc.date.available: 2023-05-25T08:53:55Z
dc.date.available: 2023-05-25T12:45:10Z
dc.date.issued: 2023
dc.identifier.uri: http://hdl.handle.net/20.500.11850/613568
dc.identifier.doi: 10.3929/ethz-b-000613568
dc.description.abstract (en_US):

Deep neural network architectures have led to remarkable achievements in natural language processing (NLP) in recent years. Scaling up model size and self-supervised pre-training on the vast amount of textual data available on the internet have unlocked generalization and complex reasoning capabilities, even when models are provided with only a small number of task-specific examples. However, most progress in NLP has been made under a static learning paradigm: models are trained once on a fixed dataset to learn a specific skill and remain fixed thereafter. In this thesis, we turn our attention to interactive agents for NLP, i.e., language-based models that engage with a dynamic environment or user. Across three application areas, (i) text-based games, (ii) query reformulation, and (iii) conversation, we investigate and develop agents that interact with different forms of adaptive environments. The thesis is structured into three parts, reflecting the three application areas.

In the first part, we develop a deep reinforcement learning (RL) agent for text-based games that generalizes across families of games that are similar in structure but contain new objects and instructions.

The second part focuses on query reformulation, which we approach from two angles. First, we consider the learning-to-search problem, where an agent is trained to interact with an information retrieval (IR) system using natural language. Observing the IR component's results, it adapts the initial user query and collects an improved set of evidence documents. Within this setting, we develop two agents that learn successful interactive search strategies: one trained by pure reinforcement learning, the other through (self-)supervised learning. In the subsequent chapter, we turn our attention to neural retrieval models and develop agents for interactive query suggestions. To this end, we train a query decoder model that, given a point in the shared paragraph-query embedding space, generates the corresponding query in textual form. We employ this decoder to generate a synthetic dataset of directional query refinements, which we use to train a powerful reformulation model.

In the last part of the thesis, we propose different approaches to developing conversational agents. We suggest modularizing the architecture of dialogue models to output intermediate text sequences on which subsequent modules are conditioned. First, we show that generating the knowledge output as an intermediate step before the dialogue response can increase knowledge utilization and factual correctness in open-domain dialogue. Next, we develop a single model that sequentially generates (i) a search engine query, (ii) a knowledge output, and (iii) a final response. We show that it outperforms previous state-of-the-art dialogue models on knowledge-grounded conversation and, applied to topical prompt completions, improves upon models with a vastly larger number of parameters. Finally, we explore improving dialogue models after deployment and propose an objective that allows iteratively training a language model on binary-labeled examples of its own generations.
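The modular generation scheme described above, in which a single model sequentially produces a search query, a knowledge output, and a final response, each stage conditioned on the previous stage's text, can be sketched as follows. This is an illustrative sketch only: `generate` is a hypothetical stand-in for a sequence-to-sequence model call, and the stage-prefix strings are invented for illustration, not the thesis's actual control tokens.

```python
def generate(model, prompt: str) -> str:
    """Hypothetical stand-in for one call to a seq2seq dialogue model."""
    return model(prompt)


def modular_response(model, dialogue_history: str) -> dict:
    """Run the three-stage pipeline: query -> knowledge -> response.

    Each stage is a separate decoding pass of the same model, and its
    text output is appended to the conditioning context of the next stage.
    """
    # Stage (i): produce a search-engine query from the dialogue history.
    query = generate(model, f"__search_query__ {dialogue_history}")
    # Stage (ii): produce a knowledge output conditioned on history + query.
    knowledge = generate(model, f"__knowledge__ {dialogue_history} {query}")
    # Stage (iii): produce the final response conditioned on the knowledge.
    response = generate(model, f"__response__ {dialogue_history} {knowledge}")
    return {"query": query, "knowledge": knowledge, "response": response}
```

The key design point is that the intermediate outputs are plain text, so each module can be inspected, corrected, or swapped independently of the others.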
dc.format: application/pdf (en_US)
dc.language.iso: en (en_US)
dc.publisher: ETH Zurich (en_US)
dc.rights.uri: http://rightsstatements.org/page/InC-NC/1.0/
dc.subject: Natural Language Processing (en_US)
dc.title: Interactive Language-Based Agents (en_US)
dc.type: Doctoral Thesis
dc.rights.license: In Copyright - Non-Commercial Use Permitted
dc.date.published: 2023-05-25
ethz.size: 216 p. (en_US)
ethz.code.ddc: DDC - DDC::0 - Computer science, information & general works::004 - Data processing, computer science (en_US)
ethz.identifier.diss: 29105 (en_US)
ethz.publication.place: Zurich (en_US)
ethz.publication.status: published (en_US)
ethz.leitzahl: ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02661 - Institut für Maschinelles Lernen / Institute for Machine Learning::09462 - Hofmann, Thomas / Hofmann, Thomas (en_US)
ethz.date.deposited: 2023-05-25T08:53:56Z
ethz.source: FORM
ethz.eth: yes (en_US)
ethz.availability: Open access (en_US)
ethz.rosetta.installDate: 2024-02-02T23:44:58Z
ethz.rosetta.lastUpdated: 2024-02-02T23:44:58Z
ethz.rosetta.versionExported: true
ethz.COinS: ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Interactive%20Language-Based%20Agents&rft.date=2023&rft.au=Adolphs,%20Leonard&rft.genre=unknown&rft.btitle=Interactive%20Language-Based%20Agents