dc.contributor.author: Meister, Clara Isabel
dc.contributor.author: Pimentel, Tiago
dc.contributor.author: Wiher, Gian
dc.contributor.author: Cotterell, Ryan
dc.date.accessioned: 2023-02-15T15:01:13Z
dc.date.available: 2023-02-04T04:30:54Z
dc.date.available: 2023-02-15T15:01:13Z
dc.date.issued: 2023-01-12
dc.identifier.issn: 2307-387X
dc.identifier.other: 10.1162/tacl_a_00536 [en_US]
dc.identifier.uri: http://hdl.handle.net/20.500.11850/597055
dc.identifier.doi: 10.3929/ethz-b-000597055
dc.description.abstract: Today’s probabilistic language generators fall short when it comes to producing coherent and fluent text despite the fact that the underlying models perform well under standard metrics (e.g., perplexity). This discrepancy has puzzled the language generation community for the last few years. In this work, we posit that the abstraction of natural language generation as a discrete stochastic process—which allows for an information-theoretic analysis—can provide new insights into the behavior of probabilistic language generators, for example, why high-probability texts can be dull or repetitive. Humans use language as a means of communicating information, aiming to do so in a simultaneously efficient and error-minimizing manner; in fact, psycholinguistics research suggests humans choose each word in a string with this subconscious goal in mind. We formally define the set of strings that meet this criterion: Those for which each word has an information content close to the expected information content, namely, the conditional entropy of our model. We then propose a simple and efficient procedure for enforcing this criterion when generating from probabilistic models, which we call locally typical sampling. Automatic and human evaluations show that, in comparison to nucleus and top-k sampling, locally typical sampling offers competitive performance (in both abstractive summarization and story generation) in terms of quality while consistently reducing degenerate repetitions. [en_US]
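The abstract describes the core of the locally typical sampling procedure: at each decoding step, rank tokens by how close their information content, -log p(y | context), is to the conditional entropy of the model's next-token distribution, keep the smallest such set whose probability mass reaches a threshold, renormalize over that set, and sample. Below is a minimal NumPy sketch of that truncation rule, assuming a 1-D array of logits from some language model; the function name and the threshold tau=0.95 are illustrative choices, not taken from the authors' released code.

import numpy as np

def locally_typical_sampling(logits, tau=0.95, rng=None):
    """Sample one token id from a 1-D array of next-token logits.

    Keeps the smallest set of tokens whose information content
    -log p(y) is closest to the entropy of the distribution and
    whose cumulative probability reaches tau, then renormalizes
    and samples from that set.
    """
    rng = rng if rng is not None else np.random.default_rng()
    # Softmax over the vocabulary (shifted for numerical stability).
    shifted = logits - np.max(logits)
    probs = np.exp(shifted) / np.sum(np.exp(shifted))
    # Per-token information content and the conditional entropy H(p).
    info = -np.log(probs + 1e-12)
    entropy = np.sum(probs * info)
    # Rank tokens by |information content - entropy|, ascending.
    order = np.argsort(np.abs(info - entropy))
    # Smallest prefix of that ranking with total mass >= tau.
    cutoff = int(np.searchsorted(np.cumsum(probs[order]), tau)) + 1
    kept = order[:cutoff]
    # Renormalize over the kept tokens and draw one.
    kept_probs = probs[kept] / np.sum(probs[kept])
    return int(rng.choice(kept, p=kept_probs))

Calling this once per decoding step with fresh logits yields a full sampled string; tau plays a role analogous to the mass threshold in nucleus sampling, except that tokens are ranked by distance from the entropy rather than by raw probability.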
dc.format: application/pdf [en_US]
dc.language.iso: en [en_US]
dc.publisher: Association for Computational Linguistics [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.title: Locally Typical Sampling [en_US]
dc.type: Journal Article
dc.rights.license: Creative Commons Attribution 4.0 International
ethz.journal.title: Transactions of the Association for Computational Linguistics
ethz.journal.volume: 11 [en_US]
ethz.pages.start: 102 [en_US]
ethz.pages.end: 121 [en_US]
ethz.version.deposit: publishedVersion [en_US]
ethz.identifier.wos:
ethz.identifier.scopus:
ethz.publication.place: Cambridge, MA [en_US]
ethz.publication.status: published [en_US]
ethz.leitzahl: ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02661 - Institut für Maschinelles Lernen / Institute for Machine Learning::09682 - Cotterell, Ryan / Cotterell, Ryan
ethz.leitzahl.certified: ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02661 - Institut für Maschinelles Lernen / Institute for Machine Learning::09682 - Cotterell, Ryan / Cotterell, Ryan
ethz.relation.isNewVersionOf: 20.500.11850/588594
ethz.date.deposited: 2023-02-04T04:30:55Z
ethz.source: SCOPUS
ethz.eth: yes [en_US]
ethz.availability: Open access [en_US]
ethz.rosetta.installDate: 2023-02-15T15:01:16Z
ethz.rosetta.lastUpdated: 2024-02-02T19:42:21Z
ethz.rosetta.versionExported: true
ethz.COinS: ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Locally%20Typical%20Sampling&rft.jtitle=Transactions%20of%20the%20Association%20for%20Computational%20Linguistics&rft.date=2023-01-12&rft.volume=11&rft.spage=102&rft.epage=121&rft.issn=2307-387X&rft.au=Meister,%20Clara%20Isabel&Pimentel,%20Tiago&Wiher,%20Gian&Cotterell,%20Ryan&rft.genre=article&rft_id=info:doi/10.1162/tacl_a_00536&