Show simple item record

dc.contributor.author: Pimentel, Tiago
dc.contributor.author: Valvoda, Josef
dc.contributor.author: Maudslay, Rowan H.
dc.contributor.author: Zmigrod, Ran
dc.contributor.author: Williams, Adina
dc.contributor.author: Cotterell, Ryan
dc.contributor.editor: Jurafsky, Dan
dc.contributor.editor: Chai, Joyce
dc.contributor.editor: Schluter, Natalie
dc.contributor.editor: Tetreault, Joel
dc.date.accessioned: 2021-12-07T10:08:26Z
dc.date.available: 2020-10-15T02:38:46Z
dc.date.available: 2020-10-29T12:10:11Z
dc.date.available: 2020-10-29T12:25:07Z
dc.date.available: 2020-10-29T12:42:28Z
dc.date.available: 2021-12-07T10:08:26Z
dc.date.issued: 2020-07
dc.identifier.isbn: 978-1-952148-25-5 [en_US]
dc.identifier.uri: http://hdl.handle.net/20.500.11850/446005
dc.identifier.doi: 10.3929/ethz-b-000446005
dc.description.abstract: The success of neural networks on a diverse set of NLP tasks has led researchers to question how much these networks actually "know" about natural language. Probes are a natural way of assessing this. When probing, a researcher chooses a linguistic task and trains a supervised model to predict annotations in that linguistic task from the network's learned representations. If the probe does well, the researcher may conclude that the representations encode knowledge related to the task. A commonly held belief is that using simpler models as probes is better; the logic is that simpler models will identify linguistic structure, but not learn the task itself. We propose an information-theoretic operationalization of probing as estimating mutual information that contradicts this received wisdom: one should always select the highest-performing probe one can, even if it is more complex, since it will result in a tighter estimate and thus reveal more of the linguistic information inherent in the representation. The experimental portion of our paper focuses on empirically estimating the mutual information between a linguistic property and BERT, comparing these estimates to several baselines. We evaluate on a set of ten typologically diverse languages often underrepresented in NLP research, plus English, totalling eleven languages. Our implementation is available at https://github.com/rycolab/info-theoretic-probing. [en_US]
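A note on the abstract's central claim (a better-performing probe yields a tighter estimate of the mutual information): it follows from a standard cross-entropy lower bound. The sketch below uses illustrative notation (T for the linguistic property, R for the representation, q for an arbitrary probe) and is not necessarily the paper's exact formulation.

\[
I(T; R) = H(T) - H(T \mid R) \;\geq\; H(T) - H_{q}(T \mid R),
\qquad
H_{q}(T \mid R) = -\,\mathbb{E}\big[\log q(t \mid r)\big].
\]

Here H_q(T | R) is the cross-entropy achieved by probe q on the task. The inequality holds for every probe, so the lower the probe's cross-entropy (i.e., the better it performs), the tighter the lower bound on I(T; R); hence selecting the best-performing probe reveals the most linguistic information encoded in the representation.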
dc.format: application/pdf [en_US]
dc.language.iso: en [en_US]
dc.publisher: Association for Computational Linguistics [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.title: Information-Theoretic Probing for Linguistic Structure [en_US]
dc.type: Conference Paper
dc.rights.license: Creative Commons Attribution 4.0 International
ethz.book.title: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics [en_US]
ethz.pages.start: 4609 [en_US]
ethz.pages.end: 4622 [en_US]
ethz.version.deposit: publishedVersion [en_US]
ethz.event: 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020) (virtual)
ethz.event.location: Online
ethz.event.date: July 5-10, 2020
ethz.notes: Due to the Coronavirus (COVID-19), the conference was conducted virtually. [en_US]
ethz.identifier.wos:
ethz.publication.place: Stroudsburg, PA
ethz.publication.status: published [en_US]
ethz.leitzahl: ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02661 - Institut für Maschinelles Lernen / Institute for Machine Learning::09682 - Cotterell, Ryan / Cotterell, Ryan [en_US]
ethz.leitzahl.certified: ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02661 - Institut für Maschinelles Lernen / Institute for Machine Learning::09682 - Cotterell, Ryan / Cotterell, Ryan
ethz.identifier.url: https://aclanthology.org/2020.acl-main.420
ethz.date.deposited: 2020-10-15T02:38:58Z
ethz.source: WOS
ethz.eth: yes [en_US]
ethz.availability: Open access [en_US]
ethz.rosetta.installDate: 2020-10-29T12:10:22Z
ethz.rosetta.lastUpdated: 2024-02-02T15:31:09Z
ethz.rosetta.versionExported: true
