Open access
Date
2023-07
Type
Conference Paper
ETH Bibliography
yes
Abstract
Large pre-trained language models are acknowledged to carry social biases towards different demographics, which can further amplify existing stereotypes in our society and cause even more harm. Text-to-SQL is an important task whose models are mainly adopted by authoritative institutions, where unfair decisions may lead to catastrophic consequences. However, existing Text-to-SQL models are trained on clean, neutral datasets such as Spider and WikiSQL. This, to some extent, covers up the social bias in models under ideal conditions, which may nevertheless emerge in real application scenarios. In this work, we aim to uncover and categorize social biases in Text-to-SQL models. We summarize the categories of social bias that may occur in structured data for Text-to-SQL models. We build test benchmarks and reveal that models with similar task accuracy can contain social biases at very different rates. We show how to take advantage of our methodology to uncover and assess social biases in the downstream Text-to-SQL task.
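The paper's benchmark itself is not reproduced on this page, but the abstract's core claim (models with similar accuracy can differ sharply in bias) can be illustrated with a counterfactual probe: swap a demographic term in an otherwise identical question and check whether the predicted SQL changes. The sketch below is a minimal illustration under that framing only; the `text_to_sql` stub, the term pairs, and the question template are hypothetical placeholders, not the authors' methodology.

```python
"""Minimal sketch of counterfactual probing for social bias in a
Text-to-SQL model. The model stub, demographic term pairs, and question
template are illustrative assumptions, not the paper's benchmark."""

import re

# Assumed probe terms; a real benchmark would cover many more demographics.
DEMOGRAPHIC_PAIRS = [("male", "female"), ("young", "elderly")]


def text_to_sql(question: str) -> str:
    """Hypothetical stand-in for a trained Text-to-SQL model."""
    # The stub keys on one word to mimic a biased prediction, so the probe
    # below has something to detect; a real system would run a model here.
    if "elderly" in question:
        return "SELECT name FROM applicants WHERE status = 'high_risk'"
    return "SELECT name FROM applicants"


def normalize(sql: str) -> str:
    """Collapse whitespace and case so only semantic differences remain."""
    return re.sub(r"\s+", " ", sql.strip().lower())


def probe(template: str) -> list[tuple[str, str]]:
    """Return the demographic pairs for which the predicted SQL diverges."""
    divergent = []
    for a, b in DEMOGRAPHIC_PAIRS:
        sql_a = normalize(text_to_sql(template.format(group=a)))
        sql_b = normalize(text_to_sql(template.format(group=b)))
        if sql_a != sql_b:
            divergent.append((a, b))
    return divergent


if __name__ == "__main__":
    # Flags ('young', 'elderly'): the stub changes its SQL for 'elderly'
    # even though the question meaning is otherwise identical.
    print(probe("List the names of {group} loan applicants."))
```

Because the probe compares outputs rather than inspecting the model, it applies to any Text-to-SQL system regardless of its task accuracy, which is what allows equally accurate models to be ranked by how often their predictions diverge.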
Permanent link
https://doi.org/10.3929/ethz-b-000652076
Publication status
published
Book title
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Publisher
Association for Computational Linguistics
Organisational unit
09627 - Ash, Elliott