Open access
Date
2021-07-23
Type
Conference Paper
ETH Bibliography
yes
Abstract
Deep ensembles aggregate the predictions of diverse neural networks to improve generalisation and quantify uncertainty. Here, we investigate their behavior when the ensemble members' parameter size is increased, a practice typically associated with better performance for single models. We show that, under practical assumptions, in the overparametrized regime far into the double descent curve not only does the ensemble test loss degrade, but common out-of-distribution detection and calibration metrics suffer as well. Reminiscent of deep double descent, we observe this phenomenon not only when increasing a single member's capacity but also as we increase the training budget, suggesting that deep ensembles can benefit from early stopping. This sheds light on the success and failure modes of deep ensembles and suggests that averaging finite-width models performs better than the neural tangent kernel limit for these metrics.
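The central object of the abstract, a deep ensemble that averages the predictions of independently initialised and trained networks, can be sketched in a few lines of Python. The snippet below is a hypothetical illustration only (the MLP member architecture, the width, and the ensemble size of five are assumptions, not the authors' setup); it shows how per-member softmax outputs are averaged into the ensemble prediction whose test loss, out-of-distribution detection, and calibration the paper studies as member capacity grows.

import torch
import torch.nn as nn

def make_member(width: int) -> nn.Module:
    # Each ensemble member is a small MLP; `width` stands in for the
    # member's parameter size, the quantity varied in the paper.
    return nn.Sequential(nn.Linear(32, width), nn.ReLU(), nn.Linear(width, 10))

def ensemble_predict(members, x):
    # Average the per-member class probabilities to obtain the ensemble prediction.
    with torch.no_grad():
        probs = torch.stack([torch.softmax(m(x), dim=-1) for m in members])
    return probs.mean(dim=0)

members = [make_member(width=256) for _ in range(5)]  # five diversely initialised members
x = torch.randn(8, 32)                                # a batch of 8 dummy inputs
print(ensemble_predict(members, x).shape)             # torch.Size([8, 10])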
Persistent Link
https://doi.org/10.3929/ethz-b-000501624
Publication status
published
Publisher
International Conference on Machine Learning
Conference
Organisational unit
02140 - Dep. of Information Technology and Electrical Engineering
09479 - Grewe, Benjamin
Related publications and data
Notes
Conference lecture held at poster session 1 on July 23, 2021.