Open access
Date
2020-12
Type
- Conference Paper
ETH Bibliography
yes
Abstract
Recent developments in few-shot learning have shown that during fast adaptation, gradient-based meta-learners mostly rely on embedding features of powerful pretrained networks. This leads us to research ways to effectively adapt features and utilize the meta-learner's full potential. Here, we demonstrate the effectiveness of hypernetworks in this context. We propose a soft row-sharing hypernetwork architecture and show that training the hypernetwork with a variant of MAML is tightly linked to meta-learning a curvature matrix used to condition gradients during fast adaptation. We achieve similar results as state-of-the-art model-agnostic methods in the overparametrized case, while outperforming many MAML variants without using different optimization schemes in the compressive regime. Furthermore, we empirically show that hypernetworks do leverage the inner loop optimization for better adaptation, and analyse how they naturally try to learn the shared curvature of constructed tasks on a toy problem when using our proposed training algorithm.
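As a rough illustration of the curvature-conditioned fast adaptation described in the abstract, the following minimal sketch shows a MAML-style inner loop in which the task-loss gradient is preconditioned by a meta-learned curvature matrix. This is not the authors' code or architecture; all names (fast_adapt, n_params, inner_lr) and the toy regression loss are illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's implementation): MAML-style fast
# adaptation where the inner-loop gradient is conditioned by a learned matrix M.
import torch

n_params = 8    # size of the flattened fast-adapted parameter vector (assumed)
inner_lr = 0.1  # inner-loop step size (assumed hyperparameter)

# Meta-parameters: initial weights and a curvature matrix, both updated in the outer loop.
theta0 = torch.zeros(n_params, requires_grad=True)
M = torch.eye(n_params, requires_grad=True)  # identity recovers plain MAML

def task_loss(theta, x, y):
    # Toy regression loss standing in for the few-shot task loss.
    return ((x @ theta - y) ** 2).mean()

def fast_adapt(x, y, steps=1):
    theta = theta0
    for _ in range(steps):
        g = torch.autograd.grad(task_loss(theta, x, y), theta, create_graph=True)[0]
        theta = theta - inner_lr * (M @ g)  # curvature-conditioned gradient step
    return theta

# Outer loop for a single task: the meta-gradient flows into both theta0 and M.
x_support, y_support = torch.randn(5, n_params), torch.randn(5)
x_query, y_query = torch.randn(5, n_params), torch.randn(5)
theta_task = fast_adapt(x_support, y_support)
meta_loss = task_loss(theta_task, x_query, y_query)
meta_loss.backward()  # populates theta0.grad and M.grad for the meta-update
```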
Permanent link
https://doi.org/10.3929/ethz-b-000465883
Publication status
published
Publisher
NeurIPS
Event
Organisational unit
09479 - Grewe, Benjamin / Grewe, Benjamin
Funding
186027 - Probabilistic learning in deep cortical networks (SNF)
Notes
Due to the Coronavirus (COVID-19), the conference was conducted virtually.
Accepted version replaced with published version. Number of authors and author order have been changed.