Abstract
Estimating hyperparameters has been a long-standing problem in machine learning. We consider the case where the task at hand is modeled as the solution to an optimization problem. Here the exact gradient with respect to the hyperparameters cannot be feasibly computed and approximate strategies are required. We introduce a unified framework for computing hypergradients that generalizes existing methods based on the implicit function theorem and automatic differentiation/backpropagation, showing that these two seemingly disparate approaches are actually tightly connected. Our framework is extremely flexible, allowing its subproblems to be solved with any suitable method, to any degree of accuracy. We derive a priori and computable a posteriori error bounds for all our methods, and numerically show that our a posteriori bounds are usually more accurate. Our numerical results also show that, surprisingly, for efficient bilevel optimization, the choice of hypergradient algorithm is at least as important as the choice of lower-level solver.
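To make the idea of an inexact hypergradient concrete, here is a minimal sketch on a toy problem. It is not the paper's framework or code. It assumes a ridge-regression lower-level problem, x*(θ) = argmin_x ½‖Ax − b‖² + ½θ‖x‖², and an upper-level loss ½‖x*(θ) − x_target‖². The implicit function theorem then gives the hypergradient as −x*ᵀ H⁻¹ (x* − x_target) with H = AᵀA + θI. All function names, the toy data, and the iteration counts are illustrative assumptions. The lower-level problem is solved by a fixed number of gradient descent steps and the linear system by a few conjugate gradient steps, so both subproblems are solved only approximately.

```python
import numpy as np


def solve_lower(theta, A, b, n_iters):
    """Approximate the lower-level solution with n_iters gradient descent steps."""
    lipschitz = np.linalg.norm(A, 2) ** 2 + theta  # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        grad = A.T @ (A @ x - b) + theta * x
        x = x - grad / lipschitz
    return x


def conjugate_gradient(matvec, rhs, n_iters, tol=1e-12):
    """Approximately solve H q = rhs for a symmetric positive definite H."""
    q = np.zeros_like(rhs)
    r = rhs.copy()
    p = r.copy()
    rs = r @ r
    rhs_sq = rhs @ rhs
    for _ in range(n_iters):
        if rs <= tol ** 2 * rhs_sq:  # stop once the relative residual is small
            break
        Hp = matvec(p)
        alpha = rs / (p @ Hp)
        q = q + alpha * p
        r = r - alpha * Hp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return q


def ift_hypergradient(theta, A, b, x_target, lower_iters, cg_iters):
    """Inexact hypergradient of 0.5 * ||x*(theta) - x_target||^2 via the IFT."""
    x = solve_lower(theta, A, b, lower_iters)  # inexact lower-level solve
    hess_vec = lambda v: A.T @ (A @ v) + theta * v  # Hessian-vector product
    q = conjugate_gradient(hess_vec, x - x_target, cg_iters)  # inexact linear solve
    return -(x @ q)


def exact_hypergradient(theta, A, b, x_target):
    """Reference value using direct linear solves (only feasible for small problems)."""
    H = A.T @ A + theta * np.eye(A.shape[1])
    x = np.linalg.solve(H, A.T @ b)
    return -(x @ np.linalg.solve(H, x - x_target))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((30, 10))
    x_target = rng.standard_normal(10)
    b = A @ x_target + 0.1 * rng.standard_normal(30)
    theta = 0.5

    reference = exact_hypergradient(theta, A, b, x_target)
    for lower_iters, cg_iters in [(5, 2), (50, 5), (2000, 50)]:
        approx = ift_hypergradient(theta, A, b, x_target, lower_iters, cg_iters)
        print(lower_iters, cg_iters, abs(approx - reference))
```

Running the script prints the hypergradient error for three accuracy settings. In this sketch the accuracy of each subproblem is controlled only by an iteration count; a tolerance-based stopping rule would be an equally reasonable choice.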
Original language | English |
---|---|
Pages (from-to) | 254-278 |
Number of pages | 25 |
Journal | IMA Journal of Applied Mathematics |
Volume | 89 |
Issue number | 1 |
Early online date | 30 Nov 2023 |
DOIs | |
Publication status | Published - 31 Jan 2024 |
Funding
This work is supported in part by funds from EPSRC (EP/S026045/1, EP/T026693/1, EP/V026259/1) and the Leverhulme Trust (ECF-2019-478).
Funders | Funder number |
---|---|
EPSRC | EP/S026045/1, EP/T026693/1, EP/V026259/1 |
The Leverhulme Trust | ECF-2019-478 |
Keywords
- automatic differentiation
- bilevel optimization
- hyperparameter optimization
ASJC Scopus subject areas
- Applied Mathematics
Fingerprint
Dive into the research topics of 'Analyzing inexact hypergradients for bilevel learning'. Together they form a unique fingerprint.
Projects
- 1 Active
Programme Grant: Mathematics of Deep Learning
Budd, C. (PI) & Ehrhardt, M. (CoI)
Engineering and Physical Sciences Research Council
31/01/22 → 30/01/27
Project: Research council