Deep learning as optimal control problems: models and numerical methods

Martin Benning, Elena Celledoni, Matthias J. Ehrhardt, Brynjulf Owren, Carola-Bibiane Schönlieb

Research output: Contribution to journal › Article › peer-review

34 Citations (SciVal)
102 Downloads (Pure)

Abstract

We consider recent work of [18] and [9], where deep learning neural networks have been interpreted as discretisations of an optimal control problem subject to an ordinary differential equation constraint. We review the first order conditions for optimality, and the conditions ensuring optimality after discretisation. This leads to a class of algorithms for solving the discrete optimal control problem which guarantee that the corresponding discrete necessary conditions for optimality are fulfilled. The differential equation setting lends itself to learning additional parameters such as the time discretisation. We explore this extension alongside natural constraints (e.g. time steps lie in a simplex). We compare these deep learning algorithms numerically in terms of induced flow and generalisation ability.
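
As a rough illustration of the setting described in the abstract, the following minimal sketch treats a residual network as a forward Euler discretisation of an ODE and keeps the learnable time steps on a simplex by projection. It is not the authors' implementation: the `tanh` activation, the function names `project_to_simplex` and `forward`, and the use of a Euclidean projection as the way to enforce the constraint are all assumptions made for this example.

```python
import math


def project_to_simplex(v, total=1.0):
    """Euclidean projection of v onto {h : h_k >= 0, sum(h) = total}.

    Sort-based projection; using it to constrain the time steps is an
    assumption of this sketch, not a detail taken from the paper.
    """
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        candidate = (cumulative - total) / j
        # The condition holds for a prefix of the sorted entries;
        # the last index where it holds determines the shift theta.
        if u_j - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def forward(x, weights, biases, steps):
    """Forward Euler steps x_{k+1} = x_k + h_k * tanh(W_k x_k + b_k).

    Each (W_k, b_k, h_k) triple plays the role of one residual layer;
    tanh is a hypothetical choice of activation.
    """
    for W, b, h in zip(weights, biases, steps):
        z = [sum(w * x_j for w, x_j in zip(row, x)) + b_i
             for row, b_i in zip(W, b)]
        x = [x_i + h * math.tanh(z_i) for x_i, z_i in zip(x, z)]
    return x


# Hypothetical two-layer example on a two-dimensional state.
raw_steps = [0.9, 0.4]                  # unconstrained values, e.g. after a gradient update
steps = project_to_simplex(raw_steps)   # non-negative steps that sum to 1
weights = [[[0.0, 1.0], [-1.0, 0.0]]] * 2
biases = [[0.0, 0.0]] * 2
output = forward([1.0, 0.5], weights, biases, steps)
```

In a training loop, the projection would be applied to the time steps after each gradient update, so that the layers always partition a fixed time interval.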
Original language: English
Pages (from-to): 171-198
Number of pages: 28
Journal: Journal of Computational Dynamics
Volume: 6
Issue number: 2
Publication status: Published - 1 Dec 2019

Keywords

  • math.OC
  • cs.LG
  • math.NA
